PhoWhisper-large-ct2

This repository contains the PhoWhisper-large model converted to the CTranslate2 format so that it can be run with faster-whisper. The conversion is intended to speed up inference compared with the original Hugging Face checkpoint, especially on CPU.

Usage

Installation: Ensure you have the necessary libraries installed:
```
pip install transformers ctranslate2 faster-whisper
```

Conversion (only needed once): This step converts the original Hugging Face model to the CTranslate2 format.

```
ct2-transformers-converter --model vinai/PhoWhisper-large --output_dir PhoWhisper-large-ct2 --copy_files tokenizer_config.json --quantization float16
```
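The conversion command above can also be assembled and launched from Python, which may be convenient in a setup script. The following is only a minimal sketch: `build_convert_command` and `run_conversion` are hypothetical helper names introduced for illustration, and the defaults simply mirror the flags shown in the command above.

```python
import subprocess
from typing import List, Sequence


def build_convert_command(
    model: str = "vinai/PhoWhisper-large",
    output_dir: str = "PhoWhisper-large-ct2",
    copy_files: Sequence[str] = ("tokenizer_config.json",),
    quantization: str = "float16",
) -> List[str]:
    """Return the ct2-transformers-converter argument list (hypothetical helper)."""
    cmd = ["ct2-transformers-converter", "--model", model, "--output_dir", output_dir]
    if copy_files:
        # --copy_files accepts one or more file names from the source model
        cmd += ["--copy_files", *copy_files]
    cmd += ["--quantization", quantization]
    return cmd


def run_conversion(cmd: Sequence[str]) -> None:
    """Run the converter; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(list(cmd), check=True)


# Example (downloads the original model, so it is left commented out):
# run_conversion(build_convert_command())
```

Building the argument list separately from running it makes it easy to print or inspect the exact command before starting a long conversion.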

Transcription:

```python
from faster_whisper import WhisperModel

model_id = "kiendt/PhoWhisper-large-ct2"

# Run on GPU with FP16
# model = WhisperModel(model_id, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_id, device="cuda", compute_type="int8_float16")

# or run on CPU with INT8
model = WhisperModel(model_id, device="cpu", compute_type="int8")

# Replace audio.wav with the path to your audio file
segments, info = model.transcribe("audio.wav", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

# segments is a generator: transcription runs as the loop consumes it
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
```
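Instead of printing each segment, the same `start`, `end`, and `text` fields can be turned into subtitles. The sketch below is one possible way to do that, not part of this repository: `format_timestamp` and `segments_to_srt` are hypothetical helpers, and they assume only that each segment exposes the three attributes used in the loop above.

```python
from typing import Iterable


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable) -> str:
    """Build SRT text from objects that have start, end, and text attributes."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries
    return "\n".join(blocks)
```

With the transcription code above, `segments_to_srt(segments)` would consume the generator once and return a string that can be written to a `.srt` file.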

Model Details

Based on the vinai/PhoWhisper-large model.
Converted using ct2-transformers-converter.
Optimized for faster inference with CTranslate2.

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

MIT
