whisper-th-large-v3-ct2
CTranslate2 (int8) conversion of biodatlab/whisper-th-large-v3-combined (Thonburian Whisper) for fast CPU/GPU inference with faster-whisper.
Built for and used by LyricBridge, an open-source karaoke maker that removes vocals and generates word-synced Thai lyrics.
- Architecture: Whisper large-v3 (128 mel bins).
- Format: CTranslate2, quantized int8 (CPU); use int8_float16 on GPU.
- Language: Thai (th).
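The device and compute-type pairing listed above can be captured in a small helper. This is a minimal sketch: `pick_compute_type` is a hypothetical name, not part of faster-whisper or CTranslate2, and it encodes only the two pairings this card recommends.

```python
def pick_compute_type(device: str) -> str:
    """Return the compute type this card recommends for a device.

    Hypothetical helper: not part of faster-whisper or CTranslate2.
    """
    recommended = {
        "cpu": "int8",           # CPU pairing from this card
        "cuda": "int8_float16",  # GPU pairing from this card
    }
    try:
        return recommended[device]
    except KeyError:
        # Any other device string is outside what this card documents.
        raise ValueError(f"no recommendation for device {device!r}") from None
```

The result can be passed as `compute_type` when constructing `WhisperModel`, alongside the matching `device` value.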
How to use (faster-whisper)
from faster_whisper import WhisperModel

# Pin the revision so results stay reproducible over time.
model = WhisperModel(
    "Avocaduu14/whisper-th-large-v3-ct2",
    revision="1a1554ea606d89c937216ada609bb8585e20a36e",
    device="cpu",  # or "cuda"
    compute_type="int8",  # CPU; use "int8_float16" on GPU
)

segments, info = model.transcribe("audio.wav", language="th")
for seg in segments:
    print(seg.start, seg.end, seg.text)
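Since this card mentions word-synced lyrics, a sketch of word-level timing may be useful. faster-whisper's `transcribe` accepts `word_timestamps=True`, after which each segment exposes a `words` list whose items carry `start`, `end`, and `word`. The helper names below (`format_timestamp`, `words_to_lines`, `transcribe_words`) are hypothetical, and the output layout is an assumption for illustration, not LyricBridge's actual format.

```python
from typing import Iterable, List


def format_timestamp(seconds: float) -> str:
    """Format seconds as mm:ss.cc (minutes, seconds, centiseconds)."""
    total_cs = int(round(seconds * 100))
    minutes, rest = divmod(total_cs, 6000)
    secs, cs = divmod(rest, 100)
    return f"{minutes:02d}:{secs:02d}.{cs:02d}"


def words_to_lines(segments: Iterable) -> List[str]:
    """Turn segments with word timings into one '[start] word' line per word."""
    lines = []
    for seg in segments:
        # seg.words is populated only when word_timestamps=True was requested.
        for w in seg.words or []:
            lines.append(f"[{format_timestamp(w.start)}] {w.word.strip()}")
    return lines


def transcribe_words(audio_path: str) -> List[str]:
    """Transcribe Thai audio and return per-word timestamp lines."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        "Avocaduu14/whisper-th-large-v3-ct2", device="cpu", compute_type="int8"
    )
    segments, _info = model.transcribe(
        audio_path, language="th", word_timestamps=True
    )
    return words_to_lines(segments)
```

Calling `transcribe_words("audio.wav")` downloads the model on first use, so the two formatting helpers are kept separate and can be tried on their own.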
Source model & attribution
This repository is a format conversion only (CTranslate2 / int8). No weights were retrained; only the storage/compute format changed. All modeling credit belongs to the original authors.
- Original model: biodatlab/whisper-th-large-v3-combined (Thonburian Whisper)
- Authors: Atirut Boribalburephan, Zaw Htet Aung, Knot Pipatsrisawat, Titipat Achakulvisut (Biomedical and Data Lab, Mahidol University)
- Base model: openai/whisper-large-v3
- License: Apache-2.0 (same as the source; retained here)
- Reported quality: WER 6.59 on Common Voice 13 (th) test set (from the source model card)
Citation
@misc{thonburian_whisper_med,
  author = {Atirut Boribalburephan, Zaw Htet Aung, Knot Pipatsrisawat, Titipat Achakulvisut},
  title = {Thonburian Whisper: A fine-tuned Whisper model for Thai automatic speech recognition},
  year = {2022},
  url = {https://huggingface.co/biodatlab/whisper-th-medium-combined},
  doi = {10.57967/hf/0226},
  publisher = {Hugging Face}
}
Conversion
Converted with CTranslate2's ct2-transformers-converter using int8 quantization. The model keeps the Whisper large-v3
architecture (128 mel bins), so use a runtime that supports large-v3 feature extraction.
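This card does not give the exact command that was run, so the following is only a sketch of how such a conversion is commonly invoked. It assembles a `ct2-transformers-converter` command line using the `--model`, `--output_dir`, `--quantization`, and `--copy_files` options. The helper names, the output directory, and the list of copied files are assumptions, including that those files exist in the source repository.

```python
import subprocess
from typing import List, Sequence


def build_convert_command(
    source_model: str,
    output_dir: str,
    quantization: str = "int8",
    copy_files: Sequence[str] = ("tokenizer.json", "preprocessor_config.json"),
) -> List[str]:
    """Assemble a ct2-transformers-converter invocation (hypothetical helper)."""
    cmd = [
        "ct2-transformers-converter",
        "--model", source_model,
        "--output_dir", output_dir,
        "--quantization", quantization,
    ]
    if copy_files:
        # Assumed file list: files the runtime may need alongside the weights.
        cmd += ["--copy_files", *copy_files]
    return cmd


def convert(source_model: str, output_dir: str) -> None:
    """Run the conversion; needs ctranslate2 and transformers installed."""
    subprocess.run(build_convert_command(source_model, output_dir), check=True)
```

For example, `convert("biodatlab/whisper-th-large-v3-combined", "whisper-th-large-v3-ct2")` would download the source model and write int8 CTranslate2 files to the assumed output directory.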
Credits
- biodatlab / Thonburian Whisper: the Thai-finetuned model this repo converts.
- OpenAI Whisper: base architecture.
- faster-whisper / CTranslate2: inference runtime.
- LyricBridge: downstream project (MIT).