Khmer Automatic Speech Recognition
Collection
This project contributes to Khmer ASR modeling through a collection of nine models.
PhanithLIM/whisper-small-aug-28-april-lightning-v1
is a fine-tuned version of OpenAI's Whisper ASR model adapted specifically for the Khmer language. Built on the small variant of Whisper and converted to CTranslate2 format for efficient inference with FasterWhisper, the model provides fast and accurate speech-to-text transcription of Khmer audio.
pip install faster-whisper
from faster_whisper import WhisperModel

# Load the CTranslate2-converted model (int8 quantization speeds up CPU inference)
model = WhisperModel("PhanithLIM/whisper-small-khmer-ct2", compute_type="int8", local_files_only=False)

# Transcribe Khmer audio; beam_size is an argument to transcribe(), not to the model constructor
segments, info = model.transcribe("your_audio_file.wav", beam_size=5)

# Print each timestamped segment
for segment in segments:
    print(f"{segment.start:.2f}s --> {segment.end:.2f}s: {segment.text}")
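The timestamped segments above map directly onto subtitle formats such as SRT. A minimal sketch of that conversion (the `(start, end, text)` triples here stand in for faster-whisper's segment objects, which expose the same fields; this helper is illustrative and not part of the library):

```python
def srt_timestamp(seconds):
    """Convert a time in seconds to an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Render (start, end, text) triples as an SRT subtitle document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

Feeding it `[(seg.start, seg.end, seg.text) for seg in segments]` yields a file most video players can load alongside the audio.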
This model can be integrated into real-time systems using tools such as CTranslate2, a fast inference engine for Transformer models optimized for CPU and GPU deployment, especially in production environments. CTranslate2 is developed by the team behind OpenNMT and is widely used in speech and machine translation systems, including FasterWhisper, which is a CTranslate2 port of OpenAI's Whisper.
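In a near-real-time setup, audio is usually processed in short overlapping windows rather than one long file, with each window passed to the model as it fills. A minimal sketch of the windowing logic only (the chunk and overlap lengths are illustrative choices, not values from this model card, and the sample buffer stands in for captured microphone audio):

```python
def chunk_samples(samples, sample_rate, chunk_seconds=5.0, overlap_seconds=1.0):
    """Split a mono sample buffer into overlapping windows for incremental transcription.

    Returns a list of (offset_seconds, window) pairs; consecutive windows overlap
    so that words straddling a boundary appear whole in at least one window.
    """
    chunk = int(chunk_seconds * sample_rate)
    step = chunk - int(overlap_seconds * sample_rate)
    windows = []
    start = 0
    while start < len(samples):
        windows.append((start / sample_rate, samples[start:start + chunk]))
        if start + chunk >= len(samples):
            break  # final (possibly shorter) window already covers the tail
        start += step
    return windows
```

Each window would then be written to a buffer and handed to `model.transcribe(...)`, with the offset added to the returned segment timestamps.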
Base model: openai/whisper-small