whisper-small-khmer-v2

This model is a fine-tuned version of openai/whisper-small, trained on the openslr, google/fleurs and km-speech-corpus datasets. It achieves the following results on the evaluation set:

Loss: 0.26
WER: 0.6165
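
For readers unfamiliar with the metric, the sketch below shows the standard definition of word error rate: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. This card does not say how its score was computed, so the whitespace tokenization and the lack of text normalization here are assumptions, and `word_error_rate` is a hypothetical helper name rather than part of any library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and no text
    normalization is applied. The model card does not specify either.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current

    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("b" -> "x") and one deletion ("d") over 4 words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition a score of 0.6165 corresponds to roughly 62 word errors per 100 reference words. Because insertions count as errors, the value can exceed 1.0.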

Model description

The model was fine-tuned on Google FLEURS, OpenSLR (SLR42) and the km-speech-corpus dataset. The following example transcribes a Khmer audio file with the transformers pipeline:

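from transformers import pipeline

# Load the fine-tuned checkpoint as a speech-recognition pipeline.
pipe = pipeline(
    task="automatic-speech-recognition",
    model="seanghay/whisper-small-khmer-v2",
)

# Force Khmer transcription instead of relying on language detection.
result = pipe(
    "audio.wav",
    generate_kwargs={"language": "<|km|>", "task": "transcribe"},
    batch_size=16,
)

print(result["text"])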

Evaluation results

WER on Google FLEURS (test set, self-reported): 0.617