Whisper Small — Fine-tuned on Conversational Hindi
OpenAI's Whisper-small, fine-tuned for automatic speech recognition on real conversational Hindi audio.
Results
| Model | WER |
|---|---|
| Whisper-small baseline | 178.08% |
| Fine-tuned (checkpoint-1200) | 12.25% |
| Fine-tuned (checkpoint-1200), after text normalisation | 10.57% |
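A baseline figure above 100% is not a typo: WER is (substitutions + deletions + insertions) divided by the number of reference words, so it exceeds 100% whenever the errors outnumber the reference words, for example when the hypothesis contains many extra words. The following is a minimal sketch of plain word-level WER, not the evaluation script used for the table above; the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Two reference words, four wrong hypothesis words:
# 2 substitutions + 2 insertions = 4 errors, so WER = 4 / 2 = 200%.
print(word_error_rate("a b", "x y z w"))
```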
Training Data
- 104 real speakers, long-form conversational recordings
- 4,929 segments after preprocessing (segments outside the 1–30 s range removed)
- 90/10 train/test split
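The duration filter and split described above can be sketched as follows. This is an assumption-laden illustration rather than the actual preprocessing code: the `duration` field, the function name `filter_and_split`, the inclusive bounds, the fixed seed, and the segment-level random shuffle are all hypothetical (the card does not say whether the split is by segment or by speaker). The 30 s upper bound does match Whisper's 30-second input window.

```python
import random


def filter_and_split(segments, min_s=1.0, max_s=30.0, test_fraction=0.1, seed=0):
    """Keep segments whose duration lies in [min_s, max_s], then split train/test."""
    kept = [s for s in segments if min_s <= s["duration"] <= max_s]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(kept)
    n_test = round(len(kept) * test_fraction)
    return kept[n_test:], kept[:n_test]  # (train, test)


# Toy data: 20 usable segments plus one too short and one too long.
segments = [{"id": i, "duration": 5.0} for i in range(20)]
segments += [{"id": 20, "duration": 0.4}, {"id": 21, "duration": 42.0}]
train, test = filter_and_split(segments)
print(len(train), len(test))
```

Under the same rounding, 4,929 segments would give 4,436 training and 493 test segments. A speaker-level split would be a reasonable alternative if speaker leakage between train and test is a concern.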
Novel Contributions
- Lattice-based WER evaluation — handles Hindi filler variants (हाँ/हां/हम्म)
- Hindi number normalisation pipeline
- Roman-script English word detection for Hinglish audio
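To illustrate the first and third ideas in the list above, here is a rough sketch, not the project's implementation. It treats each reference word as a slot holding every accepted spelling, so a hypothesis word matches at zero cost if it is any member of the slot, and it uses a simple regular expression to find Roman-script words. `VARIANT_GROUPS`, `lattice_wer`, and `roman_words` are hypothetical names, and the only variant group included is the one the card itself gives as an example.

```python
import re

# Hypothetical variant table; the card names only these three forms as an example.
VARIANT_GROUPS = [{"हाँ", "हां", "हम्म"}]

# Runs of ASCII letters, optionally with an apostrophe part (e.g. "don't").
ROMAN_WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def to_lattice(reference_words):
    """Expand each reference word into the set of spellings accepted for it."""
    return [
        next((group for group in VARIANT_GROUPS if word in group), {word})
        for word in reference_words
    ]


def lattice_wer(reference: str, hypothesis: str) -> float:
    """WER where a hypothesis word is correct if it matches any variant in a slot."""
    slots = to_lattice(reference.split())
    hyp = hypothesis.split()
    if not slots:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, slot in enumerate(slots, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if hyp_word in slot else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(slots)


def roman_words(text: str):
    """Return the Roman-script words found in a mixed Hindi/English transcript."""
    return ROMAN_WORD.findall(text)


print(lattice_wer("हाँ ठीक है", "हां ठीक है"))  # spelling variant is not penalised
print(roman_words("मुझे meeting में जाना है"))
```

A plain string-equality WER would count the हाँ/हां difference as a substitution; the slot-based comparison is what removes that penalty in this sketch.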
Training Config
- Base model: openai/whisper-small
- Steps: 2000 | Best checkpoint: 1200
- Batch size: 16 | LR: 1e-5
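As a rough sanity check on these numbers, the step counts can be converted into approximate epochs. The sketch below assumes that one optimiser step consumes exactly one batch of 16 segments (single device, no gradient accumulation) and that the 90/10 split is applied to the 4,929 segments with simple rounding; neither assumption is stated in the card.

```python
import math

total_segments = 4929  # from the Training Data section
train_fraction = 0.9   # 90/10 train/test split
batch_size = 16
max_steps = 2000
best_step = 1200

# Assumption: one optimiser step = one batch of 16 segments.
train_segments = round(total_segments * train_fraction)
steps_per_epoch = math.ceil(train_segments / batch_size)


def epochs_at(step: int) -> float:
    """Approximate number of passes over the training set after `step` steps."""
    return step / steps_per_epoch


print(f"train segments:  {train_segments}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"epochs at step {max_steps}: {epochs_at(max_steps):.2f}")
print(f"epochs at step {best_step}: {epochs_at(best_step):.2f}")
```

Under these assumptions the full run covers roughly 7 epochs, and the best checkpoint falls a little past 4.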
How to Use
    from transformers import pipeline
    pipe = pipeline("automatic-speech-recognition", model="YashikaBasapure/whisper-small-hindi-conversational")
    result = pipe("your_hindi_audio.wav")
    print(result["text"])
GitHub