Whisper Small — Fine-tuned on Conversational Hindi

OpenAI's Whisper-small model, fine-tuned for automatic speech recognition on real conversational Hindi audio.

Results

Model | WER
Whisper-small baseline | 178.08%
Fine-tuned (checkpoint-1200) | 12.25%
After text normalisation | 10.57%
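
For readers unfamiliar with the metric: WER is conventionally the number of word substitutions, deletions and insertions divided by the number of reference words, so it can exceed 100% when the hypothesis contains many spurious words. That is how a baseline figure such as 178.08% is possible. The following is a minimal sketch of that standard definition; the function name and the sample sentences are illustrative and are not taken from this model's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only (not from the test set).
exact = word_error_rate("मैं घर जा रहा हूँ", "मैं घर जा रहा हूँ")
# Two reference words, five extra hypothesis words: 5 / 2 = 2.5, i.e. 250% WER.
rambling = word_error_rate("हाँ ठीक", "हाँ ठीक है ठीक है ठीक है")
```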

Training Data

  • 104 real speakers, long-form conversational recordings
  • 4,929 segments after preprocessing (1s–30s filter)
  • 90/10 train/test split
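
The duration filter and split above can be sketched as follows. This is an illustration under stated assumptions, not the preprocessing code used for this model: the `(segment_id, duration_sec)` tuple layout, the inclusive bounds, the fixed seed, and the segment-level (rather than speaker-level) split are all assumptions, since the card does not specify them.

```python
import random

MIN_SEC, MAX_SEC = 1.0, 30.0  # duration bounds quoted in the model card


def filter_by_duration(segments, min_sec=MIN_SEC, max_sec=MAX_SEC):
    """Keep (segment_id, duration_sec) pairs with min_sec <= duration <= max_sec."""
    return [s for s in segments if min_sec <= s[1] <= max_sec]


def train_test_split(segments, train_fraction=0.9, seed=0):
    """Shuffle a copy of the segments and cut it at train_fraction.

    The seed and the segment-level split are assumptions for this sketch.
    """
    shuffled = list(segments)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical segments: (id, duration in seconds).
segments = [("a", 0.4), ("b", 1.0), ("c", 12.5), ("d", 30.0), ("e", 41.2)]
kept = filter_by_duration(segments)

# Same number of segments as reported above, with placeholder durations.
train, test = train_test_split([(i, 5.0) for i in range(4929)])
```

With 4,929 segments this rounding gives 4,436 training and 493 test segments, though the actual split may differ. With long-form recordings from a fixed set of speakers, splitting by speaker instead of by segment would keep test speakers unseen during training; the card does not say which approach was used.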

Novel Contributions

  • Lattice-based WER evaluation — handles Hindi filler variants (हाँ/हां/हम्म)
  • Hindi number normalisation pipeline
  • Roman-script English word detection for Hinglish audio
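
One way to read "lattice-based WER" is that each reference position holds a set of acceptable spellings, and a hypothesis word matching any of them incurs no cost. The sketch below implements that reading, plus a simple Roman-script word detector. It is an assumption about the approach, not the implementation in the linked repository: `FILLER_VARIANTS`, `expand_reference`, `lattice_wer` and `roman_words` are hypothetical names, and treating the three listed fillers as one equivalence class is a guess based on the bullet above.

```python
import re

# Hypothetical equivalence class built from the variants listed in the card.
FILLER_VARIANTS = [{"हाँ", "हां", "हम्म"}]


def expand_reference(words, variant_groups=FILLER_VARIANTS):
    """Turn each reference word into the set of spellings accepted at that slot."""
    slots = []
    for w in words:
        group = next((g for g in variant_groups if w in g), None)
        slots.append(set(group) if group else {w})
    return slots


def lattice_wer(reference: str, hypothesis: str) -> float:
    """WER where a hypothesis word matching any accepted spelling costs nothing."""
    slots = expand_reference(reference.split())
    hyp = hypothesis.split()
    if not slots:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, accepted in enumerate(slots, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if h in accepted else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        prev = curr
    return prev[-1] / len(slots)


# ASCII letters, optionally joined by apostrophes or hyphens (e.g. "don't").
ROMAN_WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")


def roman_words(text: str):
    """Return whitespace-separated tokens written entirely in Latin letters."""
    return [t for t in text.split() if ROMAN_WORD.fullmatch(t)]


same_filler = lattice_wer("हाँ ठीक है", "हां ठीक है")
real_error = lattice_wer("हाँ ठीक है", "नहीं ठीक है")
english = roman_words("मुझे meeting में जाना है okay")
```

Tokens with attached punctuation (for example "okay.") would not be flagged by this detector; stripping punctuation first is left out to keep the sketch short.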

Training Config

  • Base model: openai/whisper-small
  • Steps: 2000 | Best checkpoint: 1200
  • Batch size: 16 | LR: 1e-5
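
As a rough illustration, the values above could be passed to the Hugging Face `Seq2SeqTrainingArguments` class as shown below. This is a sketch, not the training script used for this model: the card does not say whether 16 is the per-device or the effective batch size (per-device is assumed here), `output_dir` is a placeholder, and every setting not listed in the card is left at the library default.

```python
# Hyperparameters quoted in the model card. Mapping "Batch size: 16" to
# per_device_train_batch_size is an assumption.
TRAINING_CONFIG = {
    "max_steps": 2000,
    "per_device_train_batch_size": 16,
    "learning_rate": 1e-5,
}


def build_training_args(output_dir: str):
    """Build Seq2SeqTrainingArguments from the card's values (requires transformers)."""
    # Imported lazily so the config dict can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_CONFIG)
```

The name "checkpoint-1200" matches the `checkpoint-<step>` directory naming that the Hugging Face `Trainer` uses when saving, which suggests the best checkpoint was taken at step 1200 of the 2000-step run.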

How to Use

    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model="YashikaBasapure/whisper-small-hindi-conversational",
    )
    result = pipe("your_hindi_audio.wav")
    print(result["text"])

GitHub

https://github.com/Shishimanu9/Hindi-ASR
