Whisper Small — Fine-tuned on Conversational Hindi

This model is OpenAI's Whisper-small fine-tuned on real conversational Hindi audio.

Results

Model	WER
Whisper-small baseline	178.08%
Fine-tuned (checkpoint-1200)	12.25%
Fine-tuned + text normalisation	10.57%
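
For context on the baseline figure: WER is conventionally computed as (S + D + I) / N, where S, D and I count substituted, deleted and inserted words and N is the number of reference words. It can therefore exceed 100% when a model emits far more words than the reference contains. The sketch below is a minimal word-level edit-distance implementation for illustration only; it is not necessarily the scoring code behind the table above, and the example strings are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: a two-word reference and a hypothesis with repeated words.
reference = "नमस्ते दोस्तों"
hypothesis = "नमस्ते नमस्ते नमस्ते दोस्तों दोस्तों"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.0%}")
```

Here three inserted words against a two-word reference give a WER above 100%, which is the same mechanism that lets a repeating or hallucinating baseline score well over 100%.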

Training Data

104 real speakers, long-form conversational recordings
4,929 segments after preprocessing (only segments between 1 s and 30 s kept)
90/10 train/test split
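
The duration filter and the 90/10 split could be sketched roughly as follows. The `Segment` record, the seed, and the choice to split at segment level rather than by speaker are all assumptions made for illustration; the card does not say how the split was drawn.

```python
import random
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical record; the real dataset schema is not given in this card.
    speaker_id: str
    duration_s: float
    text: str


def filter_by_duration(segments, min_s=1.0, max_s=30.0):
    """Keep only segments whose duration lies in [min_s, max_s]."""
    return [s for s in segments if min_s <= s.duration_s <= max_s]


def train_test_split(segments, test_fraction=0.1, seed=42):
    """Shuffle with a fixed seed, then hold out test_fraction of the segments."""
    shuffled = list(segments)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: durations from 0.5 s to 32.0 s in 0.5 s steps.
segments = [Segment(f"spk{i % 4}", 0.5 * (i + 1), "placeholder transcript") for i in range(64)]
kept = filter_by_duration(segments)
train, test = train_test_split(kept)
print(len(segments), len(kept), len(train), len(test))
```

Applied to 4,929 segments, a 10% hold-out of this kind would leave roughly 493 segments for testing. A speaker-level split would be the stricter choice if the goal is to measure generalisation to unseen speakers.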

Novel Contributions

Lattice-based WER evaluation — handles Hindi filler variants (हाँ/हां/हम्म)
Hindi number normalisation pipeline
Roman-script English word detection for Hinglish audio
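
To make the first and third items more concrete, here is a minimal sketch of one way such checks could work. It is not the repository's implementation: treating the three fillers as fully interchangeable, the slot-based reference representation, and the ASCII-letter heuristic for Roman-script words are all assumptions.

```python
import re


def lattice_wer(reference_slots, hypothesis_words):
    """Edit distance where each reference slot is a set of accepted spellings."""
    if not reference_slots:
        raise ValueError("reference must contain at least one slot")
    prev = list(range(len(hypothesis_words) + 1))
    for i, slot in enumerate(reference_slots, start=1):
        curr = [i] + [0] * len(hypothesis_words)
        for j, word in enumerate(hypothesis_words, start=1):
            # A hypothesis word is correct if it matches any spelling in the slot.
            substitution = prev[j - 1] + (word not in slot)
            curr[j] = min(substitution, prev[j] + 1, curr[j - 1] + 1)
        prev = curr
    return prev[-1] / len(reference_slots)


# Filler variants listed in the card, treated here as interchangeable (assumption).
FILLER_VARIANTS = {"हाँ", "हां", "हम्म"}


def to_slots(reference_words):
    """Expand each reference word into the set of spellings that count as correct."""
    return [FILLER_VARIANTS if w in FILLER_VARIANTS else {w} for w in reference_words]


# Roman-script tokens inside otherwise Devanagari text (simple ASCII-letter heuristic).
ROMAN_WORD = re.compile(r"[A-Za-z]+")


def roman_words(text):
    return ROMAN_WORD.findall(text)


reference = "हाँ मैं ठीक हूँ".split()
hypothesis = "हां मैं ठीक हूँ".split()
strict = lattice_wer([{w} for w in reference], hypothesis)  # exact spelling required
lenient = lattice_wer(to_slots(reference), hypothesis)      # variant-aware matching
print(strict, lenient)
print(roman_words("मेरा phone charge नहीं है"))
```

With exact matching the spelling difference in the filler counts as a substitution; with variant-aware slots it does not, which is the effect a lattice-style evaluation is meant to achieve.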

Training Config

Base model: openai/whisper-small
Steps: 2000 | Best checkpoint: 1200
Batch size: 16 | LR: 1e-5
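
As a rough illustration, the stated hyperparameters could be collected like this. The key names follow the Hugging Face `Seq2SeqTrainingArguments` fields, so the dictionary could be passed as `Seq2SeqTrainingArguments(**training_kwargs)`. The output path and save interval are hypothetical, and reading "Batch size: 16" as a per-device value is also an assumption.

```python
# Base model stated in the card.
BASE_MODEL = "openai/whisper-small"

training_kwargs = {
    # Values stated in the card.
    "per_device_train_batch_size": 16,  # assumes the card's batch size is per device
    "learning_rate": 1e-5,
    "max_steps": 2000,
    # Assumptions: not stated in the card.
    "output_dir": "./whisper-small-hindi",  # hypothetical path
    "save_steps": 200,                      # hypothetical save interval
}

# Checkpoints this save interval would write over the full run.
save_steps = training_kwargs["save_steps"]
max_steps = training_kwargs["max_steps"]
checkpoints = [f"checkpoint-{s}" for s in range(save_steps, max_steps + 1, save_steps)]
print(checkpoints)
```

Any save interval that divides 1200 would produce a `checkpoint-1200` directory, which is the checkpoint reported as best above.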

How to Use

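from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("automatic-speech-recognition", model="YashikaBasapure/whisper-small-hindi-conversational")

# Transcribe a local audio file and print the recognised text.
result = pipe("your_hindi_audio.wav")
print(result["text"])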

GitHub

https://github.com/Shishimanu9/Hindi-ASR

Safetensors

Model size: 0.2B params | Tensor type: F32