whisper-tiny-finetune-hindi-fleurs

This model is a fine-tuned version of openai/whisper-tiny on the google/fleurs dataset. It achieves the following results on the evaluation set:

Loss: 0.8315
Wer Ortho: 0.4313
Wer: 0.4262

A working Hugging Face Space can be found here

Model description

The model fine-tunes openai/whisper-tiny on the Hindi subset (hi_in) of google/fleurs. On that subset it reduces the WER from the 102.3% stated in the Whisper paper to 42.6% (0.4262 as a fraction).

Intended uses & limitations

This model is intended for low-compute edge devices such as the Raspberry Pi Pico/3/3B/4, where it offers real-time transcription of Hindi audio into the English lexicon.
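A minimal inference sketch is given here, assuming the checkpoint is published on the Hugging Face Hub under the repository id `Aryan-401/whisper-tiny-finetune-hindi-fleurs` and that the `transformers` library is installed. The helper name `transcribe` and the audio path are hypothetical; this is one plausible way to call the model, not necessarily how the linked Space does it.

```python
# Repository id as it appears on the model page.
MODEL_ID = "Aryan-401/whisper-tiny-finetune-hindi-fleurs"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint.

    `transformers` is imported lazily so that importing this module stays
    cheap on small devices. Decoding a file path also requires ffmpeg.
    """
    from transformers import pipeline

    # The ASR pipeline handles feature extraction, generation and decoding.
    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]


# Example call (hypothetical file name):
# print(transcribe("sample_hindi.wav"))
```

Building the pipeline once and reusing it across calls would avoid reloading the weights for every file, which matters on the low-compute devices this model targets.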

Training and evaluation data

The model was trained on the hi_in subset of google/fleurs, with WER as the evaluation metric.
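The two WER columns reported in this card can be illustrated with a small pure-Python sketch. It assumes, as the metric names suggest but the card does not state, that "Wer Ortho" is scored on the raw text and "Wer" on normalized text. `simple_normalize` is a hypothetical stand-in for whatever normalizer was actually used, and the example sentences are invented.

```python
import string


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def simple_normalize(text: str) -> str:
    """Hypothetical normalizer: lowercase and strip ASCII punctuation."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(text.lower().translate(table).split())


reference = "Namaste, aap kaise hain?"
hypothesis = "namaste aap kese hain"

wer_ortho = word_error_rate(reference, hypothesis)
wer = word_error_rate(simple_normalize(reference), simple_normalize(hypothesis))
print(f"orthographic WER: {wer_ortho:.2f}, normalized WER: {wer:.2f}")
```

Because insertions are counted against the reference length, WER is not capped at 1.0, which is how a figure above 100% such as the 102.3 quoted from the Whisper paper can arise.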

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_steps: 50
training_steps: 500
mixed_precision_training: Native AMP
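The `constant_with_warmup` schedule listed above can be sketched in plain Python: the learning rate ramps linearly from zero over the warmup steps and then stays at its base value. The function name is hypothetical, and the sketch mirrors the usual definition of this schedule rather than reproducing the training code.

```python
def constant_with_warmup_lr(step: int, base_lr: float = 1e-5, warmup_steps: int = 50) -> float:
    """Learning rate at a given optimizer step.

    Defaults follow the hyperparameters in this card:
    learning_rate=1e-05 and lr_scheduler_warmup_steps=50.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Constant for the remainder of training.
    return base_lr


# Learning rate at a few points of the 500-step run.
for step in (0, 25, 50, 500):
    print(step, constant_with_warmup_lr(step))
```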

Training results

Training Loss	Epoch	Step	Validation Loss	Wer Ortho	Wer
1.8112	1.39	100	1.7274	0.6323	0.6258
1.0387	2.78	200	1.1194	0.5130	0.5072
0.7671	4.17	300	0.9671	0.4665	0.4613
0.5283	5.56	400	0.8840	0.4494	0.4440
0.4458	6.94	500	0.8315	0.4313	0.4262
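As a quick reading aid for the table above, the following sketch computes the relative WER reduction between checkpoints. The values are copied from the "Wer" column; the helper name `relative_reduction` is hypothetical.

```python
# (step, WER) pairs taken from the "Wer" column of the training results.
checkpoints = [
    (100, 0.6258),
    (200, 0.5072),
    (300, 0.4613),
    (400, 0.4440),
    (500, 0.4262),
]


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error rate dropped, relative to `before`."""
    return (before - after) / before


# Improvement between consecutive checkpoints.
for (prev_step, prev_wer), (step, wer) in zip(checkpoints, checkpoints[1:]):
    change = relative_reduction(prev_wer, wer)
    print(f"steps {prev_step} -> {step}: {change:.1%} relative reduction")

# Improvement from the first to the last logged checkpoint.
overall = relative_reduction(checkpoints[0][1], checkpoints[-1][1])
print(f"overall: {overall:.1%} relative reduction")
```

The per-interval reductions shrink as training proceeds, which suggests diminishing returns by step 500, although validation loss is still decreasing at that point.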

Framework versions

Transformers 4.35.2
Pytorch 2.1.0+cu121
Datasets 2.16.0
Tokenizers 0.15.0

Citations

@inproceedings{Bhat:2014:ISS:2824864.2824872,
 author = {Bhat, Irshad Ahmad and Mujadia, Vandan and Tammewar, Aniruddha and Bhat, Riyaz Ahmad and Shrivastava, Manish},
 title = {IIIT-H System Submission for FIRE2014 Shared Task on Transliterated Search},
 booktitle = {Proceedings of the Forum for Information Retrieval Evaluation},
 series = {FIRE '14},
 year = {2015},
 isbn = {978-1-4503-3755-7},
 location = {Bangalore, India},
 pages = {48--53},
 numpages = {6},
 url = {http://doi.acm.org/10.1145/2824864.2824872},
 doi = {10.1145/2824864.2824872},
 acmid = {2824872},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {Information Retrieval, Language Identification, Language Modeling, Perplexity, Transliteration},
}

@misc{radford2022whisper,
  doi = {10.48550/ARXIV.2212.04356},
  url = {https://arxiv.org/abs/2212.04356},
  author = {Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  title = {Robust Speech Recognition via Large-Scale Weak Supervision},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Aryan-401/whisper-tiny-finetune-hindi-fleurs
