---
license: gpl-2.0
base_model: openai/whisper-small
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: whisper-small-ug
  results: []
datasets:
- mozilla-foundation/common_voice_15_0
pipeline_tag: automatic-speech-recognition
language:
- ug
---

# whisper-small-ug

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the [mozilla-foundation/common_voice_15_0](https://huggingface.co/datasets/mozilla-foundation/common_voice_15_0) dataset.

The model was trained on transcripts written in the Uyghur Latin script, using the Uzbek tokeniser, as a Uyghur tokeniser is not included in Whisper. The output of the model is therefore in the Uyghur Latin script (a minimal usage sketch is given in the Usage section below).

To convert the output to the Uyghur Arabic script, you can use the Uyghur script converter at https://github.com/neouyghur/ScriptConverter4Uyghur or the online converter at https://www.yulghun.com/imla/convert.html.

It achieves the following results on the evaluation set:
- Loss: 0.3563
- Wer: 26.8793

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged reconstruction of these settings as a `Seq2SeqTrainingArguments` object is given at the end of this card):
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 4000
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.2677        | 1.43  | 1000 | 0.4063          | 34.1157 |
| 0.1035        | 2.85  | 2000 | 0.3375          | 29.2183 |
| 0.0226        | 4.28  | 3000 | 0.3472          | 27.5155 |
| 0.0073        | 5.71  | 4000 | 0.3563          | 26.8793 |

### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.1+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
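## Usage

A minimal transcription sketch using the `transformers` pipeline API. The model id and audio file name below are placeholders (the exact Hub path of this checkpoint is not stated in the card); substitute your own values.

```python
# Minimal sketch: transcribe an audio file with this checkpoint.
# Assumptions: "<namespace>/whisper-small-ug" is a placeholder for the
# actual Hub repo id, and "sample.wav" is a placeholder audio file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="<namespace>/whisper-small-ug",  # placeholder repo id
)

result = asr("sample.wav")
print(result["text"])  # transcription in the Uyghur Latin script
```

The returned text is in the Uyghur Latin script; apply one of the script converters linked above if you need the Uyghur Arabic script.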
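## Reconstructed training configuration

The hyperparameters listed above map onto a `Seq2SeqTrainingArguments` object roughly as follows. This is a sketch reconstructed from the card, not the original training script; `output_dir` is an assumption, and the Adam betas/epsilon shown are the library defaults, which match the values reported above.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-ug",   # assumption: output path not stated in the card
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                  # library defaults, matching the card
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=4000,
    fp16=True,                       # mixed_precision_training: Native AMP
)
```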