whisper_large_v2_thi_dataset_phase3
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.4115
- Cer: 10.8763
- Wer: 20.7305
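The card does not say how Cer and Wer were computed. The values look like percentages, so the sketch below shows one common definition: the total edit distance divided by the total number of reference units (characters for CER, whitespace-separated words for WER), times 100. The function names `edit_distance` and `error_rate` are illustrative, not taken from the training code, and the actual evaluation may have applied text normalization that is not shown here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="word"):
    """Corpus-level error rate in percent: total edits / total reference units."""
    edits = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        # Assumption: words are whitespace-separated; characters include spaces.
        ref_units = ref.split() if unit == "word" else list(ref)
        hyp_units = hyp.split() if unit == "word" else list(hyp)
        edits += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    return 100.0 * edits / total


if __name__ == "__main__":
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print("WER:", error_rate(refs, hyps, unit="word"))
    print("CER:", error_rate(refs, hyps, unit="char"))
```

Whitespace splitting only makes sense for languages that separate words with spaces; for other scripts a word segmenter would be needed before computing WER.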
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 15
- mixed_precision_training: Native AMP
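As a rough illustration of how these settings fit together, the sketch below reproduces the total batch size arithmetic and a linear warmup and decay schedule in plain Python. It makes two assumptions: training ran on a single device (which is what 4 × 2 = 8 implies), and the schedule spanned 46980 optimizer steps, the final step listed in the training results. The function names are illustrative, not the training script's.

```python
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 1000
TOTAL_STEPS = 46980  # assumption: last step shown in the training results table


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum * num_devices


def linear_lr(step, peak=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to peak, then linear decay to 0 at `total`."""
    if step < warmup:
        return peak * step / warmup
    remaining = max(0, total - step)
    return peak * remaining / max(1, total - warmup)


if __name__ == "__main__":
    print("total_train_batch_size:", effective_batch_size())
    for step in (0, 500, 1000, 23990, 46980):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 1e-05 at step 1000, partway through the first epoch, and decays linearly to zero by the end of training.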
Training results
| Training Loss | Epoch | Step | Cer | Validation Loss | Wer |
|---|---|---|---|---|---|
| 0.7544 | 0.9998 | 3132 | 17.4841 | 0.4112 | 30.6117 |
| 0.4151 | 2.0 | 6265 | 20.9872 | 0.3786 | 35.5513 |
| 0.3044 | 2.9995 | 9396 | 15.1611 | 0.3682 | 26.8405 |
| 0.3174 | 3.9998 | 12528 | 16.9391 | 0.3927 | 29.9976 |
| 0.2516 | 5.0 | 15661 | 14.2234 | 0.3991 | 25.0395 |
| 0.1993 | 5.9998 | 18793 | 12.4403 | 0.3980 | 22.7472 |
| 0.1597 | 7.0 | 21926 | 12.0030 | 0.4012 | 22.7070 |
| 0.1287 | 7.9998 | 25058 | 11.3252 | 0.4060 | 21.4210 |
| 0.1068 | 9.0 | 28191 | 11.0928 | 0.4107 | 21.1000 |
| 0.0916 | 9.9989 | 31320 | 10.8763 | 0.4115 | 20.7305 |
| 0.1047 | 10.9998 | 34452 | 11.4684 | 0.4228 | 21.5799 |
| 0.0896 | 12.0 | 37585 | 10.9857 | 0.4303 | 20.9462 |
| 0.076 | 12.9998 | 40717 | 11.0458 | 0.4300 | 20.9101 |
| 0.0651 | 14.0 | 43850 | 10.8999 | 0.4352 | 20.6779 |
| 0.0578 | 14.9992 | 46980 | 10.9414 | 0.4382 | 20.7904 |
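The per-epoch numbers can be scanned programmatically to see which checkpoint is best under each metric. The sketch below copies the table into a list, assigning values to columns by magnitude (validation losses near 0.4, CER between about 10 and 21), and picks the minimum of each. The `Row` type and the variable names are illustrative only.

```python
from collections import namedtuple

Row = namedtuple("Row", ["step", "cer", "val_loss", "wer"])

# Values copied from the training results table.
ROWS = [
    Row(3132, 17.4841, 0.4112, 30.6117),
    Row(6265, 20.9872, 0.3786, 35.5513),
    Row(9396, 15.1611, 0.3682, 26.8405),
    Row(12528, 16.9391, 0.3927, 29.9976),
    Row(15661, 14.2234, 0.3991, 25.0395),
    Row(18793, 12.4403, 0.3980, 22.7472),
    Row(21926, 12.0030, 0.4012, 22.7070),
    Row(25058, 11.3252, 0.4060, 21.4210),
    Row(28191, 11.0928, 0.4107, 21.1000),
    Row(31320, 10.8763, 0.4115, 20.7305),
    Row(34452, 11.4684, 0.4228, 21.5799),
    Row(37585, 10.9857, 0.4303, 20.9462),
    Row(40717, 11.0458, 0.4300, 20.9101),
    Row(43850, 10.8999, 0.4352, 20.6779),
    Row(46980, 10.9414, 0.4382, 20.7904),
]

best_by_cer = min(ROWS, key=lambda r: r.cer)
best_by_loss = min(ROWS, key=lambda r: r.val_loss)
best_by_wer = min(ROWS, key=lambda r: r.wer)

if __name__ == "__main__":
    print("lowest CER:", best_by_cer)
    print("lowest validation loss:", best_by_loss)
    print("lowest WER:", best_by_wer)
```

The lowest CER falls at step 31320, whose loss, CER, and WER match the evaluation results reported at the top of the card. This suggests, though the card does not confirm it, that the final checkpoint was selected by CER. Validation loss bottoms out much earlier (step 9396) while the error rates keep improving.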
Framework versions
- Transformers 4.41.2
- PyTorch 2.1.2+cu118
- Datasets 2.19.0
- Tokenizers 0.19.1