metadata
language:
- hu
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- generated_from_trainer
metrics:
- wer
model-index:
- name: Whisper large-v2 CV18 Hu
results: []
datasets:
- fsicoli/common_voice_18_0
- google/fleurs
pipeline_tag: automatic-speech-recognition
Whisper large-v2 CV18 Hu
This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on the fsicoli/common_voice_18_0 dataset. It achieves the following results on the google/fleurs evaluation set:
- Loss: 0.3493
- Wer Ortho: 21.9936
- Wer: 16.0057
Aggregated metrics
google/fleurs_hu_hu_test:
- Average WER: 21.75%
- Average CER: 6.10%
- Average normalized WER: 14.73%
- Average normalized CER: 4.73%
common_voice_17_0_hu_test (not a valid test: the test split was part of the training data):
- Average WER: 1.16%
- Average CER: 0.22%
- Average normalized WER: 0.79%
- Average normalized CER: 0.16%
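For readers unfamiliar with the metrics, the following is a minimal sketch of how WER and CER are commonly computed from an edit distance, and why the normalized figures come out lower than the raw ones. The `normalize` function is only a stand-in (lowercasing and stripping ASCII punctuation); the card does not state which normalizer was actually used, and the example sentence is made up for illustration.

```python
import string


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


def normalize(text):
    # Assumed normalization, not necessarily the one used for this card:
    # lowercase, drop ASCII punctuation, collapse whitespace.
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


# Hypothetical reference/hypothesis pair, not taken from either dataset.
reference = "Jó napot kívánok, hogy vagy?"
hypothesis = "jó napot kívánok hogy vagy"

raw_wer = wer(reference, hypothesis)
raw_cer = cer(reference, hypothesis)
norm_wer = wer(normalize(reference), normalize(hypothesis))
norm_cer = cer(normalize(reference), normalize(hypothesis))

print(f"WER: {raw_wer:.2%}  CER: {raw_cer:.2%}")
print(f"Normalized WER: {norm_wer:.2%}  Normalized CER: {norm_cer:.2%}")
```

In this toy pair every error is a casing or punctuation difference, so normalization removes all of them; on real transcripts it removes only part of the error, which is consistent with the gap between the raw and normalized figures above.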
Results of the quantized models:
Model | WER | CER | Normalized WER | Normalized CER | Dataset | Split | Runtime |
---|---|---|---|---|---|---|---|
int8_bfloat16 | 21.49 | 5.93 | 16.04 | 6.21 | google/fleurs | test | 550.18 |
bfloat16 | 21.33 | 5.87 | 15.91 | 6.15 | google/fleurs | test | 593.96 |
int8 | 21.01 | 5.63 | 15.38 | 5.88 | google/fleurs | test | 668.91 |
int8_float32 | 21.01 | 5.63 | 15.38 | 5.88 | google/fleurs | test | 669.81 |
int8_float16 | 20.96 | 5.65 | 15.31 | 5.91 | google/fleurs | test | 570.11 |
float16 | 20.92 | 5.64 | 15.24 | 5.9 | google/fleurs | test | 589.29 |
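The table can be compared programmatically. The sketch below copies its values into a list and picks the variant with the lowest normalized WER and the one with the shortest runtime. The runtime unit is not stated in the card, so the code treats it as an opaque number; the tuple layout and variable names are illustrative only.

```python
# Rows copied from the table above:
# (variant, WER, CER, normalized WER, normalized CER, runtime)
rows = [
    ("int8_bfloat16", 21.49, 5.93, 16.04, 6.21, 550.18),
    ("bfloat16",      21.33, 5.87, 15.91, 6.15, 593.96),
    ("int8",          21.01, 5.63, 15.38, 5.88, 668.91),
    ("int8_float32",  21.01, 5.63, 15.38, 5.88, 669.81),
    ("int8_float16",  20.96, 5.65, 15.31, 5.91, 570.11),
    ("float16",       20.92, 5.64, 15.24, 5.90, 589.29),
]

# Variant with the lowest normalized WER (index 3).
most_accurate = min(rows, key=lambda r: r[3])
# Variant with the shortest runtime (index 5; unit not given in the card).
fastest = min(rows, key=lambda r: r[5])
# Spread of normalized WER across all variants, in percentage points.
norm_wer_spread = round(max(r[3] for r in rows) - min(r[3] for r in rows), 2)

print("Lowest normalized WER:", most_accurate[0], most_accurate[3])
print("Shortest runtime:", fastest[0], fastest[5])
print("Normalized WER spread (points):", norm_wer_spread)
```

On these numbers the accuracy differences between variants are under one percentage point of normalized WER, so runtime and memory constraints may reasonably drive the choice.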
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- num_epochs: 3
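As a rough illustration of how these values fit together, the sketch below derives the total batch size and evaluates a linear schedule with warmup in plain Python. It assumes a single device (consistent with 8 × 2 = 16) and an estimated total of 15,200 optimizer steps, extrapolated from the log reaching step 15000 at epoch 2.96; the exact total is not stated in the card, and the function is a stand-in for the trainer's scheduler, not its actual code.

```python
learning_rate = 5e-06
train_batch_size = 8
gradient_accumulation_steps = 2
warmup_steps = 250
num_devices = 1               # assumption: single device
ASSUMED_TOTAL_STEPS = 15_200  # assumption: estimated, not stated in the card

# Samples contributing to each optimizer update.
total_train_batch_size = (
    train_batch_size * gradient_accumulation_steps * num_devices
)


def linear_schedule_lr(step, total_steps=ASSUMED_TOTAL_STEPS,
                       peak_lr=learning_rate, warmup=warmup_steps):
    """Linear warmup from 0 to peak_lr, then linear decay to 0."""
    if step < warmup:
        return peak_lr * step / warmup
    remaining = max(0.0, (total_steps - step) / (total_steps - warmup))
    return peak_lr * remaining


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 125, 250, 5000, 10000, 15000):
    print(step, f"{linear_schedule_lr(step):.3e}")
```

With a warmup of only 250 steps, the learning rate reaches its peak almost immediately and then decays for the remainder of the three epochs.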
Training results
Training Loss | Epoch | Step | Validation Loss | Wer Ortho | Wer |
---|---|---|---|---|---|
0.1543 | 0.1 | 500 | 0.3619 | 25.7695 | 21.5802 |
0.1336 | 0.2 | 1000 | 0.3661 | 26.4197 | 21.9212 |
0.1358 | 0.3 | 1500 | 0.3516 | 25.4414 | 20.7548 |
0.1165 | 0.39 | 2000 | 0.3431 | 25.3937 | 20.3601 |
0.0959 | 0.49 | 2500 | 0.3581 | 26.6345 | 20.4438 |
0.1045 | 0.59 | 3000 | 0.3427 | 25.9127 | 19.9653 |
0.099 | 0.69 | 3500 | 0.3380 | 25.3937 | 19.6902 |
0.1034 | 0.79 | 4000 | 0.3412 | 24.5765 | 19.0083 |
0.0919 | 0.89 | 4500 | 0.3370 | 25.0119 | 19.3672 |
0.077 | 0.99 | 5000 | 0.3295 | 24.5884 | 19.3433 |
0.0447 | 1.09 | 5500 | 0.3405 | 23.6220 | 17.5668 |
0.0435 | 1.18 | 6000 | 0.3364 | 23.2999 | 17.4353 |
0.0383 | 1.28 | 6500 | 0.3370 | 22.9957 | 17.4831 |
0.0388 | 1.38 | 7000 | 0.3391 | 22.9838 | 17.1123 |
0.0436 | 1.48 | 7500 | 0.3345 | 22.7332 | 17.6745 |
0.0466 | 1.58 | 8000 | 0.3327 | 23.6101 | 17.3994 |
0.0357 | 1.68 | 8500 | 0.3477 | 24.2961 | 17.8121 |
0.0417 | 1.78 | 9000 | 0.3259 | 22.8883 | 16.7115 |
0.0383 | 1.88 | 9500 | 0.3206 | 22.0055 | 16.5859 |
0.0381 | 1.97 | 10000 | 0.3425 | 23.1508 | 16.8192 |
0.0153 | 2.07 | 10500 | 0.3461 | 22.5304 | 16.9807 |
0.0158 | 2.17 | 11000 | 0.3467 | 22.8227 | 16.7115 |
0.0228 | 2.27 | 11500 | 0.3439 | 22.3276 | 16.4244 |
0.0231 | 2.37 | 12000 | 0.3581 | 23.3954 | 16.6756 |
0.0171 | 2.47 | 12500 | 0.3537 | 22.7094 | 16.4304 |
0.0188 | 2.57 | 13000 | 0.3503 | 22.4588 | 16.8072 |
0.0157 | 2.67 | 13500 | 0.3518 | 22.5245 | 16.3826 |
0.0154 | 2.76 | 14000 | 0.3534 | 22.2739 | 16.0715 |
0.0205 | 2.86 | 14500 | 0.3479 | 21.9399 | 16.0237 |
0.0164 | 2.96 | 15000 | 0.3493 | 21.9936 | 16.0057 |
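One thing the log shows is that validation loss and WER do not pick the same checkpoint. The sketch below uses an excerpt of the rows above (it includes the table's lowest-loss and lowest-WER rows) to compare the two selection criteria and to compute the relative WER reduction between the first and last logged steps. Variable names are illustrative.

```python
# Excerpt of the training log above: (step, validation loss, Wer Ortho, Wer)
log = [
    (500,   0.3619, 25.7695, 21.5802),
    (5000,  0.3295, 24.5884, 19.3433),
    (9500,  0.3206, 22.0055, 16.5859),
    (10000, 0.3425, 23.1508, 16.8192),
    (14500, 0.3479, 21.9399, 16.0237),
    (15000, 0.3493, 21.9936, 16.0057),
]

best_by_loss = min(log, key=lambda r: r[1])  # lowest validation loss
best_by_wer = min(log, key=lambda r: r[3])   # lowest normalized WER

first, last = log[0], log[-1]
relative_wer_reduction = (first[3] - last[3]) / first[3]

print("Best by validation loss: step", best_by_loss[0])
print("Best by WER: step", best_by_wer[0])
print(f"Relative WER reduction, step 500 to 15000: {relative_wer_reduction:.1%}")
```

The reported headline numbers correspond to the final step, which is also the lowest-WER row, even though validation loss was lower earlier in training.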
Framework versions
- Transformers 4.34.1
- Pytorch 2.3.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1