---
language:
  - hu
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Whisper large-v2 CV18 Hu
    results: []
datasets:
  - fsicoli/common_voice_18_0
  - google/fleurs
pipeline_tag: automatic-speech-recognition
---

# Whisper large-v2 CV18 Hu

This model is a fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on the fsicoli/common_voice_18_0 dataset. It achieves the following results on the google/fleurs evaluation set:

- Loss: 0.3493
- Wer Ortho: 21.9936
- Wer: 16.0057
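
To try the model, a minimal transcription sketch with the Transformers ASR pipeline is shown below; the repo id and the audio file name are placeholders, not values taken from this card:

```python
# A minimal sketch using the Transformers ASR pipeline. The repo id and the
# audio file name are placeholders, not values taken from this card.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="sarpba/whisper-large-v2-cv18-hu",  # hypothetical repo id
    torch_dtype=torch.float16,
    device="cuda:0",    # use "cpu" if no GPU is available
    chunk_length_s=30,  # chunk long audio into 30 s windows
)

result = asr(
    "sample_hu.wav",  # placeholder audio file
    generate_kwargs={"language": "hu", "task": "transcribe"},
)
print(result["text"])
```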

## Aggregated metrics

google/fleurs_hu_hu_test:

- Average WER: 21.75%
- Average CER: 6.10%
- Average normalized WER: 14.73%
- Average normalized CER: 4.73%

common_voice_17_0_hu_test (not a valid test: this test split was included in the training data):

- Average WER: 1.16%
- Average CER: 0.22%
- Average normalized WER: 0.79%
- Average normalized CER: 0.16%
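
The card does not document how the normalized metrics were computed; the sketch below is one plausible recipe using jiwer and Whisper's BasicTextNormalizer, which is an assumption rather than the author's confirmed procedure:

```python
# A minimal evaluation sketch. Using BasicTextNormalizer for the
# "normalized" metrics is an assumption; the card does not document
# the exact normalization that was applied.
import jiwer
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

normalizer = BasicTextNormalizer()

references = ["Jó napot kívánok!"]  # ground-truth transcripts
hypotheses = ["jó napot kivánok"]   # model outputs

wer = jiwer.wer(references, hypotheses)
cer = jiwer.cer(references, hypotheses)
norm_wer = jiwer.wer([normalizer(r) for r in references],
                     [normalizer(h) for h in hypotheses])
norm_cer = jiwer.cer([normalizer(r) for r in references],
                     [normalizer(h) for h in hypotheses])
print(f"WER {wer:.2%}  CER {cer:.2%}  nWER {norm_wer:.2%}  nCER {norm_cer:.2%}")
```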

Quantized model results:

| Model         | WER   | CER  | Normalized WER | Normalized CER | Database      | Split | Runtime |
|:--------------|------:|-----:|---------------:|---------------:|:--------------|:------|--------:|
| int8_bfloat16 | 21.49 | 5.93 |          16.04 |           6.21 | google/fleurs | test  |  550.18 |
| bfloat16      | 21.33 | 5.87 |          15.91 |           6.15 | google/fleurs | test  |  593.96 |
| int8          | 21.01 | 5.63 |          15.38 |           5.88 | google/fleurs | test  |  668.91 |
| int8_float32  | 21.01 | 5.63 |          15.38 |           5.88 | google/fleurs | test  |  669.81 |
| int8_float16  | 20.96 | 5.65 |          15.31 |           5.91 | google/fleurs | test  |  570.11 |
| float16       | 20.92 | 5.64 |          15.24 |           5.90 | google/fleurs | test  |  589.29 |
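
The quantization labels above (int8_float16, bfloat16, etc.) match CTranslate2's compute types, so these variants were presumably converted for use with faster-whisper. A loading sketch under that assumption, with a placeholder model path:

```python
# A sketch assuming the quantized variants are CTranslate2 conversions
# usable with faster-whisper; the model path below is a placeholder.
from faster_whisper import WhisperModel

model = WhisperModel(
    "whisper-large-v2-cv18-hu-ct2",  # hypothetical local path or repo id
    device="cuda",
    compute_type="int8_float16",     # one of the compute types in the table above
)

segments, info = model.transcribe("sample_hu.wav", language="hu")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```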

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- num_epochs: 3
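
For reference, a sketch of how these settings might map onto Seq2SeqTrainingArguments; the output directory and any field not listed above are illustrative assumptions, not values from the original run:

```python
# A sketch mapping the listed hyperparameters onto Seq2SeqTrainingArguments.
# output_dir is an assumption, not taken from this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v2-cv18-hu",  # hypothetical
    learning_rate=5e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size: 16
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=250,
    num_train_epochs=3,
    # Adam betas/epsilon below match the values listed above.
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```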

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer Ortho | Wer     |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:-------:|
| 0.1543        | 0.1   | 500   | 0.3619          | 25.7695   | 21.5802 |
| 0.1336        | 0.2   | 1000  | 0.3661          | 26.4197   | 21.9212 |
| 0.1358        | 0.3   | 1500  | 0.3516          | 25.4414   | 20.7548 |
| 0.1165        | 0.39  | 2000  | 0.3431          | 25.3937   | 20.3601 |
| 0.0959        | 0.49  | 2500  | 0.3581          | 26.6345   | 20.4438 |
| 0.1045        | 0.59  | 3000  | 0.3427          | 25.9127   | 19.9653 |
| 0.099         | 0.69  | 3500  | 0.3380          | 25.3937   | 19.6902 |
| 0.1034        | 0.79  | 4000  | 0.3412          | 24.5765   | 19.0083 |
| 0.0919        | 0.89  | 4500  | 0.3370          | 25.0119   | 19.3672 |
| 0.077         | 0.99  | 5000  | 0.3295          | 24.5884   | 19.3433 |
| 0.0447        | 1.09  | 5500  | 0.3405          | 23.6220   | 17.5668 |
| 0.0435        | 1.18  | 6000  | 0.3364          | 23.2999   | 17.4353 |
| 0.0383        | 1.28  | 6500  | 0.3370          | 22.9957   | 17.4831 |
| 0.0388        | 1.38  | 7000  | 0.3391          | 22.9838   | 17.1123 |
| 0.0436        | 1.48  | 7500  | 0.3345          | 22.7332   | 17.6745 |
| 0.0466        | 1.58  | 8000  | 0.3327          | 23.6101   | 17.3994 |
| 0.0357        | 1.68  | 8500  | 0.3477          | 24.2961   | 17.8121 |
| 0.0417        | 1.78  | 9000  | 0.3259          | 22.8883   | 16.7115 |
| 0.0383        | 1.88  | 9500  | 0.3206          | 22.0055   | 16.5859 |
| 0.0381        | 1.97  | 10000 | 0.3425          | 23.1508   | 16.8192 |
| 0.0153        | 2.07  | 10500 | 0.3461          | 22.5304   | 16.9807 |
| 0.0158        | 2.17  | 11000 | 0.3467          | 22.8227   | 16.7115 |
| 0.0228        | 2.27  | 11500 | 0.3439          | 22.3276   | 16.4244 |
| 0.0231        | 2.37  | 12000 | 0.3581          | 23.3954   | 16.6756 |
| 0.0171        | 2.47  | 12500 | 0.3537          | 22.7094   | 16.4304 |
| 0.0188        | 2.57  | 13000 | 0.3503          | 22.4588   | 16.8072 |
| 0.0157        | 2.67  | 13500 | 0.3518          | 22.5245   | 16.3826 |
| 0.0154        | 2.76  | 14000 | 0.3534          | 22.2739   | 16.0715 |
| 0.0205        | 2.86  | 14500 | 0.3479          | 21.9399   | 16.0237 |
| 0.0164        | 2.96  | 15000 | 0.3493          | 21.9936   | 16.0057 |

### Framework versions

- Transformers 4.34.1
- Pytorch 2.3.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1