metadata
language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_sextusdecimus_virtual_tsfix_medium_1e5
    results: []

scream_sextusdecimus_virtual_tsfix_medium_1e5

This model is a fine-tuned version of openai/whisper-medium on the NbAiLab/ncc_speech dataset. At the final logged step it reports the following results (all metrics are measured on the evaluation set except train_loss, which is the training loss):

  • step: 19999
  • eval_loss: 1.6336
  • train_loss: 0.6795
  • eval_wer: 7.9120
  • eval_cer: 3.4474
  • eval_exact_wer: 7.9120
  • eval_exact_cer: 3.4474
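
The card does not say which implementation produced eval_wer and eval_cer. As a rough illustration only, the sketch below uses the common definition: edit distance between reference and hypothesis, divided by the reference length, expressed as a percentage. The function names and the example sentences are hypothetical, and any text normalization applied before scoring is not modeled here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Error rate in percent; unit is 'word' for WER or 'char' for CER."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    else:
        ref, hyp = list(reference), list(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example pair, not taken from the evaluation set.
reference = "jeg liker å gå på ski"
hypothesis = "jeg liker å gå ski"
wer = error_rate(reference, hypothesis, unit="word")
cer = error_rate(reference, hypothesis, unit="char")
print(f"WER: {wer:.4f}  CER: {cer:.4f}")
```

Under this definition, an eval_wer of 7.9120 would correspond to roughly eight word-level errors per hundred reference words.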

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • lr_scheduler_type: linear
  • per_device_train_batch_size: 16
  • total_train_batch_size_per_node: 64
  • total_train_batch_size: 512
  • total_optimization_steps: 20,000
  • starting_optimization_step: None
  • finishing_optimization_step: 20,000
  • num_train_dataset_workers: 32
  • num_hosts: 8
  • total_num_training_examples: 10,240,000
  • steps_per_epoch: To be computed after first epoch
  • num_beams: None
  • dropout: True
  • bpe_dropout_probability: 0.1
  • activation_dropout_probability: 0.1
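
The batch-size and example-count figures above are mutually consistent, which the following arithmetic sketch checks. It assumes that "node" and "host" refer to the same unit; the number of devices per node is inferred from the listed values and is not stated in the card.

```python
# Values copied from the hyperparameter list.
per_device_train_batch_size = 16
total_train_batch_size_per_node = 64
total_train_batch_size = 512
num_hosts = 8
total_optimization_steps = 20_000
total_num_training_examples = 10_240_000

# Inferred, not stated in the card: devices per node.
devices_per_node = total_train_batch_size_per_node // per_device_train_batch_size

# Assumption: one node per host, so the global batch is per-node batch times hosts.
global_batch = total_train_batch_size_per_node * num_hosts

# Examples consumed over the whole run.
examples_seen = total_train_batch_size * total_optimization_steps

print(devices_per_node, global_batch, examples_seen)
```

With these values the run consumes 512 examples per optimization step, and 20,000 steps account for the listed 10,240,000 training examples.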

Training results

 step  eval_loss  train_loss  eval_wer  eval_cer  eval_exact_wer  eval_exact_cer
    0     5.5890      2.8362   17.4598    5.3906         17.4598          5.3906
 1000     5.2798      1.0896   12.4926    3.8321         12.4926          3.8321
 2000     5.2432      0.9018   11.0351    3.9899         11.0351          3.9899
 3000     4.1719      0.8159    9.8453    3.8173          9.8453          3.8173
 4000     3.0758      0.7799    9.6371    3.8716          9.6371          3.8716
 5000     2.2223      0.7803    9.7264    3.9110          9.7264          3.9110
 6000     2.0574      0.7206    9.5181    3.8864          9.5181          3.8864
 7000     1.7271      0.7088    8.7745    3.7039          8.7745          3.7039
 8000     1.5868      0.7528    8.2391    3.5362          8.2391          3.5362
 9000     1.5781      0.6747    8.2094    3.5313          8.2094          3.5313
10000     1.6658      0.6830    8.1499    3.4277          8.1499          3.4277
11000     1.5514      0.7141    8.6853    3.8814          8.6853          3.8814
12000     1.8042      0.6941    8.5366    3.6792          8.5366          3.6792
13000     1.7561      0.6732    8.6258    3.8666          8.6258          3.8666
14000     1.7517      0.7050    8.2094    3.5066          8.2094          3.5066
15000     1.7413      0.7191    7.8822    3.3389          7.8822          3.3389
16000     1.7014      0.6850    8.0309    3.4178          8.0309          3.4178
17000     1.7205      0.6937    7.8822    3.4524          7.8822          3.4524
18000     1.5928      0.7014    7.8227    3.4425          7.8227          3.4425
19000     1.5883      0.7102    7.9417    3.4573          7.9417          3.4573
19999     1.6336      0.6795    7.9120    3.4474          7.9120          3.4474
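
The final step is not the best row on every metric. The sketch below, which simply re-reads the table above, picks the logged step with the lowest value for each evaluation metric. The exact_wer and exact_cer columns are omitted because they equal eval_wer and eval_cer in every row; the variable names are illustrative.

```python
# (step, eval_loss, eval_wer, eval_cer), copied from the training results table.
rows = [
    (0, 5.5890, 17.4598, 5.3906), (1000, 5.2798, 12.4926, 3.8321),
    (2000, 5.2432, 11.0351, 3.9899), (3000, 4.1719, 9.8453, 3.8173),
    (4000, 3.0758, 9.6371, 3.8716), (5000, 2.2223, 9.7264, 3.9110),
    (6000, 2.0574, 9.5181, 3.8864), (7000, 1.7271, 8.7745, 3.7039),
    (8000, 1.5868, 8.2391, 3.5362), (9000, 1.5781, 8.2094, 3.5313),
    (10000, 1.6658, 8.1499, 3.4277), (11000, 1.5514, 8.6853, 3.8814),
    (12000, 1.8042, 8.5366, 3.6792), (13000, 1.7561, 8.6258, 3.8666),
    (14000, 1.7517, 8.2094, 3.5066), (15000, 1.7413, 7.8822, 3.3389),
    (16000, 1.7014, 8.0309, 3.4178), (17000, 1.7205, 7.8822, 3.4524),
    (18000, 1.5928, 7.8227, 3.4425), (19000, 1.5883, 7.9417, 3.4573),
    (19999, 1.6336, 7.9120, 3.4474),
]

# Lowest value per metric; min() keeps the earliest row on ties.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_wer = min(rows, key=lambda r: r[2])
best_by_cer = min(rows, key=lambda r: r[3])

for name, row in [("eval_loss", best_by_loss),
                  ("eval_wer", best_by_wer),
                  ("eval_cer", best_by_cer)]:
    print(f"lowest {name}: step {row[0]}")
```

In this table the three metrics reach their minimum at different steps, so the choice of checkpoint depends on which metric matters for the intended use.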

Framework versions

  • Transformers 4.30.0.dev0
  • Datasets 2.12.1.dev0
  • Tokenizers 0.13.3