metadata
language:
  - 'no'
license: apache-2.0
tags:
  - audio
  - asr
  - automatic-speech-recognition
  - hf-asr-leaderboard
model-index:
  - name: scream_sextusdecimus_virtual_tsfix_small
    results: []

scream_sextusdecimus_virtual_tsfix_small

This model is a fine-tuned version of openai/whisper-small on the NbAiLab/ncc_speech dataset.
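Since the card does not yet include a usage example, the following is a minimal sketch of how a fine-tuned Whisper checkpoint of this kind is typically loaded with the Transformers `pipeline` API. The helper names `build_generate_kwargs` and `transcribe` are hypothetical, and the default `model_id` is the base model named above, used only as a stand-in: replace it with the repository id of this fine-tuned checkpoint. The Norwegian language setting and the beam width of 5 are taken from the metadata and hyperparameters on this card; treating `num_beams` as a decoding setting is an assumption.

```python
def build_generate_kwargs(num_beams=5):
    """Return decoding options for Norwegian transcription (hypothetical helper).

    num_beams=5 mirrors the value listed on this card; whether it was used
    for evaluation decoding is an assumption.
    """
    return {"language": "norwegian", "task": "transcribe", "num_beams": num_beams}


def transcribe(audio_path, model_id="openai/whisper-small"):
    """Transcribe one audio file and return the recognized text.

    model_id defaults to the base model as a stand-in; pass the repository id
    of the fine-tuned checkpoint instead.
    """
    # Imported lazily so build_generate_kwargs works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, generate_kwargs=build_generate_kwargs())
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the model on first use and needs an audio backend such as ffmpeg to decode the file; `sample.wav` is a placeholder path.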

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • lr_scheduler_type: linear
  • per_device_train_batch_size: 32
  • total_train_batch_size_per_node: 128
  • total_train_batch_size: 1024
  • total_optimization_steps: 20,000
  • starting_optimization_step: None
  • finishing_optimization_step: 20,000
  • num_train_dataset_workers: 32
  • num_hosts: 8
  • total_num_training_examples: 20,480,000
  • steps_per_epoch: To be computed after first epoch
  • num_beams: 5
  • dropout: True
  • bpe_dropout_probability: 0.1
  • activation_dropout_probability: 0.1
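The batch-size and example-count figures above are consistent with each other, which the short check below illustrates. It assumes no gradient accumulation and that every host runs the same number of devices; the devices-per-host figure is inferred from the listed values and is not stated on the card.

```python
# Values copied from the hyperparameter list above.
per_device_train_batch_size = 32
total_train_batch_size_per_node = 128
total_train_batch_size = 1024
num_hosts = 8
total_optimization_steps = 20_000
total_num_training_examples = 20_480_000

# Inferred, not stated on the card: devices attached to each host.
devices_per_host = total_train_batch_size_per_node // per_device_train_batch_size

# Global batch = per-node batch times number of hosts (assumes no gradient accumulation).
global_batch = total_train_batch_size_per_node * num_hosts

# Examples consumed over the full run = global batch times optimization steps.
examples_seen = global_batch * total_optimization_steps

print(devices_per_host)  # 4
print(global_batch == total_train_batch_size)  # True
print(examples_seen == total_num_training_examples)  # True
```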

Training results

step  eval_loss  train_loss  eval_wer  eval_cer  eval_exact_wer  eval_exact_cer
0     1.2807     3.0725      196.6092  157.4275  196.6092        157.4275
1000  0.5902     1.0592      15.1695   4.8382    15.1695         4.8382
2000  0.4240     0.8640      11.3623   3.9308    11.3623         3.9308
3000  0.4213     0.7930      9.4587    3.3537    9.4587          3.3537
4000  0.4353     0.7986      9.3694    3.5263    9.3694          3.5263
5000  0.4697     0.7580      9.7858    4.1478    9.7858          4.1478
6000  0.4535     0.7003      10.0238   4.2119    10.0238         4.2119
7000  0.4608     0.7296      8.8638    3.4228    8.8638          3.4228
8000  0.3902     0.7053      8.9233    3.6003    8.9233          3.6003
9000  0.3575     0.7124      9.3992    3.9702    9.3992          3.9702
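The metrics do not all reach their minimum at the same checkpoint, so the choice of "best" step depends on which metric matters. As a small sketch, the snippet below copies the logged rows into a list (the `eval_exact_*` columns are omitted because they match `eval_wer` and `eval_cer` in every row) and picks the best step per metric. The variable names are illustrative only.

```python
# (step, eval_loss, train_loss, eval_wer, eval_cer), copied from the results above.
results = [
    (0, 1.2807, 3.0725, 196.6092, 157.4275),
    (1000, 0.5902, 1.0592, 15.1695, 4.8382),
    (2000, 0.4240, 0.8640, 11.3623, 3.9308),
    (3000, 0.4213, 0.7930, 9.4587, 3.3537),
    (4000, 0.4353, 0.7986, 9.3694, 3.5263),
    (5000, 0.4697, 0.7580, 9.7858, 4.1478),
    (6000, 0.4535, 0.7003, 10.0238, 4.2119),
    (7000, 0.4608, 0.7296, 8.8638, 3.4228),
    (8000, 0.3902, 0.7053, 8.9233, 3.6003),
    (9000, 0.3575, 0.7124, 9.3992, 3.9702),
]

# Select the row that minimizes each evaluation metric.
best_by_loss = min(results, key=lambda row: row[1])
best_by_wer = min(results, key=lambda row: row[3])
best_by_cer = min(results, key=lambda row: row[4])

print("lowest eval_loss:", best_by_loss[0], best_by_loss[1])  # 9000 0.3575
print("lowest eval_wer: ", best_by_wer[0], best_by_wer[3])    # 7000 8.8638
print("lowest eval_cer: ", best_by_cer[0], best_by_cer[4])    # 3000 3.3537
```

Among the logged checkpoints, step 9000 has the lowest evaluation loss, while word error rate is lowest at step 7000 and character error rate at step 3000.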

Framework versions

  • Transformers 4.30.0.dev0
  • Datasets 2.12.1.dev0
  • Tokenizers 0.13.3