whisper-large-cit-do0-wd0

This model is a fine-tuned version of openai/whisper-large-v3 on the SF 200 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6895
  • Wer: 34.0961 (word error rate, %)
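
A minimal inference sketch (not part of the original card), assuming the checkpoint is published on the Hub as Makkoen/whisper-large-cit-do0-wd0 and that a local file audio.wav exists; adjust the repository id, device, and dtype to your setup.

```python
import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Build an ASR pipeline around the fine-tuned Whisper checkpoint.
asr = pipeline(
    "automatic-speech-recognition",
    model="Makkoen/whisper-large-cit-do0-wd0",  # assumed repository id
    torch_dtype=torch.float16 if device.startswith("cuda") else torch.float32,
    device=device,
)

# Whisper operates on 30-second windows; chunking covers longer recordings.
result = asr("audio.wav", chunk_length_s=30)
print(result["text"])
```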

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 1e-06
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 200
  • mixed_precision_training: Native AMP
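
The list above maps onto Hugging Face Seq2SeqTrainingArguments. A hedged sketch under that assumption follows; the output directory is a placeholder, dataset preparation, the data collator, and compute_metrics are omitted, and the total train batch size of 16 results from 4 per device x 4 accumulation steps on a single GPU.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameters listed in this card.
# The Adam betas (0.9, 0.999) and epsilon (1e-08) are the library defaults.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-cit-do0-wd0",  # placeholder
    learning_rate=1e-6,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,             # 4 x 4 = effective batch of 16
    warmup_steps=100,
    max_steps=200,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                                 # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=10,                             # matches the 10-step eval cadence below
    predict_with_generate=True,
)
```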

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer     |
|:-------------:|:-------:|:----:|:---------------:|:-------:|
| 1.1267        | 0.8889  | 10   | 1.1143          | 48.9703 |
| 1.0863        | 1.7778  | 20   | 1.0078          | 40.7323 |
| 0.9336        | 2.6667  | 30   | 0.8691          | 38.9016 |
| 0.7543        | 3.5556  | 40   | 0.7925          | 34.0961 |
| 0.7023        | 4.4444  | 50   | 0.7212          | 35.0114 |
| 0.6007        | 5.3333  | 60   | 0.6558          | 32.9519 |
| 0.5085        | 6.2222  | 70   | 0.6167          | 31.3501 |
| 0.4119        | 7.1111  | 80   | 0.5898          | 33.1808 |
| 0.3749        | 8.0     | 90   | 0.5723          | 32.9519 |
| 0.2971        | 8.8889  | 100  | 0.5698          | 33.1808 |
| 0.2621        | 9.7778  | 110  | 0.5747          | 32.7231 |
| 0.2108        | 10.6667 | 120  | 0.5854          | 31.8078 |
| 0.1793        | 11.5556 | 130  | 0.5977          | 32.4943 |
| 0.1488        | 12.4444 | 140  | 0.6118          | 31.3501 |
| 0.1199        | 13.3333 | 150  | 0.6255          | 33.4096 |
| 0.1135        | 14.2222 | 160  | 0.6416          | 34.7826 |
| 0.097         | 15.1111 | 170  | 0.6606          | 34.5538 |
| 0.0823        | 16.0    | 180  | 0.6738          | 33.4096 |
| 0.0767        | 16.8889 | 190  | 0.6860          | 33.4096 |
| 0.0713        | 17.7778 | 200  | 0.6895          | 34.0961 |
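
Wer in the table is the word error rate expressed as a percentage. Below is a small sketch of how such a score is typically computed with the evaluate library; the example strings are illustrative only, not taken from the SF 200 dataset.

```python
import evaluate

wer_metric = evaluate.load("wer")

# Illustrative decoded predictions and reference transcripts.
predictions = ["the quick brown fox", "hello world"]
references = ["the quick brown fox jumps", "hello word"]

# evaluate returns a fraction; the card reports it scaled to a percentage.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```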

Framework versions

  • Transformers 4.41.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.19.1
  • Tokenizers 0.19.1