Whisper-small-ru-v2

This model is a fine-tuned version of openai/whisper-small on the Russian subset of the Common Voice 15 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1329
  • WER: 12.6750
  • CER: 3.7305
  • Learning Rate: 0.0000
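
For readers unfamiliar with the two error metrics, the following is a minimal pure-Python sketch of how word error rate (WER) and character error rate (CER) are commonly computed. It assumes the values reported here are percentages and that CER counts spaces as characters; neither is stated in this card, and the function names are illustrative rather than the evaluation code actually used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; the reference must contain at least one word."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent; spaces count as characters (assumption)."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# One wrong word out of three, and one wrong character out of fifteen.
example_wer = wer("привет как дела", "привет как дело")
example_cer = cer("привет как дела", "привет как дело")
```

A corpus-level score is usually obtained by summing the edit distances and the reference lengths over all utterances before dividing, rather than by averaging per-utterance rates.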

Model description

Same as openai/whisper-small.

Intended uses & limitations

Same as openai/whisper-small.
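
As a usage illustration, the checkpoint can presumably be loaded like any other Whisper model through the 🤗 Transformers `pipeline` API. The sketch below assumes `transformers`, `torch`, and ffmpeg (for decoding audio files) are installed; the helper name `transcribe` and the audio path are hypothetical.

```python
MODEL_ID = "artyomboyko/whisper-small-ru-v2"


def transcribe(audio_path: str) -> str:
    """Transcribe a Russian audio file with the fine-tuned checkpoint."""
    # Imported lazily so that importing this module stays cheap.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # Whisper operates on 30-second windows
    )
    result = asr(
        audio_path,
        # Pin the language and task instead of relying on auto-detection.
        generate_kwargs={"language": "russian", "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("sample.wav")` (a hypothetical file) downloads the weights on first use and returns the recognized text as a string.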

Training and evaluation data

Fine-tuned on the Russian subset of the Common Voice 15 dataset.

Training procedure

Training followed the article "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers".
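
The article drives training through `Seq2SeqTrainer`, so the hyperparameters listed in this card would map onto `Seq2SeqTrainingArguments` roughly as sketched below. This is a reconstruction, not the author's script: `fp16=True` is an assumed reading of "Native AMP", and the helper `build_training_args` and its `output_dir` argument are hypothetical.

```python
# Hyperparameters from this card, keyed by Seq2SeqTrainingArguments parameter names.
training_kwargs = {
    "learning_rate": 5e-8,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 250,
    "max_steps": 15000,
    "fp16": True,  # assumption: "Native AMP" taken to mean fp16 mixed precision
}


def build_training_args(output_dir: str):
    """Build the arguments object; requires transformers and, for fp16, a GPU."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **training_kwargs)
```

Keeping the values in a plain dictionary makes them easy to inspect or log before the trainer is constructed.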

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-08
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 250
  • training_steps: 15000
  • mixed_precision_training: Native AMP
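
Two of these values follow from the others, which a short calculation can make explicit. The sketch below assumes a single training device (consistent with 16 × 2 = 32) and uses the usual linear warmup-then-decay formula; the function names are illustrative.

```python
BASE_LR = 5e-8
WARMUP_STEPS = 250
TOTAL_STEPS = 15000


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accumulation * num_devices


def linear_lr(step):
    """Linear warmup to BASE_LR, then linear decay to zero at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


# Print the rate with four decimals and in scientific notation for comparison.
for step in (0, 250, 7625, 15000):
    print(step, f"{linear_lr(step):.4f}", f"{linear_lr(step):.2e}")
```

Even the peak rate of 5e-08 rounds to 0.0000 at four decimal places, which is likely why the learning rate is reported as 0.0000 throughout this card.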

Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER | Learning Rate |
|---|---|---|---|---|---|---|
| 0.0661 | 0.09 | 500 | 0.1358 | 12.9097 | 3.8217 | 0.0000 |
| 0.0616 | 0.17 | 1000 | 0.1357 | 12.9620 | 3.8949 | 0.0000 |
| 0.0601 | 0.26 | 1500 | 0.1357 | 12.8795 | 3.8225 | 0.0000 |
| 0.0666 | 0.35 | 2000 | 0.1353 | 12.9481 | 3.8871 | 0.0000 |
| 0.0669 | 0.43 | 2500 | 0.1352 | 12.8284 | 3.8283 | 0.0000 |
| 0.0665 | 0.52 | 3000 | 0.1351 | 12.8203 | 3.7833 | 0.0000 |
| 0.0649 | 0.61 | 3500 | 0.1349 | 12.8098 | 3.7824 | 0.0000 |
| 0.0607 | 0.69 | 4000 | 0.1347 | 12.8110 | 3.8105 | 0.0000 |
| 0.0636 | 0.78 | 4500 | 0.1345 | 12.7994 | 3.7893 | 0.0000 |
| 0.063 | 0.87 | 5000 | 0.1342 | 12.8319 | 3.8084 | 0.0000 |
| 0.0589 | 0.95 | 5500 | 0.1341 | 12.8807 | 3.8551 | 0.0000 |
| 0.0734 | 1.04 | 6000 | 0.1341 | 12.7691 | 3.7604 | 0.0000 |
| 0.0577 | 1.13 | 6500 | 0.1340 | 12.7645 | 3.7602 | 0.0000 |
| 0.052 | 1.21 | 7000 | 0.1340 | 12.7610 | 3.7655 | 0.0000 |
| 0.0626 | 1.3 | 7500 | 0.1339 | 12.7657 | 3.7593 | 0.0000 |
| 0.0617 | 1.39 | 8000 | 0.1338 | 12.7912 | 3.8268 | 0.0000 |
| 0.063 | 1.47 | 8500 | 0.1337 | 12.7343 | 3.7573 | 0.0000 |
| 0.0668 | 1.56 | 9000 | 0.1336 | 12.7308 | 3.7198 | 0.0000 |
| 0.0634 | 1.65 | 9500 | 0.1335 | 12.7215 | 3.7400 | 0.0000 |
| 0.0604 | 1.73 | 10000 | 0.1333 | 12.7192 | 3.7515 | 0.0000 |
| 0.0707 | 1.82 | 10500 | 0.1333 | 12.7052 | 3.7568 | 0.0000 |
| 0.0639 | 1.91 | 11000 | 0.1332 | 12.6983 | 3.7617 | 0.0000 |
| 0.0617 | 1.99 | 11500 | 0.1331 | 12.6936 | 3.7402 | 0.0000 |
| 0.0601 | 2.08 | 12000 | 0.1330 | 12.6901 | 3.7586 | 0.0000 |
| 0.0632 | 2.17 | 12500 | 0.1330 | 12.6785 | 3.7279 | 0.0000 |
| 0.0626 | 2.25 | 13000 | 0.1330 | 12.6808 | 3.7333 | 0.0000 |
| 0.066 | 2.34 | 13500 | 0.1329 | 12.6704 | 3.7512 | 0.0000 |
| 0.0674 | 2.42 | 14000 | 0.1329 | 12.6599 | 3.7384 | 0.0000 |
| 0.0637 | 2.51 | 14500 | 0.1329 | 12.6797 | 3.7428 | 0.0000 |
| 0.0641 | 2.6 | 15000 | 0.1329 | 12.6750 | 3.7305 | 0.0000 |

Framework versions

  • Transformers 4.36.0.dev0
  • Pytorch 2.1.1+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 242M params (Safetensors, tensor type F32)