krishivoice

This model is a fine-tuned version of facebook/wav2vec2-base; the training dataset is not specified. It achieves the following results on the evaluation set at the end of training (step 11000, epoch 100):

  • Loss: 0.6148
  • WER (word error rate): 0.2809
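
For readers unfamiliar with the metric, the sketch below shows one common way to compute a word error rate: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The card does not say which implementation produced the 0.2809 figure, so the function name `word_error_rate` and the example sentences are hypothetical, and corpus-level tools usually pool errors over all utterances rather than scoring a single pair.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one deleted word out of five reference words.
example = word_error_rate("turn on the water pump", "turn on water pump")
print(f"{example:.2f}")  # 0.20
```

Under this definition, a WER of 0.2809 corresponds to roughly 28 word-level errors per 100 reference words.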

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 100
  • mixed_precision_training: Native AMP
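
To make the `linear` scheduler and the 200 warmup steps concrete, here is a minimal sketch of the learning rate such a schedule would produce at a given optimizer step: a linear ramp from 0 to the peak rate during warmup, then a linear decay to 0. The total of 11000 steps is an assumption taken from the last row of the training results, and `linear_lr` is a hypothetical helper, not the code used for this run.

```python
PEAK_LR = 1e-4        # learning_rate from the card
WARMUP_STEPS = 200    # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 11000   # assumption: final step reported in the training results


def linear_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak rate to 0 at the last step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 100, 200, 5600, 11000):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 200, is back to half the peak by step 5600, and reaches 0 at step 11000.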

Training results

| Training Loss | Epoch    | Step  | Validation Loss | WER    |
|---------------|----------|-------|-----------------|--------|
| 0.3768        | 9.0909   | 1000  | 0.5920          | 0.4378 |
| 0.1392        | 18.1818  | 2000  | 0.5513          | 0.3537 |
| 0.0823        | 27.2727  | 3000  | 0.5021          | 0.3260 |
| 0.0566        | 36.3636  | 4000  | 0.5473          | 0.3163 |
| 0.0886        | 45.4545  | 5000  | 0.5637          | 0.3098 |
| 0.0368        | 54.5455  | 6000  | 0.5558          | 0.3004 |
| 0.0264        | 63.6364  | 7000  | 0.6315          | 0.2919 |
| 0.0214        | 72.7273  | 8000  | 0.5927          | 0.2902 |
| 0.0124        | 81.8182  | 9000  | 0.6936          | 0.2874 |
| 0.0132        | 90.9091  | 10000 | 0.6311          | 0.2858 |
| 0.0094        | 100.0    | 11000 | 0.6148          | 0.2809 |
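
The results above also allow a few quick derivations, sketched below: the number of optimizer steps per epoch, a rough bound on the training set size, and which logged checkpoint is best by validation loss versus by WER. The size estimate assumes a single device and no gradient accumulation, neither of which the card states, so treat it as an approximation. The variable names are illustrative only.

```python
# (training_loss, epoch, step, validation_loss, wer), copied from the results above
ROWS = [
    (0.3768, 9.0909, 1000, 0.5920, 0.4378),
    (0.1392, 18.1818, 2000, 0.5513, 0.3537),
    (0.0823, 27.2727, 3000, 0.5021, 0.3260),
    (0.0566, 36.3636, 4000, 0.5473, 0.3163),
    (0.0886, 45.4545, 5000, 0.5637, 0.3098),
    (0.0368, 54.5455, 6000, 0.5558, 0.3004),
    (0.0264, 63.6364, 7000, 0.6315, 0.2919),
    (0.0214, 72.7273, 8000, 0.5927, 0.2902),
    (0.0124, 81.8182, 9000, 0.6936, 0.2874),
    (0.0132, 90.9091, 10000, 0.6311, 0.2858),
    (0.0094, 100.0, 11000, 0.6148, 0.2809),
]
TRAIN_BATCH_SIZE = 8  # train_batch_size from the card

# 11000 steps over 100 epochs gives the number of steps per epoch.
final_epoch, final_step = ROWS[-1][1], ROWS[-1][2]
steps_per_epoch = round(final_step / final_epoch)

# Assumption: one device, no gradient accumulation, last batch possibly partial.
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1

best_by_loss = min(ROWS, key=lambda row: row[3])
best_by_wer = min(ROWS, key=lambda row: row[4])

print(f"steps per epoch: {steps_per_epoch}")
print(f"estimated training examples: {min_examples} to {max_examples}")
print(f"lowest validation loss: {best_by_loss[3]} at step {best_by_loss[2]}")
print(f"lowest WER: {best_by_wer[4]} at step {best_by_wer[2]}")
```

The two selection criteria disagree here: validation loss is lowest at step 3000, while WER keeps improving through step 11000, so which checkpoint counts as best depends on the metric chosen.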

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 94.4M params (Safetensors, tensor type F32)