
wav2vec2-large-mms-1b-livvi-karelian-CodeSwitching

This model is a fine-tuned version of facebook/mms-1b-all for Livvi-Karelian code-switched speech recognition; the training dataset is not documented in this card. It achieves the following results on the evaluation set (a minimal inference sketch follows the results):

  • Loss: 0.3113
  • WER: 0.4087
  • CER: 0.0910
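
The snippet below is a minimal inference sketch, assuming the checkpoint is published under the repo id in the title and that its bundled processor matches the fine-tuned vocabulary; the audio path is a placeholder.

```python
# Minimal inference sketch (assumptions: repo id, 16 kHz mono audio file).
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "wav2vec2-large-mms-1b-livvi-karelian-CodeSwitching"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS models expect 16 kHz mono input.
speech, _ = librosa.load("example.wav", sr=16_000, mono=True)  # placeholder path
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```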

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative TrainingArguments sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 10000
  • mixed_precision_training: Native AMP
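
As a reading aid, the values above roughly correspond to the transformers TrainingArguments sketched below; the actual training script, dataset loading, data collator, and compute_metrics function are not included in this card, and the output_dir and save cadence are assumptions.

```python
# Illustrative reconstruction of the reported hyperparameters; not the authors' script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="wav2vec2-large-mms-1b-livvi-karelian-CodeSwitching",  # assumed
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # 8 * 4 = total train batch size of 32
    max_steps=10_000,
    warmup_steps=500,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="steps",
    eval_steps=500,                  # matches the 500-step cadence in the results table
    save_steps=500,                  # assumption
)
```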

Training results

Evaluation was run every 500 steps; a sketch of how the WER/CER metrics are typically computed follows the table.

| Training Loss | Epoch   | Step  | Validation Loss | WER    | CER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|:------:|
| 1.4219        | 4.5351  | 500   | 0.4570          | 0.5677 | 0.1335 |
| 0.5951        | 9.0703  | 1000  | 0.4008          | 0.5142 | 0.1186 |
| 0.5314        | 13.6054 | 1500  | 0.3725          | 0.4942 | 0.1126 |
| 0.4916        | 18.1406 | 2000  | 0.3626          | 0.4692 | 0.1067 |
| 0.4563        | 22.6757 | 2500  | 0.3465          | 0.4540 | 0.1035 |
| 0.4331        | 27.2109 | 3000  | 0.3310          | 0.4455 | 0.1010 |
| 0.4129        | 31.7460 | 3500  | 0.3283          | 0.4516 | 0.1019 |
| 0.394         | 36.2812 | 4000  | 0.3289          | 0.4482 | 0.0994 |
| 0.3715        | 40.8163 | 4500  | 0.3203          | 0.4374 | 0.0985 |
| 0.3646        | 45.3515 | 5000  | 0.3109          | 0.4327 | 0.0966 |
| 0.3508        | 49.8866 | 5500  | 0.3136          | 0.4276 | 0.0958 |
| 0.3376        | 54.4218 | 6000  | 0.3198          | 0.4246 | 0.0950 |
| 0.3283        | 58.9569 | 6500  | 0.3203          | 0.4232 | 0.0943 |
| 0.3222        | 63.4921 | 7000  | 0.3126          | 0.4134 | 0.0932 |
| 0.3104        | 68.0272 | 7500  | 0.3140          | 0.4168 | 0.0933 |
| 0.3026        | 72.5624 | 8000  | 0.3136          | 0.4110 | 0.0920 |
| 0.3003        | 77.0975 | 8500  | 0.3137          | 0.4175 | 0.0926 |
| 0.2896        | 81.6327 | 9000  | 0.3150          | 0.4107 | 0.0912 |
| 0.2885        | 86.1678 | 9500  | 0.3110          | 0.4090 | 0.0914 |
| 0.2869        | 90.7029 | 10000 | 0.3113          | 0.4087 | 0.0910 |
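
For reference, here is a minimal sketch of how WER and CER are typically computed with the evaluate library; the example strings are placeholders and the card does not include the actual evaluation code.

```python
# Hypothetical WER/CER computation; predictions/references are placeholder strings.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["an example hypothesis"]  # hypothetical decoded transcription
references = ["an example reference"]    # hypothetical ground-truth transcript

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```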

Framework versions

  • Transformers 4.41.0.dev0
  • PyTorch 2.2.2
  • Datasets 2.19.0
  • Tokenizers 0.19.1