lr-5e5-model

This model is a fine-tuned version of anvitamanne/base-model. The training dataset is not specified (the card's dataset field is None). At the final evaluation step (step 17000), it achieves the following results on the evaluation set:

  • Loss: 540.9777
  • WER (word error rate): 0.3898
  • CER (character error rate): 0.1646
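
For readers unfamiliar with the two error rates, the following is a minimal sketch of how WER and CER are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted in words or characters respectively. The function names are illustrative, and this is not necessarily the code that produced the numbers above; evaluation libraries typically also aggregate edits over the whole evaluation set rather than scoring one utterance at a time.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for a single utterance (reference must be non-empty)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for a single utterance (reference must be non-empty)."""
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from the evaluation set.
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"
    print(f"WER = {wer(ref, hyp):.4f}, CER = {cer(ref, hyp):.4f}")
```

Under this definition, a WER of 0.3898 corresponds to roughly 39 word-level edits per 100 reference words.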

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 15
  • mixed_precision_training: Native AMP
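
The hyperparameters above can be read together as follows. This is a minimal sketch in plain Python, not the training script: it reproduces the total batch size (8 × 2 = 16, which is consistent with a single device) and the shape of a linear schedule with 10% warmup. The total step count is an estimate, not a value stated in this card: the results table reaches step 17000 at epoch 14.64, which suggests roughly 17,400 optimizer steps over 15 epochs.

```python
import math

LEARNING_RATE = 5e-5      # learning_rate
TRAIN_BATCH_SIZE = 8      # train_batch_size (per device)
GRAD_ACCUM_STEPS = 2      # gradient_accumulation_steps
WARMUP_RATIO = 0.1        # lr_scheduler_warmup_ratio

# ASSUMPTION: estimated from the results table (17000 steps at epoch 14.64).
ESTIMATED_TOTAL_STEPS = 17_400


def effective_batch_size(per_device_batch: int, grad_accum_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step: int, total_steps: int,
              peak_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("total_train_batch_size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 1_000, 2_000, 10_000, 17_000):
        lr = linear_lr(step, ESTIMATED_TOTAL_STEPS)
        print(f"step {step:>6}: lr = {lr:.3e}")
```

Under these assumptions the learning rate peaks near step 1,740 and has decayed almost to zero by the last logged evaluation at step 17000.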

Training results

Training Loss | Epoch | Step  | Validation Loss | WER    | CER
324.3731      | 0.86  | 1000  | 509.3808        | 0.4014 | 0.1657
323.4149      | 1.72  | 2000  | 495.5074        | 0.4006 | 0.1639
324.4118      | 2.58  | 3000  | 503.3999        | 0.4025 | 0.1647
312.5412      | 3.44  | 4000  | 500.1373        | 0.4039 | 0.1656
298.6976      | 4.3   | 5000  | 501.8691        | 0.3958 | 0.1638
303.839       | 5.17  | 6000  | 511.4516        | 0.3931 | 0.1640
301.297       | 6.03  | 7000  | 512.8284        | 0.3999 | 0.1663
296.7412      | 6.89  | 8000  | 517.9861        | 0.3989 | 0.1668
310.3565      | 7.75  | 9000  | 519.5070        | 0.3960 | 0.1647
294.8242      | 8.61  | 10000 | 531.7615        | 0.3987 | 0.1661
278.929       | 9.47  | 11000 | 534.0803        | 0.3892 | 0.1636
287.4352      | 10.33 | 12000 | 533.1113        | 0.3911 | 0.1636
294.2136      | 11.19 | 13000 | 532.6003        | 0.3929 | 0.1647
289.0024      | 12.05 | 14000 | 537.3076        | 0.3921 | 0.1654
284.6558      | 12.91 | 15000 | 537.4019        | 0.3909 | 0.1648
283.6182      | 13.78 | 16000 | 539.5662        | 0.3913 | 0.1649
280.4244      | 14.64 | 17000 | 540.9777        | 0.3898 | 0.1646
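
The headline results at the top of this card match the last row of the table (step 17000), which is not the best row on any of the three evaluation metrics. The sketch below, using only the numbers logged above, shows one way to compare checkpoints; the `EvalRow` type and variable names are illustrative, and whether intermediate checkpoints were saved or published is not stated in this card.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float
    cer: float


# Evaluation rows copied from the training results table.
ROWS = [
    EvalRow(1000, 509.3808, 0.4014, 0.1657),
    EvalRow(2000, 495.5074, 0.4006, 0.1639),
    EvalRow(3000, 503.3999, 0.4025, 0.1647),
    EvalRow(4000, 500.1373, 0.4039, 0.1656),
    EvalRow(5000, 501.8691, 0.3958, 0.1638),
    EvalRow(6000, 511.4516, 0.3931, 0.1640),
    EvalRow(7000, 512.8284, 0.3999, 0.1663),
    EvalRow(8000, 517.9861, 0.3989, 0.1668),
    EvalRow(9000, 519.5070, 0.3960, 0.1647),
    EvalRow(10000, 531.7615, 0.3987, 0.1661),
    EvalRow(11000, 534.0803, 0.3892, 0.1636),
    EvalRow(12000, 533.1113, 0.3911, 0.1636),
    EvalRow(13000, 532.6003, 0.3929, 0.1647),
    EvalRow(14000, 537.3076, 0.3921, 0.1654),
    EvalRow(15000, 537.4019, 0.3909, 0.1648),
    EvalRow(16000, 539.5662, 0.3913, 0.1649),
    EvalRow(17000, 540.9777, 0.3898, 0.1646),
]

best_wer = min(ROWS, key=lambda r: r.wer)        # lowest word error rate
best_cer = min(ROWS, key=lambda r: r.cer)        # lowest CER (first row on ties)
best_loss = min(ROWS, key=lambda r: r.val_loss)  # lowest validation loss
final = ROWS[-1]                                 # row reported as the headline result

if __name__ == "__main__":
    print(f"lowest WER:      step {best_wer.step} ({best_wer.wer})")
    print(f"lowest CER:      step {best_cer.step} ({best_cer.cer})")
    print(f"lowest val loss: step {best_loss.step} ({best_loss.val_loss})")
    print(f"final row:       step {final.step} (WER {final.wer}, CER {final.cer})")
```

Validation loss is lowest at step 2000 and rises afterwards while training loss keeps falling, whereas WER and CER stay within a narrow band (about 0.389 to 0.404 and 0.164 to 0.167), so the differences between checkpoints on the error-rate metrics are small.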

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.15.2

Model files

  • Format: Safetensors
  • Model size: 0.3B params
  • Tensor type: F32