op-Adam-model

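This model is a fine-tuned version of anvitamanne/base-model. The training dataset is not specified (the auto-generated card records it as "None"). It achieves the following results on the evaluation set, which match the final checkpoint (step 17000) in the training results: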

  • Loss: 554.7772
  • Wer: 0.4000
  • Cer: 0.1679
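
The card does not say how Wer and Cer were computed. As a minimal sketch, assuming the standard definitions (edit distance over words or characters, divided by the reference length) and no text normalization, the two metrics can be illustrated as follows. The function names and the example sentences are hypothetical and are not taken from the model's evaluation code or data.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcript pair, for illustration only.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
```

Under these definitions a Wer of 0.4000 corresponds to roughly four word-level edits per ten reference words. Whether the reported numbers are pooled over the whole evaluation set or averaged per utterance is not stated in the card.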

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 15
  • mixed_precision_training: Native AMP
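
To make the interaction between these settings concrete, here is a minimal sketch in plain Python of how the effective batch size and a linear schedule with warmup follow from the values above. It assumes a single device (consistent with 8 × 2 = 16) and about 1161 optimizer steps per epoch, an estimate derived from the results table (step 17000 at epoch 14.64), not a value stated in the card. The function names are hypothetical, and the warmup rounding is an assumption about how the ratio is converted to steps.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1
NUM_EPOCHS = 15

# Assumption: estimated from the results table, not stated in the card.
STEPS_PER_EPOCH = 1161


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
# Assumption: the warmup ratio is rounded up to a whole number of steps.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
print("estimated total steps:", total_steps, "warmup steps:", warmup_steps)
for step in (0, 1000, warmup_steps, 9000, 17000):
    lr = linear_lr(step, total_steps, warmup_steps, LEARNING_RATE)
    print(f"step {step:>5}: lr = {lr:.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 after roughly the first 1.5 epochs and is close to zero by the last logged step, so late checkpoints change very little from one evaluation to the next.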

Training results

Training Loss  Epoch  Step   Validation Loss  Wer     Cer
313.8935       0.86   1000   510.4115         0.4033  0.1652
314.8383       1.72   2000   526.0791         0.4025  0.1650
304.3626       2.58   3000   547.0184         0.3996  0.1658
296.3588       3.44   4000   499.8129         0.3998  0.1643
282.8626       4.3    5000   512.6412         0.4040  0.1653
282.5549       5.17   6000   539.9665         0.4036  0.1664
275.1887       6.03   7000   527.6870         0.3965  0.1643
277.8783       6.89   8000   531.9138         0.3980  0.1648
278.9985       7.75   9000   555.6723         0.3983  0.1683
264.2371       8.61   10000  549.4355         0.4094  0.1702
261.7719       9.47   11000  549.5638         0.4006  0.1674
252.521        10.33  12000  548.9603         0.3969  0.1660
274.3623       11.19  13000  563.7667         0.3966  0.1671
259.1104       12.05  14000  559.4063         0.3969  0.1668
257.2146       12.91  15000  560.8441         0.3991  0.1679
263.3659       13.78  16000  561.1415         0.4018  0.1682
274.104        14.64  17000  554.7772         0.4000  0.1679
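
The final checkpoint is not the best one on every metric in the log above. As a small sketch of how the logged rows could be scanned for the best evaluation step per metric, the snippet below copies the step, validation loss, Wer, and Cer columns from the results and picks the minimum of each. The variable and function names are hypothetical; ties resolve to the earliest step.

```python
# (step, validation_loss, wer, cer) copied from the training results above.
EVAL_LOG = [
    (1000, 510.4115, 0.4033, 0.1652),
    (2000, 526.0791, 0.4025, 0.1650),
    (3000, 547.0184, 0.3996, 0.1658),
    (4000, 499.8129, 0.3998, 0.1643),
    (5000, 512.6412, 0.4040, 0.1653),
    (6000, 539.9665, 0.4036, 0.1664),
    (7000, 527.6870, 0.3965, 0.1643),
    (8000, 531.9138, 0.3980, 0.1648),
    (9000, 555.6723, 0.3983, 0.1683),
    (10000, 549.4355, 0.4094, 0.1702),
    (11000, 549.5638, 0.4006, 0.1674),
    (12000, 548.9603, 0.3969, 0.1660),
    (13000, 563.7667, 0.3966, 0.1671),
    (14000, 559.4063, 0.3969, 0.1668),
    (15000, 560.8441, 0.3991, 0.1679),
    (16000, 561.1415, 0.4018, 0.1682),
    (17000, 554.7772, 0.4000, 0.1679),
]

METRIC_COLUMNS = {"validation_loss": 1, "wer": 2, "cer": 3}


def best_step(log, metric):
    """Return (step, value) of the row with the lowest value for the metric."""
    col = METRIC_COLUMNS[metric]
    row = min(log, key=lambda r: r[col])  # min keeps the first row on ties
    return row[0], row[col]


for name in METRIC_COLUMNS:
    step, value = best_step(EVAL_LOG, name)
    print(f"best {name}: {value} at step {step}")
```

On these rows, validation loss is lowest at step 4000 and Wer is lowest at step 7000, while the training loss keeps falling afterwards. That pattern may indicate overfitting after the first few epochs, although the card does not record which checkpoint was kept.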

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu118
  • Datasets 3.6.0
  • Tokenizers 0.15.2

Safetensors checkpoint: 0.3B params, tensor type F32