op-Adafactor-model

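This model is a fine-tuned version of anvitamanne/base-model. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set, which match the last logged checkpoint (step 17000) in the training results below:

  • Loss: 533.2568
  • Wer (word error rate): 0.4035
  • Cer (character error rate): 0.1668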
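
For readers unfamiliar with the two error rates, the following is a minimal sketch of the standard definitions: edit distance between reference and hypothesis, divided by the reference length, counted over words for Wer and over characters for Cer. The card does not say which library or normalization produced the numbers above, so the function names `edit_distance`, `wer`, and `cer` and the example sentences are illustrative assumptions, not the evaluation script that was used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example pair, not taken from the evaluation data.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"Wer: {wer(reference, hypothesis):.4f}")
print(f"Cer: {cer(reference, hypothesis):.4f}")
```

Both rates can exceed 1.0 when the hypothesis contains many insertions, because the denominator is the reference length only.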

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adafactor with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 15
  • mixed_precision_training: Native AMP
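
The list above can be sketched as keyword arguments named after `transformers.TrainingArguments` fields. This is a reconstruction under stated assumptions rather than the original training script: the `output_dir` value is hypothetical, `fp16=True` is assumed to be what the card reports as "Native AMP", and a single device is assumed because 8 × 2 already equals the reported total batch size of 16.

```python
# Keyword arguments mirroring the hyperparameters listed in the card.
training_kwargs = {
    "output_dir": "op-Adafactor-model",  # hypothetical path, not stated in the card
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adafactor",
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,
    "num_train_epochs": 15,
    "fp16": True,  # assumption: "Native AMP" corresponds to fp16 mixed precision
}

num_devices = 1  # assumption: single-device training

# Effective batch size = per-device batch * accumulation steps * devices.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)
print(effective_batch_size)
```

With Transformers installed, `TrainingArguments(**training_kwargs)` should accept these names, though fp16 generally requires a CUDA device.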

Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
|---------------|-------|-------|-----------------|--------|--------|
| 313.1876      | 0.86  | 1000  | 515.6298        | 0.4013 | 0.1658 |
| 311.5102      | 1.72  | 2000  | 514.9396        | 0.4003 | 0.1651 |
| 302.9498      | 2.58  | 3000  | 520.3780        | 0.3961 | 0.1642 |
| 298.5719      | 3.44  | 4000  | 507.4060        | 0.3936 | 0.1631 |
| 286.1931      | 4.3   | 5000  | 509.8114        | 0.3953 | 0.1634 |
| 288.7149      | 5.17  | 6000  | 503.3621        | 0.3934 | 0.1630 |
| 283.2679      | 6.03  | 7000  | 514.0335        | 0.3949 | 0.1637 |
| 287.4118      | 6.89  | 8000  | 518.2812        | 0.3968 | 0.1642 |
| 291.8708      | 7.75  | 9000  | 521.2302        | 0.3948 | 0.1644 |
| 278.4833      | 8.61  | 10000 | 519.9163        | 0.3976 | 0.1645 |
| 277.7449      | 9.47  | 11000 | 524.9117        | 0.3966 | 0.1646 |
| 268.8857      | 10.33 | 12000 | 525.6909        | 0.3974 | 0.1652 |
| 290.9705      | 11.19 | 13000 | 531.0073        | 0.3990 | 0.1656 |
| 274.3229      | 12.05 | 14000 | 531.4063        | 0.4006 | 0.1661 |
| 272.6811      | 12.91 | 15000 | 533.5064        | 0.4025 | 0.1663 |
| 278.953       | 13.78 | 16000 | 534.8712        | 0.4031 | 0.1664 |
| 292.4625      | 14.64 | 17000 | 533.2568        | 0.4035 | 0.1668 |
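
The reported evaluation numbers come from the last logged step, which is not the best one in the log. As a minimal sketch, the snippet below copies the step, validation loss, Wer, and Cer columns from the results above and picks the step with the lowest value for each metric. The variable names are illustrative, and the card does not say whether intermediate checkpoints were saved, so this only identifies the best logged step.

```python
# (step, validation_loss, wer, cer) copied from the training results above.
eval_log = [
    (1000, 515.6298, 0.4013, 0.1658),
    (2000, 514.9396, 0.4003, 0.1651),
    (3000, 520.3780, 0.3961, 0.1642),
    (4000, 507.4060, 0.3936, 0.1631),
    (5000, 509.8114, 0.3953, 0.1634),
    (6000, 503.3621, 0.3934, 0.1630),
    (7000, 514.0335, 0.3949, 0.1637),
    (8000, 518.2812, 0.3968, 0.1642),
    (9000, 521.2302, 0.3948, 0.1644),
    (10000, 519.9163, 0.3976, 0.1645),
    (11000, 524.9117, 0.3966, 0.1646),
    (12000, 525.6909, 0.3974, 0.1652),
    (13000, 531.0073, 0.3990, 0.1656),
    (14000, 531.4063, 0.4006, 0.1661),
    (15000, 533.5064, 0.4025, 0.1663),
    (16000, 534.8712, 0.4031, 0.1664),
    (17000, 533.2568, 0.4035, 0.1668),
]

# Lower is better for all three metrics.
best_by_loss = min(eval_log, key=lambda row: row[1])
best_by_wer = min(eval_log, key=lambda row: row[2])
best_by_cer = min(eval_log, key=lambda row: row[3])

print("best validation loss at step", best_by_loss[0])
print("best Wer at step", best_by_wer[0])
print("best Cer at step", best_by_cer[0])
```

All three metrics reach their minimum at step 6000 and drift upward afterwards while the training loss stays below its early values, a pattern that may indicate overfitting in the later epochs.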

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.2+cu118
  • Datasets 3.6.0
  • Tokenizers 0.15.2

Model size: 0.3B params (Safetensors, tensor type F32)