Amala3/OCR_mt5_beams

Note: access to this repository is gated. Its files and content are publicly listed, but you must accept the access conditions on the Hugging Face Hub before they can be downloaded.

This model is a fine-tuned version of google/mt5-large on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2777
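
The model description below is still a stub, so here is only a generic loading-and-generation sketch. The repository name suggests OCR post-correction with beam-search decoding, but that is an inference from the name alone; the input string and the num_beams value are placeholders:

```python
from transformers import AutoTokenizer, MT5ForConditionalGeneration

# Access to this repo is gated: accept the conditions on the Hub and
# authenticate locally (e.g. `huggingface-cli login`) before downloading.
model_id = "Amala3/OCR_mt5_beams"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = MT5ForConditionalGeneration.from_pretrained(model_id)

# Made-up noisy-OCR input; the card does not document the expected format.
text = "Tbe qvick brown f0x jumps ovcr the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

# num_beams=4 is an assumption based on the "beams" suffix in the repo name.
outputs = model.generate(**inputs, num_beams=4, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```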

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 2
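
For reference, a minimal sketch of how these settings map onto Seq2SeqTrainingArguments, assuming the standard Hugging Face Trainer was used (the card does not say). Dataset loading, preprocessing, and the Trainer itself are omitted because the training data is not documented:

```python
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="OCR_mt5_beams",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    # The default AdamW optimizer already uses betas=(0.9, 0.999) and eps=1e-8,
    # matching the optimizer settings listed above.
    evaluation_strategy="steps",
    eval_steps=500,  # matches the 500-step cadence of the results table below
)
```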

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 9.9229        | 0.0699 | 500   | 0.9269          |
| 0.6121        | 0.1398 | 1000  | 0.4268          |
| 0.4573        | 0.2097 | 1500  | 0.3790          |
| 0.4395        | 0.2796 | 2000  | 0.3563          |
| 0.3979        | 0.3495 | 2500  | 0.3520          |
| 0.3914        | 0.4193 | 3000  | 0.3390          |
| 0.3750        | 0.4892 | 3500  | 0.3214          |
| 0.3856        | 0.5591 | 4000  | 0.3217          |
| 0.3482        | 0.6290 | 4500  | 0.3143          |
| 0.3643        | 0.6989 | 5000  | 0.3090          |
| 0.3541        | 0.7688 | 5500  | 0.3067          |
| 0.3471        | 0.8387 | 6000  | 0.3024          |
| 0.3374        | 0.9086 | 6500  | 0.2975          |
| 0.3675        | 0.9785 | 7000  | 0.2936          |
| 0.3261        | 1.0484 | 7500  | 0.2915          |
| 0.3206        | 1.1183 | 8000  | 0.2928          |
| 0.3258        | 1.1881 | 8500  | 0.2848          |
| 0.3244        | 1.2580 | 9000  | 0.2866          |
| 0.3006        | 1.3279 | 9500  | 0.2853          |
| 0.3204        | 1.3978 | 10000 | 0.2821          |
| 0.3088        | 1.4677 | 10500 | 0.2827          |
| 0.2958        | 1.5376 | 11000 | 0.2820          |
| 0.3031        | 1.6075 | 11500 | 0.2789          |
| 0.3153        | 1.6774 | 12000 | 0.2812          |
| 0.2941        | 1.7473 | 12500 | 0.2786          |
| 0.3025        | 1.8172 | 13000 | 0.2790          |
| 0.2983        | 1.8871 | 13500 | 0.2775          |
| 0.3019        | 1.9569 | 14000 | 0.2777          |

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.19.1
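
A quick sanity check that a local environment matches these pins (a small helper sketch):

```python
# Compare installed library versions against the ones the model was trained with.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.40.2",
    "torch": "2.1.2",
    "datasets": "2.18.0",
    "tokenizers": "0.19.1",
}
modules = {
    "transformers": transformers,
    "torch": torch,
    "datasets": datasets,
    "tokenizers": tokenizers,
}
for name, wanted in expected.items():
    have = modules[name].__version__
    flag = "" if have == wanted else "  <-- differs"
    print(f"{name}: {have} (card: {wanted}){flag}")
```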

Model size: 1.23B parameters (Safetensors, F32 tensors)
