git-base-naruto

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0279
  • Wer Score: 7.0134

Model description

More information needed

Intended uses & limitations

More information needed
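
Although the card does not document intended uses, checkpoints fine-tuned from microsoft/git-base are typically used for image captioning. The sketch below is a hedged illustration of such usage with the standard transformers GIT API; the repository id and example image URL are placeholders, not confirmed paths.

```python
# Hedged usage sketch for a GIT-style captioning checkpoint; the repository id
# "username/git-base-naruto" and the example image URL are placeholders.
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

processor = AutoProcessor.from_pretrained("microsoft/git-base")
model = AutoModelForCausalLM.from_pretrained("username/git-base-naruto")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Encode the image and generate a caption with the fine-tuned decoder.
pixel_values = processor(images=image, return_tensors="pt").pixel_values
with torch.no_grad():
    generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```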

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a rough TrainingArguments equivalent is sketched after the list):

  • learning_rate: 2e-05
  • train_batch_size: 5
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 10
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
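
For illustration only, the settings above roughly map onto the transformers TrainingArguments below. This is a reconstruction from the list, not the author's actual training script, and output_dir is a placeholder.

```python
# Hypothetical reconstruction of the listed hyperparameters; the original
# training script is not part of this card, and output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="git-base-naruto",
    learning_rate=2e-5,
    per_device_train_batch_size=5,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 5 * 2 = 10
    num_train_epochs=50,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # "Native AMP" mixed-precision training
)
```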

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer Score |
|--------------:|--------:|-----:|----------------:|----------:|
| 8.5536        | 2.2727  | 50   | 7.1184          | 52.625    |
| 6.2017        | 4.5455  | 100  | 5.0281          | 22.3527   |
| 4.1263        | 6.8182  | 150  | 2.9941          | 21.8616   |
| 2.2013        | 9.0909  | 200  | 1.2700          | 18.7321   |
| 0.7916        | 11.3636 | 250  | 0.3337          | 12.1607   |
| 0.1917        | 13.6364 | 300  | 0.0798          | 4.5357    |
| 0.0458        | 15.9091 | 350  | 0.0356          | 1.0       |
| 0.0142        | 18.1818 | 400  | 0.0278          | 7.25      |
| 0.0066        | 20.4545 | 450  | 0.0287          | 8.4196    |
| 0.0043        | 22.7273 | 500  | 0.0270          | 7.8795    |
| 0.0032        | 25.0    | 550  | 0.0272          | 7.2545    |
| 0.0027        | 27.2727 | 600  | 0.0273          | 7.0179    |
| 0.0023        | 29.5455 | 650  | 0.0271          | 7.2054    |
| 0.002         | 31.8182 | 700  | 0.0275          | 7.0580    |
| 0.0018        | 34.0909 | 750  | 0.0276          | 7.2589    |
| 0.0016        | 36.3636 | 800  | 0.0277          | 7.0312    |
| 0.0015        | 38.6364 | 850  | 0.0277          | 7.0759    |
| 0.0014        | 40.9091 | 900  | 0.0278          | 7.1071    |
| 0.0014        | 43.1818 | 950  | 0.0278          | 7.1161    |
| 0.0013        | 45.4545 | 1000 | 0.0279          | 6.9241    |
| 0.0013        | 47.7273 | 1050 | 0.0279          | 6.9911    |
| 0.0013        | 50.0    | 1100 | 0.0279          | 7.0134    |
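
The Wer Score column above is the word error rate between generated and reference captions; lower is better, and values above 1 are possible when the hypothesis contains many more words than the reference. As a hedged illustration (not the author's evaluation code), such a score can be computed with the evaluate library:

```python
# Minimal sketch of computing WER with the `evaluate` library; the captions
# below are invented placeholders, not drawn from the model's evaluation data.
import evaluate

wer_metric = evaluate.load("wer")
score = wer_metric.compute(
    predictions=["a ninja in an orange jumpsuit standing on a rooftop"],
    references=["a ninja wearing an orange jumpsuit stands on a rooftop"],
)
print(score)  # word-level errors divided by the number of reference words
```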

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1