git-base-coco-pokemon

This model is a fine-tuned version of microsoft/git-base-coco on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0496
  • Wer Score: 17.2413

Model description

More information needed

Intended uses & limitations

More information needed
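
The card does not document usage. Since the base checkpoint, microsoft/git-base-coco, is an image-captioning model, a fine-tuned version can presumably be loaded the same way. The following is a sketch under stated assumptions, not a documented recipe: the repository id `Adithya7Shankar/git-base-coco-pokemon` is assumed, the processor is taken from the base model on the assumption that preprocessing was left unchanged, and `caption_image` and `max_length=50` are illustrative choices. It requires `transformers`, `torch`, and a PIL image.

```python
def caption_image(
    image,
    model_id="Adithya7Shankar/git-base-coco-pokemon",  # assumed repo id
    processor_id="microsoft/git-base-coco",            # assumed: base processor still applies
    max_length=50,                                     # illustrative generation limit
):
    """Generate one caption for a PIL image (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the
    # heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoProcessor

    processor = AutoProcessor.from_pretrained(processor_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # The processor resizes and normalizes the image into pixel values.
    pixel_values = processor(images=image, return_tensors="pt").pixel_values

    # GIT generates caption tokens conditioned on the image.
    generated_ids = model.generate(pixel_values=pixel_values, max_length=max_length)
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

A call would look like `caption_image(Image.open("example.png"))`, with `Image` imported from `PIL` and the file name standing in for any local image.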

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
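
As a rough illustration, the values above can be written with the parameter names that `transformers.TrainingArguments` uses, together with the learning rate that a linear schedule yields at a given step. This is a sketch under assumptions the card does not confirm: a single device (so `train_batch_size` maps to `per_device_train_batch_size`), no warmup, and a total of 4700 optimizer steps, taken from the last row of the training results table. `linear_lr` is a hypothetical helper, not part of any library.

```python
# Hyperparameters from the card, keyed by TrainingArguments parameter names.
# Assumption: single-device training, so the per-device sizes equal the
# batch sizes listed in the card.
training_kwargs = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}

TOTAL_STEPS = 4700  # final step in the training results table
WARMUP_STEPS = 0    # assumption: the card lists no warmup setting
STEPS_PER_EPOCH = TOTAL_STEPS // training_kwargs["num_train_epochs"]  # 94


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Halfway through training the rate has decayed to half its initial value.
print(linear_lr(2350))
```

With 94 steps per epoch, the first evaluation at step 50 falls at 50 / 94 ≈ 0.53 epochs, which agrees with the first row of the training results table.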

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer Score |
|---------------|-------|------|-----------------|-----------|
| No log | 0.53 | 50 | 4.1410 | 21.7574 |
| No log | 1.06 | 100 | 0.2080 | 0.4284 |
| No log | 1.6 | 150 | 0.0389 | 0.4503 |
| No log | 2.13 | 200 | 0.0314 | 0.4297 |
| No log | 2.66 | 250 | 0.0302 | 0.3935 |
| No log | 3.19 | 300 | 0.0297 | 3.6761 |
| No log | 3.72 | 350 | 0.0302 | 0.9729 |
| No log | 4.26 | 400 | 0.0294 | 3.9110 |
| No log | 4.79 | 450 | 0.0296 | 0.6968 |
| 0.9165 | 5.32 | 500 | 0.0307 | 14.7252 |
| 0.9165 | 5.85 | 550 | 0.0306 | 14.0787 |
| 0.9165 | 6.38 | 600 | 0.0314 | 17.9974 |
| 0.9165 | 6.91 | 650 | 0.0305 | 19.4271 |
| 0.9165 | 7.45 | 700 | 0.0313 | 18.76 |
| 0.9165 | 7.98 | 750 | 0.0321 | 17.1084 |
| 0.9165 | 8.51 | 800 | 0.0322 | 20.9123 |
| 0.9165 | 9.04 | 850 | 0.0321 | 21.0026 |
| 0.9165 | 9.57 | 900 | 0.0328 | 19.2103 |
| 0.9165 | 10.11 | 950 | 0.0336 | 19.4503 |
| 0.0124 | 10.64 | 1000 | 0.0354 | 19.6310 |
| 0.0124 | 11.17 | 1050 | 0.0350 | 17.3652 |
| 0.0124 | 11.7 | 1100 | 0.0355 | 18.2955 |
| 0.0124 | 12.23 | 1150 | 0.0375 | 19.4194 |
| 0.0124 | 12.77 | 1200 | 0.0362 | 18.2606 |
| 0.0124 | 13.3 | 1250 | 0.0375 | 19.8348 |
| 0.0124 | 13.83 | 1300 | 0.0380 | 18.6581 |
| 0.0124 | 14.36 | 1350 | 0.0383 | 19.2723 |
| 0.0124 | 14.89 | 1400 | 0.0407 | 18.8516 |
| 0.0124 | 15.43 | 1450 | 0.0406 | 19.0968 |
| 0.0049 | 15.96 | 1500 | 0.0406 | 18.4774 |
| 0.0049 | 16.49 | 1550 | 0.0419 | 18.5768 |
| 0.0049 | 17.02 | 1600 | 0.0435 | 19.8606 |
| 0.0049 | 17.55 | 1650 | 0.0437 | 19.6477 |
| 0.0049 | 18.09 | 1700 | 0.0445 | 19.2684 |
| 0.0049 | 18.62 | 1750 | 0.0443 | 18.6039 |
| 0.0049 | 19.15 | 1800 | 0.0432 | 17.8129 |
| 0.0049 | 19.68 | 1850 | 0.0455 | 18.9587 |
| 0.0049 | 20.21 | 1900 | 0.0448 | 18.28 |
| 0.0049 | 20.74 | 1950 | 0.0455 | 18.4477 |
| 0.0009 | 21.28 | 2000 | 0.0453 | 18.2542 |
| 0.0009 | 21.81 | 2050 | 0.0457 | 18.7458 |
| 0.0009 | 22.34 | 2100 | 0.0456 | 18.5239 |
| 0.0009 | 22.87 | 2150 | 0.0450 | 18.3523 |
| 0.0009 | 23.4 | 2200 | 0.0459 | 18.2658 |
| 0.0009 | 23.94 | 2250 | 0.0462 | 18.0916 |
| 0.0009 | 24.47 | 2300 | 0.0465 | 18.3265 |
| 0.0009 | 25.0 | 2350 | 0.0463 | 18.4245 |
| 0.0009 | 25.53 | 2400 | 0.0466 | 18.1948 |
| 0.0009 | 26.06 | 2450 | 0.0467 | 18.0090 |
| 0.0002 | 26.6 | 2500 | 0.0468 | 18.2155 |
| 0.0002 | 27.13 | 2550 | 0.0471 | 18.1639 |
| 0.0002 | 27.66 | 2600 | 0.0472 | 17.92 |
| 0.0002 | 28.19 | 2650 | 0.0472 | 17.9303 |
| 0.0002 | 28.72 | 2700 | 0.0474 | 17.8116 |
| 0.0002 | 29.26 | 2750 | 0.0476 | 17.9045 |
| 0.0002 | 29.79 | 2800 | 0.0477 | 17.4942 |
| 0.0002 | 30.32 | 2850 | 0.0477 | 17.6129 |
| 0.0002 | 30.85 | 2900 | 0.0479 | 17.3910 |
| 0.0002 | 31.38 | 2950 | 0.0480 | 17.6594 |
| 0.0001 | 31.91 | 3000 | 0.0480 | 17.5303 |
| 0.0001 | 32.45 | 3050 | 0.0481 | 17.4245 |
| 0.0001 | 32.98 | 3100 | 0.0483 | 17.4413 |
| 0.0001 | 33.51 | 3150 | 0.0483 | 17.4013 |
| 0.0001 | 34.04 | 3200 | 0.0484 | 17.3342 |
| 0.0001 | 34.57 | 3250 | 0.0485 | 17.2361 |
| 0.0001 | 35.11 | 3300 | 0.0486 | 17.3613 |
| 0.0001 | 35.64 | 3350 | 0.0487 | 17.2606 |
| 0.0001 | 36.17 | 3400 | 0.0488 | 17.4039 |
| 0.0001 | 36.7 | 3450 | 0.0488 | 17.2168 |
| 0.0001 | 37.23 | 3500 | 0.0489 | 17.2194 |
| 0.0001 | 37.77 | 3550 | 0.0488 | 17.3032 |
| 0.0001 | 38.3 | 3600 | 0.0489 | 17.3303 |
| 0.0001 | 38.83 | 3650 | 0.0490 | 17.3277 |
| 0.0001 | 39.36 | 3700 | 0.0490 | 17.3381 |
| 0.0001 | 39.89 | 3750 | 0.0491 | 17.3471 |
| 0.0001 | 40.43 | 3800 | 0.0492 | 17.3497 |
| 0.0001 | 40.96 | 3850 | 0.0492 | 17.3484 |
| 0.0001 | 41.49 | 3900 | 0.0493 | 17.3910 |
| 0.0001 | 42.02 | 3950 | 0.0491 | 17.3019 |
| 0.0 | 42.55 | 4000 | 0.0492 | 17.2942 |
| 0.0 | 43.09 | 4050 | 0.0493 | 17.2645 |
| 0.0 | 43.62 | 4100 | 0.0493 | 17.2387 |
| 0.0 | 44.15 | 4150 | 0.0493 | 17.2348 |
| 0.0 | 44.68 | 4200 | 0.0493 | 17.2490 |
| 0.0 | 45.21 | 4250 | 0.0494 | 17.2374 |
| 0.0 | 45.74 | 4300 | 0.0495 | 17.2568 |
| 0.0 | 46.28 | 4350 | 0.0495 | 17.2619 |
| 0.0 | 46.81 | 4400 | 0.0495 | 17.2310 |
| 0.0 | 47.34 | 4450 | 0.0496 | 17.2374 |
| 0.0 | 47.87 | 4500 | 0.0496 | 17.2426 |
| 0.0 | 48.4 | 4550 | 0.0496 | 17.2387 |
| 0.0 | 48.94 | 4600 | 0.0496 | 17.2335 |
| 0.0 | 49.47 | 4650 | 0.0496 | 17.2387 |
| 0.0 | 50.0 | 4700 | 0.0496 | 17.2413 |
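
The log suggests that the final checkpoint is not the best one by either metric: validation loss bottoms out at step 400 and the WER score at step 250, after which the WER jumps above 14 while the training loss keeps falling toward zero. A small sketch of how such a log could be scanned for the best step follows; the rows are a hand-picked subset copied from the table above, and the tuple layout is only an illustrative choice.

```python
# (step, validation_loss, wer_score), copied from the training results.
rows = [
    (50, 4.1410, 21.7574),
    (100, 0.2080, 0.4284),
    (250, 0.0302, 0.3935),
    (400, 0.0294, 3.9110),
    (450, 0.0296, 0.6968),
    (500, 0.0307, 14.7252),
    (4700, 0.0496, 17.2413),
]

# Pick the evaluation step that minimizes each metric.
best_by_loss = min(rows, key=lambda row: row[1])
best_by_wer = min(rows, key=lambda row: row[2])

print("lowest validation loss at step", best_by_loss[0])
print("lowest WER score at step", best_by_wer[0])
```

When training with the Transformers `Trainer`, the `load_best_model_at_end` and `metric_for_best_model` arguments of `TrainingArguments` are one way to keep the best-scoring checkpoint rather than the last one; whether they were set for this run is not stated in the card.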

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.0.0+cpu
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 177M params (Safetensors, tensor type F32)