Image_Captioner_Mimic

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (the epoch 30 checkpoint):

  • Loss: 0.0963
  • Rouge1: 32.528
  • Rouge2: 19.9922
  • Rougel: 31.403
  • Rougelsum: 31.9372
  • Gen Len: 12.5584
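
For readers unfamiliar with the ROUGE figures above, the following is a minimal sketch of how an n-gram overlap F1 score can be computed for a single caption pair. It is a simplification, not the scoring code used for this card: it assumes whitespace tokenization and no stemming, and the two captions are hypothetical. The scores above appear to be on a 0-100 scale, so the sketch's 0-1 result would be multiplied by 100 to compare.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """F1 of clipped n-gram overlap between a reference and a candidate caption."""

    def ngrams(text: str) -> Counter:
        # Simplification: lowercase whitespace tokens, no stemming.
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # each n-gram counted at most min(ref, cand) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical captions, for illustration only.
reference = "a small dog runs on the beach"
candidate = "a dog runs on the sand"

rouge1 = rouge_n_f1(reference, candidate, n=1)
rouge2 = rouge_n_f1(reference, candidate, n=2)
print(f"Rouge1: {100 * rouge1:.2f}  Rouge2: {100 * rouge2:.2f}")
```

Rouge1 uses unigrams and Rouge2 uses bigrams; Rougel and Rougelsum are based on the longest common subsequence instead and are not covered by this sketch.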

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 0.0597 | 1.0 | 24457 | 0.0567 | 37.8657 | 27.8087 | 37.4596 | 37.752 | 9.9527 |
| 0.0533 | 2.0 | 48914 | 0.0526 | 39.2211 | 28.2036 | 38.5786 | 38.9976 | 10.7079 |
| 0.0507 | 3.0 | 73371 | 0.0499 | 39.3449 | 28.3875 | 38.7151 | 39.0449 | 10.2091 |
| 0.0457 | 4.0 | 97828 | 0.0479 | 39.8753 | 28.5 | 39.127 | 39.6178 | 11.2407 |
| 0.0419 | 5.0 | 122285 | 0.0461 | 40.0478 | 28.797 | 39.3201 | 39.7468 | 10.3153 |
| 0.0406 | 6.0 | 146742 | 0.0445 | 39.7923 | 28.4281 | 39.0583 | 39.4523 | 10.4186 |
| 0.0373 | 7.0 | 171199 | 0.0429 | 39.954 | 28.535 | 39.2226 | 39.6457 | 10.6640 |
| 0.0347 | 8.0 | 195656 | 0.0419 | 39.4329 | 28.0336 | 38.6815 | 39.0968 | 10.7775 |
| 0.031 | 9.0 | 220113 | 0.0411 | 39.4524 | 28.1057 | 38.6998 | 39.0906 | 10.8397 |
| 0.0286 | 10.0 | 244570 | 0.0407 | 39.1493 | 27.639 | 38.3784 | 38.8085 | 10.9530 |
| 0.0261 | 11.0 | 269027 | 0.0408 | 38.8083 | 27.2206 | 37.9679 | 38.422 | 11.2390 |
| 0.0249 | 12.0 | 293484 | 0.0412 | 38.3972 | 26.7316 | 37.5838 | 38.0409 | 11.4510 |
| 0.0214 | 13.0 | 317941 | 0.0424 | 37.785 | 26.3302 | 36.9553 | 37.3764 | 11.4482 |
| 0.0188 | 14.0 | 342398 | 0.0438 | 36.9552 | 25.3108 | 36.0278 | 36.4965 | 11.6232 |
| 0.0174 | 15.0 | 366855 | 0.0458 | 35.6476 | 23.9574 | 34.6526 | 35.1259 | 11.6605 |
| 0.0153 | 16.0 | 391312 | 0.0487 | 34.657 | 22.8337 | 33.5891 | 34.1343 | 12.2395 |
| 0.013 | 17.0 | 415769 | 0.0518 | 33.5548 | 21.1569 | 32.4899 | 33.0394 | 12.2604 |
| 0.0114 | 18.0 | 440226 | 0.0559 | 34.3809 | 22.0108 | 33.2698 | 33.8578 | 12.0861 |
| 0.01 | 19.0 | 464683 | 0.0601 | 32.9062 | 20.3145 | 31.8147 | 32.3802 | 12.5176 |
| 0.0081 | 20.0 | 489140 | 0.0651 | 32.9482 | 20.3862 | 31.865 | 32.3837 | 12.4577 |
| 0.0069 | 21.0 | 513597 | 0.0698 | 32.3054 | 19.764 | 31.2178 | 31.7592 | 12.4939 |
| 0.0057 | 22.0 | 538054 | 0.0751 | 31.7627 | 19.0106 | 30.6263 | 31.175 | 12.7530 |
| 0.0048 | 23.0 | 562511 | 0.0793 | 31.8295 | 19.255 | 30.6958 | 31.2314 | 12.6077 |
| 0.0041 | 24.0 | 586968 | 0.0834 | 32.1523 | 19.2017 | 30.9774 | 31.5383 | 12.7461 |
| 0.0032 | 25.0 | 611425 | 0.0870 | 32.5379 | 20.0041 | 31.3903 | 31.9037 | 12.6848 |
| 0.0025 | 26.0 | 635882 | 0.0903 | 32.6757 | 20.1388 | 31.5495 | 32.0827 | 12.5950 |
| 0.0023 | 27.0 | 660339 | 0.0927 | 32.0874 | 19.3546 | 30.9125 | 31.4675 | 12.6290 |
| 0.0019 | 28.0 | 684796 | 0.0947 | 32.6988 | 20.1847 | 31.5643 | 32.1143 | 12.5412 |
| 0.0017 | 29.0 | 709253 | 0.0958 | 32.4574 | 19.7702 | 31.2955 | 31.8608 | 12.5558 |
| 0.0014 | 30.0 | 733710 | 0.0963 | 32.528 | 19.9922 | 31.403 | 31.9372 | 12.5584 |
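
In the table above, training loss falls at every epoch while validation loss stops improving partway through and Rouge1 peaks early, a pattern that is consistent with overfitting. The sketch below locates those turning points. The `(epoch, validation loss, Rouge1)` values are copied from the table; the variable names are illustrative only, and choosing a checkpoint this way is a suggestion, not something the card describes.

```python
# (epoch, validation_loss, rouge1), copied from the training results table.
ROWS = [
    (1, 0.0567, 37.8657), (2, 0.0526, 39.2211), (3, 0.0499, 39.3449),
    (4, 0.0479, 39.8753), (5, 0.0461, 40.0478), (6, 0.0445, 39.7923),
    (7, 0.0429, 39.954), (8, 0.0419, 39.4329), (9, 0.0411, 39.4524),
    (10, 0.0407, 39.1493), (11, 0.0408, 38.8083), (12, 0.0412, 38.3972),
    (13, 0.0424, 37.785), (14, 0.0438, 36.9552), (15, 0.0458, 35.6476),
    (16, 0.0487, 34.657), (17, 0.0518, 33.5548), (18, 0.0559, 34.3809),
    (19, 0.0601, 32.9062), (20, 0.0651, 32.9482), (21, 0.0698, 32.3054),
    (22, 0.0751, 31.7627), (23, 0.0793, 31.8295), (24, 0.0834, 32.1523),
    (25, 0.0870, 32.5379), (26, 0.0903, 32.6757), (27, 0.0927, 32.0874),
    (28, 0.0947, 32.6988), (29, 0.0958, 32.4574), (30, 0.0963, 32.528),
]

best_by_loss = min(ROWS, key=lambda row: row[1])    # lowest validation loss
best_by_rouge1 = max(ROWS, key=lambda row: row[2])  # highest Rouge1
final = ROWS[-1]                                    # the checkpoint the card reports

print("lowest validation loss:", best_by_loss)
print("highest Rouge1:        ", best_by_rouge1)
print("final epoch:           ", final)
print("Rouge1 drop, peak to final:", round(best_by_rouge1[2] - final[2], 4))
```

Validation loss is lowest at epoch 10 (0.0407) and Rouge1 is highest at epoch 5 (40.0478), whereas the reported evaluation results come from epoch 30.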

Framework versions

  • Transformers 4.37.1
  • Pytorch 1.13.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.1

Safetensors

  • Model size: 239M params
  • Tensor type: F32