Vit-GPT2-COCO2017Flickr-80k-08

This model is a fine-tuned version of NourFakih/Vit-GPT2-COCO2017Flickr-80k-08 on an unknown dataset. It achieves the following results on the evaluation set:

  • Gen Len: 12.0243
  • Loss: 0.5354
  • Rouge1: 40.114
  • Rouge2: 14.6699
  • Rougel: 36.1001
  • Rougelsum: 36.1128

Model description

No author-written description is provided. The repository name indicates an image-captioning model pairing a ViT image encoder with a GPT-2 text decoder, fine-tuned on COCO 2017 and Flickr caption data (the "80k" in the name presumably refers to the size of the training split).

Intended uses & limitations

More information needed
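
Pending author documentation, the checkpoint can presumably be loaded with the standard VisionEncoderDecoderModel captioning API, as in the minimal sketch below. The repo id is taken from the card title; the image path, beam count, and max_length are illustrative assumptions (max_length ~16 is consistent with the reported Gen Len of ~12 tokens):

```python
# Minimal inference sketch, assuming the checkpoint follows the standard
# ViT-encoder / GPT-2-decoder captioning setup in transformers.
import torch
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-80k-08"  # assumed repo id from the card title

model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("example.jpg").convert("RGB")  # placeholder image path
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)

# Beam search settings are illustrative, not values from this card.
output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```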

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
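
As a reading aid, the list above maps onto transformers' Seq2SeqTrainingArguments roughly as sketched below; output_dir and predict_with_generate are assumptions not stated in this card (the latter is typically required to report Gen Len and ROUGE during evaluation):

```python
# A minimal sketch, assuming the run used transformers' Seq2SeqTrainer.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./Vit-GPT2-COCO2017Flickr-80k-08",  # hypothetical path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # effective train batch size: 4 * 4 = 16
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    # The Adam betas/epsilon listed above match transformers' AdamW defaults.
    predict_with_generate=True,  # assumption: needed for Gen Len / ROUGE eval
)
```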

Training results

| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.3691 | 0.1 | 500 | 11.7758 | 0.4730 | 39.8086 | 14.7674 | 36.1546 | 36.1739 |
| 0.3706 | 0.2 | 1000 | 11.5977 | 0.4739 | 39.8972 | 14.9064 | 36.1193 | 36.138 |
| 0.3709 | 0.3 | 1500 | 11.7103 | 0.4759 | 39.9874 | 14.8528 | 36.3155 | 36.3317 |
| 0.3721 | 0.4 | 2000 | 12.175 | 0.4678 | 39.7192 | 14.5844 | 35.8447 | 35.8728 |
| 0.3655 | 0.5 | 2500 | 11.9002 | 0.4684 | 40.3132 | 15.1157 | 36.5749 | 36.5823 |
| 0.3623 | 0.6 | 3000 | 12.025 | 0.4672 | 40.1643 | 14.978 | 36.3002 | 36.3232 |
| 0.3676 | 0.7 | 3500 | 11.815 | 0.4623 | 40.5036 | 15.3751 | 36.8369 | 36.867 |
| 0.3613 | 0.8 | 4000 | 12.054 | 0.4647 | 40.4078 | 15.3105 | 36.65 | 36.6732 |
| 0.3539 | 0.9 | 4500 | 11.904 | 0.4634 | 40.3794 | 15.233 | 36.7155 | 36.7435 |
| 0.3481 | 1.0 | 5000 | 11.738 | 0.4644 | 40.037 | 14.8477 | 36.3648 | 36.3903 |
| 0.2889 | 1.1 | 5500 | 11.55 | 0.4897 | 40.1394 | 14.7595 | 36.4428 | 36.4696 |
| 0.2908 | 1.2 | 6000 | 11.9823 | 0.4865 | 40.0479 | 14.8181 | 36.316 | 36.3519 |
| 0.2882 | 1.3 | 6500 | 11.7945 | 0.4863 | 40.5912 | 15.3128 | 36.7638 | 36.7755 |
| 0.2901 | 1.4 | 7000 | 11.87 | 0.4868 | 40.3138 | 14.9695 | 36.5032 | 36.5211 |
| 0.2857 | 1.5 | 7500 | 11.776 | 0.4834 | 40.2242 | 14.9881 | 36.5381 | 36.5607 |
| 0.279 | 1.6 | 8000 | 12.0132 | 0.4999 | 40.2751 | 15.0173 | 36.4172 | 36.4257 |
| 0.281 | 1.7 | 8500 | 11.7685 | 0.4951 | 40.1172 | 14.8119 | 36.2966 | 36.296 |
| 0.2831 | 1.8 | 9000 | 12.2293 | 0.4979 | 39.9913 | 14.7427 | 36.1539 | 36.1517 |
| 0.2799 | 1.9 | 9500 | 11.8718 | 0.4911 | 40.5123 | 15.09 | 36.7528 | 36.7622 |
| 0.2778 | 2.0 | 10000 | 12.0262 | 0.4929 | 40.5005 | 15.1027 | 36.6202 | 36.6327 |
| 0.2318 | 2.1 | 10500 | 12.133 | 0.5237 | 40.1565 | 14.8022 | 36.1946 | 36.2074 |
| 0.2279 | 2.2 | 11000 | 11.92 | 0.5278 | 40.5801 | 15.0843 | 36.7832 | 36.8021 |
| 0.2272 | 2.3 | 11500 | 11.8057 | 0.5284 | 40.2332 | 14.8728 | 36.4401 | 36.4343 |
| 0.2308 | 2.4 | 12000 | 11.9518 | 0.5263 | 39.9961 | 14.6475 | 36.035 | 36.0528 |
| 0.2262 | 2.5 | 12500 | 11.9347 | 0.5322 | 40.3373 | 14.9137 | 36.3692 | 36.3718 |
| 0.2233 | 2.6 | 13000 | 11.9147 | 0.5329 | 40.1924 | 14.776 | 36.1644 | 36.1593 |
| 0.223 | 2.7 | 13500 | 11.9927 | 0.5370 | 40.3211 | 14.9563 | 36.3211 | 36.3345 |
| 0.2241 | 2.8 | 14000 | 11.9367 | 0.5365 | 40.0897 | 14.6372 | 36.1484 | 36.1606 |
| 0.2257 | 2.9 | 14500 | 12.0407 | 0.5332 | 40.2316 | 14.741 | 36.1795 | 36.1866 |
| 0.2201 | 3.0 | 15000 | 12.0243 | 0.5354 | 40.114 | 14.6699 | 36.1001 | 36.1128 |
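
The ROUGE columns above are on a 0–100 scale, as typically produced by scaling the scores from the `evaluate` library's rouge metric by 100. A sketch of that computation follows; the prediction and reference strings are placeholders, not outputs of this model:

```python
# A sketch of the usual ROUGE computation for caption evaluation.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a dog runs across a grassy field"],   # placeholder caption
    references=["a dog is running through the grass"],  # placeholder reference
)
# evaluate returns fractions in [0, 1]; the card reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```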

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1
Model size

239M parameters (safetensors, F32)
