# Vit-GPT2-COCO2017Flickr-80k-08
This model is a fine-tuned version of NourFakih/Vit-GPT2-COCO2017Flickr-40k-04 on an unknown dataset. It achieves the following results on the evaluation set:
- Gen Len: 12.0243
- Loss: 0.5354
- Rouge1: 40.114
- Rouge2: 14.6699
- Rougel: 36.1001
- Rougelsum: 36.1128
## Model description
More information needed
## Intended uses & limitations
More information needed
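Since the card gives no usage details, here is a minimal, hedged inference sketch rather than an official example. It assumes the checkpoint keeps the standard `VisionEncoderDecoderModel` layout of its `nlpconnect/vit-gpt2-image-captioning` ancestor (ViT encoder, GPT-2 decoder); the image path and generation settings are placeholders.

```python
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-80k-08"

# Load the fine-tuned encoder-decoder captioner and its preprocessing.
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# "example.jpg" is a placeholder path; any RGB image works.
image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# The evaluation Gen Len is ~12 tokens, so a small max_length suffices.
output_ids = model.generate(pixel_values, max_length=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```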
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged `Seq2SeqTrainingArguments` sketch follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
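The exact training script is not published; as a rough reconstruction, the listed values map onto `Seq2SeqTrainingArguments` approximately as below. Everything not listed (including the Adam betas and epsilon, which match the library defaults) is left at its default, and `output_dir` is an assumed name.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="Vit-GPT2-COCO2017Flickr-80k-08",  # assumed output name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = total train batch size of 16
    num_train_epochs=3.0,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="steps",     # the results table logs every 500 steps
    eval_steps=500,
    predict_with_generate=True,      # required for Gen Len / ROUGE at eval time
)
```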
### Training results
| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.3691 | 0.1 | 500 | 11.7758 | 0.4730 | 39.8086 | 14.7674 | 36.1546 | 36.1739 |
| 0.3706 | 0.2 | 1000 | 11.5977 | 0.4739 | 39.8972 | 14.9064 | 36.1193 | 36.138 |
| 0.3709 | 0.3 | 1500 | 11.7103 | 0.4759 | 39.9874 | 14.8528 | 36.3155 | 36.3317 |
| 0.3721 | 0.4 | 2000 | 12.175 | 0.4678 | 39.7192 | 14.5844 | 35.8447 | 35.8728 |
| 0.3655 | 0.5 | 2500 | 11.9002 | 0.4684 | 40.3132 | 15.1157 | 36.5749 | 36.5823 |
| 0.3623 | 0.6 | 3000 | 12.025 | 0.4672 | 40.1643 | 14.978 | 36.3002 | 36.3232 |
| 0.3676 | 0.7 | 3500 | 11.815 | 0.4623 | 40.5036 | 15.3751 | 36.8369 | 36.867 |
| 0.3613 | 0.8 | 4000 | 12.054 | 0.4647 | 40.4078 | 15.3105 | 36.65 | 36.6732 |
| 0.3539 | 0.9 | 4500 | 11.904 | 0.4634 | 40.3794 | 15.233 | 36.7155 | 36.7435 |
| 0.3481 | 1.0 | 5000 | 11.738 | 0.4644 | 40.037 | 14.8477 | 36.3648 | 36.3903 |
| 0.2889 | 1.1 | 5500 | 11.55 | 0.4897 | 40.1394 | 14.7595 | 36.4428 | 36.4696 |
| 0.2908 | 1.2 | 6000 | 11.9823 | 0.4865 | 40.0479 | 14.8181 | 36.316 | 36.3519 |
| 0.2882 | 1.3 | 6500 | 11.7945 | 0.4863 | 40.5912 | 15.3128 | 36.7638 | 36.7755 |
| 0.2901 | 1.4 | 7000 | 11.87 | 0.4868 | 40.3138 | 14.9695 | 36.5032 | 36.5211 |
| 0.2857 | 1.5 | 7500 | 11.776 | 0.4834 | 40.2242 | 14.9881 | 36.5381 | 36.5607 |
| 0.279 | 1.6 | 8000 | 12.0132 | 0.4999 | 40.2751 | 15.0173 | 36.4172 | 36.4257 |
| 0.281 | 1.7 | 8500 | 11.7685 | 0.4951 | 40.1172 | 14.8119 | 36.2966 | 36.296 |
| 0.2831 | 1.8 | 9000 | 12.2293 | 0.4979 | 39.9913 | 14.7427 | 36.1539 | 36.1517 |
| 0.2799 | 1.9 | 9500 | 11.8718 | 0.4911 | 40.5123 | 15.09 | 36.7528 | 36.7622 |
| 0.2778 | 2.0 | 10000 | 12.0262 | 0.4929 | 40.5005 | 15.1027 | 36.6202 | 36.6327 |
| 0.2318 | 2.1 | 10500 | 12.133 | 0.5237 | 40.1565 | 14.8022 | 36.1946 | 36.2074 |
| 0.2279 | 2.2 | 11000 | 11.92 | 0.5278 | 40.5801 | 15.0843 | 36.7832 | 36.8021 |
| 0.2272 | 2.3 | 11500 | 11.8057 | 0.5284 | 40.2332 | 14.8728 | 36.4401 | 36.4343 |
| 0.2308 | 2.4 | 12000 | 11.9518 | 0.5263 | 39.9961 | 14.6475 | 36.035 | 36.0528 |
| 0.2262 | 2.5 | 12500 | 11.9347 | 0.5322 | 40.3373 | 14.9137 | 36.3692 | 36.3718 |
| 0.2233 | 2.6 | 13000 | 11.9147 | 0.5329 | 40.1924 | 14.776 | 36.1644 | 36.1593 |
| 0.223 | 2.7 | 13500 | 11.9927 | 0.5370 | 40.3211 | 14.9563 | 36.3211 | 36.3345 |
| 0.2241 | 2.8 | 14000 | 11.9367 | 0.5365 | 40.0897 | 14.6372 | 36.1484 | 36.1606 |
| 0.2257 | 2.9 | 14500 | 12.0407 | 0.5332 | 40.2316 | 14.741 | 36.1795 | 36.1866 |
| 0.2201 | 3.0 | 15000 | 12.0243 | 0.5354 | 40.114 | 14.6699 | 36.1001 | 36.1128 |
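The ROUGE and Gen Len columns follow the usual `evaluate`-based pattern from the Hugging Face captioning examples. The metric code for this run is not published; the sketch below is a hypothetical reconstruction that reuses the `tokenizer` from the inference sketch above.

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")

if tokenizer.pad_token is None:          # GPT-2 has no pad token by default
    tokenizer.pad_token = tokenizer.eos_token

def compute_metrics(eval_preds):
    # Hypothetical reconstruction, not the author's published code.
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Replace -100 (ignored label positions) before decoding the references.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    result = {k: round(v * 100, 4) for k, v in result.items()}
    # Gen Len: mean count of non-padding tokens in the generated ids.
    result["gen_len"] = float(
        np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    )
    return result
```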
### Framework versions
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
## Model tree for NourFakih/Vit-GPT2-COCO2017Flickr-80k-08
- Base model: nlpconnect/vit-gpt2-image-captioning
- Fine-tuned from: NourFakih/Vit-GPT2-COCO2017Flickr-40k-04