# Vit-GPT2-COCO2017Flickr-85k-11
This model is a fine-tuned version of NourFakih/Vit-GPT2-COCO2017Flickr-85k-11 on an unknown dataset. It achieves the following results on the evaluation set:
- Gen Len: 12.1495
- Loss: 0.5306
- Rouge1: 40.0349
- Rouge2: 14.6303
- Rougel: 36.2382
- Rougelsum: 36.2213
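For inference, a checkpoint like this is typically loaded through the standard transformers VisionEncoderDecoder API. The sketch below makes that assumption; the image path and generation settings (`max_length`, `num_beams`) are illustrative, not values from this card.

```python
# Minimal captioning sketch, assuming the standard ViT-GPT2 VisionEncoderDecoder setup.
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-85k-11"
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("example.jpg").convert("RGB")  # hypothetical input image
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generation settings are illustrative assumptions, not taken from the card.
output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```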
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
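For reference, these values map onto `transformers.Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the author's training script; `output_dir` and `predict_with_generate` are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch mirroring the listed hyperparameters (the Adam betas/epsilon above are
# the transformers defaults, so no extra optimizer arguments are needed).
training_args = Seq2SeqTrainingArguments(
    output_dir="Vit-GPT2-COCO2017Flickr-85k-11",  # assumption
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,  # 4 x 4 = total train batch size of 16
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    predict_with_generate=True,     # assumption: needed to report ROUGE / Gen Len
)
```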
### Training results

| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:------:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.378 | 0.0933 | 500 | 11.7725 | 0.4693 | 40.2274 | 15.0119 | 36.4563 | 36.4656 |
| 0.3748 | 0.1866 | 1000 | 12.1668 | 0.4640 | 40.199 | 15.321 | 36.4279 | 36.4457 |
| 0.374 | 0.2799 | 1500 | 11.8 | 0.4669 | 39.9523 | 15.0587 | 36.3639 | 36.375 |
| 0.3721 | 0.3732 | 2000 | 11.2095 | 0.4645 | 40.3597 | 15.2173 | 36.6938 | 36.705 |
| 0.3673 | 0.4665 | 2500 | 11.9343 | 0.4632 | 40.3875 | 15.2532 | 36.5923 | 36.6182 |
| 0.365 | 0.5599 | 3000 | 12.2647 | 0.4623 | 39.9395 | 15.0315 | 36.1682 | 36.1781 |
| 0.3652 | 0.6532 | 3500 | 11.8965 | 0.4611 | 39.8792 | 14.9961 | 36.2488 | 36.2734 |
| 0.3601 | 0.7465 | 4000 | 12.0545 | 0.4625 | 40.57 | 15.2972 | 36.8012 | 36.8227 |
| 0.3574 | 0.8398 | 4500 | 11.7287 | 0.4608 | 40.3276 | 15.1742 | 36.7679 | 36.7575 |
| 0.351 | 0.9331 | 5000 | 11.7662 | 0.4650 | 40.7345 | 15.5295 | 37.0769 | 37.0911 |
| 0.3322 | 1.0264 | 5500 | 12.06 | 0.4831 | 40.5582 | 15.2954 | 36.6682 | 36.6694 |
| 0.2914 | 1.1197 | 6000 | 11.8405 | 0.4902 | 40.054 | 15.019 | 36.5476 | 36.556 |
| 0.2945 | 1.2130 | 6500 | 11.8422 | 0.4863 | 40.3126 | 15.3154 | 36.61 | 36.6146 |
| 0.2845 | 1.3063 | 7000 | 12.0445 | 0.4883 | 40.228 | 15.0904 | 36.3179 | 36.3086 |
| 0.2879 | 1.3996 | 7500 | 11.9358 | 0.4833 | 40.6501 | 15.5682 | 36.8945 | 36.8823 |
| 0.2859 | 1.4930 | 8000 | 12.1743 | 0.4833 | 40.3187 | 15.0418 | 36.3561 | 36.3582 |
| 0.2844 | 1.5863 | 8500 | 12.1702 | 0.4884 | 40.2896 | 15.1032 | 36.4039 | 36.3862 |
| 0.2838 | 1.6796 | 9000 | 11.9588 | 0.4902 | 40.3419 | 15.1863 | 36.4631 | 36.4728 |
| 0.2789 | 1.7729 | 9500 | 12.0567 | 0.4865 | 40.6284 | 15.3404 | 36.7035 | 36.6876 |
| 0.2758 | 1.8662 | 10000 | 11.823 | 0.4909 | 40.1138 | 14.9247 | 36.4884 | 36.4836 |
| 0.2741 | 1.9595 | 10500 | 11.9537 | 0.4892 | 40.3204 | 14.9594 | 36.539 | 36.5311 |
| 0.253 | 2.0529 | 11000 | 11.9712 | 0.5201 | 40.0224 | 14.9662 | 36.3433 | 36.3705 |
| 0.2261 | 2.1462 | 11500 | 11.8918 | 0.5248 | 39.698 | 14.3092 | 35.9144 | 35.9107 |
| 0.2245 | 2.2395 | 12000 | 12.0252 | 0.5204 | 40.136 | 14.8487 | 36.4154 | 36.3989 |
| 0.2293 | 2.3328 | 12500 | 11.8622 | 0.5261 | 39.9269 | 14.6665 | 36.2594 | 36.2517 |
| 0.2255 | 2.4261 | 13000 | 11.9165 | 0.5217 | 40.1403 | 14.7327 | 36.4161 | 36.4139 |
| 0.228 | 2.5195 | 13500 | 11.9477 | 0.5267 | 39.7979 | 14.4362 | 36.0457 | 36.0611 |
| 0.2233 | 2.6128 | 14000 | 12.0495 | 0.5299 | 39.8343 | 14.4579 | 36.0728 | 36.0824 |
| 0.2239 | 2.7062 | 14500 | 12.1308 | 0.5274 | 39.9561 | 14.5286 | 36.1101 | 36.1017 |
| 0.2254 | 2.7995 | 15000 | 12.0845 | 0.5292 | 39.9252 | 14.5215 | 36.1396 | 36.1203 |
| 0.2182 | 2.8928 | 15500 | 12.115 | 0.5297 | 39.9487 | 14.5406 | 36.1582 | 36.1321 |
| 0.221 | 2.9861 | 16000 | 12.1495 | 0.5306 | 40.0349 | 14.6303 | 36.2382 | 36.2213 |
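The ROUGE and Gen Len columns are the usual generate-then-score captioning metrics. A minimal sketch of how such numbers are commonly computed with the `evaluate` library follows; the helper is illustrative, not the author's code.

```python
import evaluate
import numpy as np

rouge = evaluate.load("rouge")

def caption_metrics(decoded_preds, decoded_labels, pred_token_ids):
    # Illustrative helper: ROUGE scaled to 0-100 (as in the table above),
    # plus "Gen Len" as the mean generated length in tokens.
    scores = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    result = {k: round(v * 100, 4) for k, v in scores.items()}
    result["gen_len"] = float(np.mean([len(ids) for ids in pred_token_ids]))
    return result
```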
### Framework versions
- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1