---
tags:
- generated_from_trainer
base_model: NourFakih/Vit-GPT2-COCO2017Flickr-85k-11
metrics:
- rouge
model-index:
- name: Vit-GPT2-COCO2017Flickr-85k-11
  results: []
---

# Vit-GPT2-COCO2017Flickr-85k-11

This model is a fine-tuned version of [NourFakih/Vit-GPT2-COCO2017Flickr-85k-11](https://huggingface.co/NourFakih/Vit-GPT2-COCO2017Flickr-85k-11) on an unknown dataset.
It achieves the following results on the evaluation set:
- Gen Len: 12.1495
- Loss: 0.5306
- Rouge1: 40.0349
- Rouge2: 14.6303
- Rougel: 36.2382
- Rougelsum: 36.2213

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0

### Training results

| Training Loss | Epoch  | Step  | Gen Len | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:------:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.378         | 0.0933 | 500   | 11.7725 | 0.4693          | 40.2274 | 15.0119 | 36.4563 | 36.4656   |
| 0.3748        | 0.1866 | 1000  | 12.1668 | 0.4640          | 40.199  | 15.321  | 36.4279 | 36.4457   |
| 0.374         | 0.2799 | 1500  | 11.8    | 0.4669          | 39.9523 | 15.0587 | 36.3639 | 36.375    |
| 0.3721        | 0.3732 | 2000  | 11.2095 | 0.4645          | 40.3597 | 15.2173 | 36.6938 | 36.705    |
| 0.3673        | 0.4665 | 2500  | 11.9343 | 0.4632          | 40.3875 | 15.2532 | 36.5923 | 36.6182   |
| 0.365         | 0.5599 | 3000  | 12.2647 | 0.4623          | 39.9395 | 15.0315 | 36.1682 | 36.1781   |
| 0.3652        | 0.6532 | 3500  | 11.8965 | 0.4611          | 39.8792 | 14.9961 | 36.2488 | 36.2734   |
| 0.3601        | 0.7465 | 4000  | 12.0545 | 0.4625          | 40.57   | 15.2972 | 36.8012 | 36.8227   |
| 0.3574        | 0.8398 | 4500  | 11.7287 | 0.4608          | 40.3276 | 15.1742 | 36.7679 | 36.7575   |
| 0.351         | 0.9331 | 5000  | 11.7662 | 0.4650          | 40.7345 | 15.5295 | 37.0769 | 37.0911   |
| 0.3322        | 1.0264 | 5500  | 12.06   | 0.4831          | 40.5582 | 15.2954 | 36.6682 | 36.6694   |
| 0.2914        | 1.1197 | 6000  | 11.8405 | 0.4902          | 40.054  | 15.019  | 36.5476 | 36.556    |
| 0.2945        | 1.2130 | 6500  | 11.8422 | 0.4863          | 40.3126 | 15.3154 | 36.61   | 36.6146   |
| 0.2845        | 1.3063 | 7000  | 12.0445 | 0.4883          | 40.228  | 15.0904 | 36.3179 | 36.3086   |
| 0.2879        | 1.3996 | 7500  | 11.9358 | 0.4833          | 40.6501 | 15.5682 | 36.8945 | 36.8823   |
| 0.2859        | 1.4930 | 8000  | 12.1743 | 0.4833          | 40.3187 | 15.0418 | 36.3561 | 36.3582   |
| 0.2844        | 1.5863 | 8500  | 12.1702 | 0.4884          | 40.2896 | 15.1032 | 36.4039 | 36.3862   |
| 0.2838        | 1.6796 | 9000  | 11.9588 | 0.4902          | 40.3419 | 15.1863 | 36.4631 | 36.4728   |
| 0.2789        | 1.7729 | 9500  | 12.0567 | 0.4865          | 40.6284 | 15.3404 | 36.7035 | 36.6876   |
| 0.2758        | 1.8662 | 10000 | 11.823  | 0.4909          | 40.1138 | 14.9247 | 36.4884 | 36.4836   |
| 0.2741        | 1.9595 | 10500 | 11.9537 | 0.4892          | 40.3204 | 14.9594 | 36.539  | 36.5311   |
| 0.253         | 2.0529 | 11000 | 11.9712 | 0.5201          | 40.0224 | 14.9662 | 36.3433 | 36.3705   |
| 0.2261        | 2.1462 | 11500 | 11.8918 | 0.5248          | 39.698  | 14.3092 | 35.9144 | 35.9107   |
| 0.2245        | 2.2395 | 12000 | 12.0252 | 0.5204          | 40.136  | 14.8487 | 36.4154 | 36.3989   |
| 0.2293        | 2.3328 | 12500 | 11.8622 | 0.5261          | 39.9269 | 14.6665 | 36.2594 | 36.2517   |
| 0.2255        | 2.4261 | 13000 | 11.9165 | 0.5217          | 40.1403 | 14.7327 | 36.4161 | 36.4139   |
| 0.228         | 2.5195 | 13500 | 11.9477 | 0.5267          | 39.7979 | 14.4362 | 36.0457 | 36.0611   |
| 0.2233        | 2.6128 | 14000 | 12.0495 | 0.5299          | 39.8343 | 14.4579 | 36.0728 | 36.0824   |
| 0.2239        | 2.7062 | 14500 | 12.1308 | 0.5274          | 39.9561 | 14.5286 | 36.1101 | 36.1017   |
| 0.2254        | 2.7995 | 15000 | 12.0845 | 0.5292          | 39.9252 | 14.5215 | 36.1396 | 36.1203   |
| 0.2182        | 2.8928 | 15500 | 12.115  | 0.5297          | 39.9487 | 14.5406 | 36.1582 | 36.1321   |
| 0.221         | 2.9861 | 16000 | 12.1495 | 0.5306          | 40.0349 | 14.6303 | 36.2382 | 36.2213   |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
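
### Worked example: effective batch size and steps per epoch

The hyperparameters and training log above can be cross-checked with a little arithmetic. The sketch below is illustrative and not part of the original training code. It assumes a single training device, so that `total_train_batch_size` equals `train_batch_size * gradient_accumulation_steps`, and it uses the last logged row (step 16000 at epoch 2.9861) to estimate steps per epoch. Because the logged epoch value is rounded, the derived numbers are approximate.

```python
# Values copied from the hyperparameter list and the last row of the results table.
train_batch_size = 4
gradient_accumulation_steps = 4
last_step = 16000
last_epoch = 2.9861

# Effective batch size per optimizer step (assumption: one training device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

# Optimizer steps needed to complete one epoch, estimated from the log.
steps_per_epoch = last_step / last_epoch

# Rough number of training examples seen per epoch.
approx_train_examples = steps_per_epoch * total_train_batch_size

print(f"effective batch size: {total_train_batch_size}")
print(f"steps per epoch (approx.): {steps_per_epoch:.0f}")
print(f"training examples per epoch (approx.): {approx_train_examples:.0f}")
```

The effective batch size comes out to 16, matching the reported `total_train_batch_size`. The estimated epoch size is roughly 85 to 86 thousand examples, which appears consistent with the "85k" in the model name, although the card itself does not state the dataset size.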