---
license: apache-2.0
tags:
- generated_from_trainer
base_model: NourFakih/Vit-GPT2-COCO2017Flickr-40k-04
metrics:
- rouge
model-index:
- name: Vit-GPT2-COCO2017Flickr-80k-08
  results: []
---

# Vit-GPT2-COCO2017Flickr-80k-08

This model is a fine-tuned version of [NourFakih/Vit-GPT2-COCO2017Flickr-40k-04](https://huggingface.co/NourFakih/Vit-GPT2-COCO2017Flickr-40k-04) on an unknown dataset.
It achieves the following results on the evaluation set:
- Gen Len: 12.0243
- Loss: 0.5354
- Rouge1: 40.114
- Rouge2: 14.6699
- Rougel: 36.1001
- Rougelsum: 36.1128

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0

### Training results

| Training Loss | Epoch | Step  | Gen Len | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.3691        | 0.1   | 500   | 11.7758 | 0.4730          | 39.8086 | 14.7674 | 36.1546 | 36.1739   |
| 0.3706        | 0.2   | 1000  | 11.5977 | 0.4739          | 39.8972 | 14.9064 | 36.1193 | 36.138    |
| 0.3709        | 0.3   | 1500  | 11.7103 | 0.4759          | 39.9874 | 14.8528 | 36.3155 | 36.3317   |
| 0.3721        | 0.4   | 2000  | 12.175  | 0.4678          | 39.7192 | 14.5844 | 35.8447 | 35.8728   |
| 0.3655        | 0.5   | 2500  | 11.9002 | 0.4684          | 40.3132 | 15.1157 | 36.5749 | 36.5823   |
| 0.3623        | 0.6   | 3000  | 12.025  | 0.4672          | 40.1643 | 14.978  | 36.3002 | 36.3232   |
| 0.3676        | 0.7   | 3500  | 11.815  | 0.4623          | 40.5036 | 15.3751 | 36.8369 | 36.867    |
| 0.3613        | 0.8   | 4000  | 12.054  | 0.4647          | 40.4078 | 15.3105 | 36.65   | 36.6732   |
| 0.3539        | 0.9   | 4500  | 11.904  | 0.4634          | 40.3794 | 15.233  | 36.7155 | 36.7435   |
| 0.3481        | 1.0   | 5000  | 11.738  | 0.4644          | 40.037  | 14.8477 | 36.3648 | 36.3903   |
| 0.2889        | 1.1   | 5500  | 11.55   | 0.4897          | 40.1394 | 14.7595 | 36.4428 | 36.4696   |
| 0.2908        | 1.2   | 6000  | 11.9823 | 0.4865          | 40.0479 | 14.8181 | 36.316  | 36.3519   |
| 0.2882        | 1.3   | 6500  | 11.7945 | 0.4863          | 40.5912 | 15.3128 | 36.7638 | 36.7755   |
| 0.2901        | 1.4   | 7000  | 11.87   | 0.4868          | 40.3138 | 14.9695 | 36.5032 | 36.5211   |
| 0.2857        | 1.5   | 7500  | 11.776  | 0.4834          | 40.2242 | 14.9881 | 36.5381 | 36.5607   |
| 0.279         | 1.6   | 8000  | 12.0132 | 0.4999          | 40.2751 | 15.0173 | 36.4172 | 36.4257   |
| 0.281         | 1.7   | 8500  | 11.7685 | 0.4951          | 40.1172 | 14.8119 | 36.2966 | 36.296    |
| 0.2831        | 1.8   | 9000  | 12.2293 | 0.4979          | 39.9913 | 14.7427 | 36.1539 | 36.1517   |
| 0.2799        | 1.9   | 9500  | 11.8718 | 0.4911          | 40.5123 | 15.09   | 36.7528 | 36.7622   |
| 0.2778        | 2.0   | 10000 | 12.0262 | 0.4929          | 40.5005 | 15.1027 | 36.6202 | 36.6327   |
| 0.2318        | 2.1   | 10500 | 12.133  | 0.5237          | 40.1565 | 14.8022 | 36.1946 | 36.2074   |
| 0.2279        | 2.2   | 11000 | 11.92   | 0.5278          | 40.5801 | 15.0843 | 36.7832 | 36.8021   |
| 0.2272        | 2.3   | 11500 | 11.8057 | 0.5284          | 40.2332 | 14.8728 | 36.4401 | 36.4343   |
| 0.2308        | 2.4   | 12000 | 11.9518 | 0.5263          | 39.9961 | 14.6475 | 36.035  | 36.0528   |
| 0.2262        | 2.5   | 12500 | 11.9347 | 0.5322          | 40.3373 | 14.9137 | 36.3692 | 36.3718   |
| 0.2233        | 2.6   | 13000 | 11.9147 | 0.5329          | 40.1924 | 14.776  | 36.1644 | 36.1593   |
| 0.223         | 2.7   | 13500 | 11.9927 | 0.5370          | 40.3211 | 14.9563 | 36.3211 | 36.3345   |
| 0.2241        | 2.8   | 14000 | 11.9367 | 0.5365          | 40.0897 | 14.6372 | 36.1484 | 36.1606   |
| 0.2257        | 2.9   | 14500 | 12.0407 | 0.5332          | 40.2316 | 14.741  | 36.1795 | 36.1866   |
| 0.2201        | 3.0   | 15000 | 12.0243 | 0.5354          | 40.114  | 14.6699 | 36.1001 | 36.1128   |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1
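
The training hyperparameters and the step counts in the results table can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not taken from the training script. It assumes a single device (so the total batch size is just the per-device batch size times the accumulation steps) and a linear schedule with no warmup, since the card lists none. The helper name `linear_lr` is hypothetical.

```python
# Values copied from the hyperparameter list and the results table in this card.
train_batch_size = 4
gradient_accumulation_steps = 4
learning_rate = 5e-05
num_epochs = 3
steps_per_epoch = 5000  # the table reports step 5000 at epoch 1.0
total_steps = steps_per_epoch * num_epochs  # 15000, the last row of the table

# Assumption: one device, so no extra multiplier for data parallelism.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Approximate number of training samples seen per epoch.
samples_per_epoch = steps_per_epoch * effective_batch_size


def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Learning rate at an optimizer step under linear decay to zero.

    Assumption: warmup defaults to 0 because the card does not list any.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size)  # matches total_train_batch_size in the card
print(samples_per_epoch)
print(linear_lr(total_steps // 2))
```

Under these assumptions the effective batch size is 16, matching `total_train_batch_size`, and one epoch covers roughly 80,000 samples. That figure is consistent with the "80k" in the model name, although the card itself does not state the dataset size.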