Vit-GPT2-COCO2017Flickr-80k-08

This model is a fine-tuned version of NourFakih/Vit-GPT2-COCO2017Flickr-80k-08 on an unknown dataset. It achieves the following results on the evaluation set:

  • Gen Len: 12.0243
  • Loss: 0.5354
  • Rouge1: 40.114
  • Rouge2: 14.6699
  • Rougel: 36.1001
  • Rougelsum: 36.1128

Model description

No author-written description is provided. The repository name indicates an image-captioning model pairing a ViT image encoder with a GPT-2 text decoder, fine-tuned on COCO 2017 and Flickr caption data (the "80k" in the name presumably refers to the size of the training split).

Intended uses & limitations

More information needed
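
Pending author documentation, the checkpoint can presumably be loaded with the standard VisionEncoderDecoderModel captioning API, as in the minimal sketch below. The repo id is taken from the card title; the image path, beam count, and max_length are illustrative assumptions (max_length ~16 is consistent with the reported Gen Len of ~12 tokens):

```python
# Minimal inference sketch, assuming the checkpoint follows the standard
# ViT-encoder / GPT-2-decoder captioning setup in transformers.
import torch
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-80k-08"  # assumed repo id from the card title

model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("example.jpg").convert("RGB")  # placeholder image path
pixel_values = processor(images=image, return_tensors="pt").pixel_values.to(device)

# Beam search settings are illustrative, not values from this card.
output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```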

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
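
As a reading aid, the list above maps onto transformers' Seq2SeqTrainingArguments roughly as sketched below; output_dir and predict_with_generate are assumptions not stated in this card (the latter is typically required to report Gen Len and ROUGE during evaluation):

```python
# A minimal sketch, assuming the run used transformers' Seq2SeqTrainer.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./Vit-GPT2-COCO2017Flickr-80k-08",  # hypothetical path
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # effective train batch size: 4 * 4 = 16
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    # The Adam betas/epsilon listed above match transformers' AdamW defaults.
    predict_with_generate=True,  # assumption: needed for Gen Len / ROUGE eval
)
```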

Training results

| Training Loss | Epoch | Step | Gen Len | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
|:-------------:|:-----:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.3691 | 0.1 | 500 | 11.7758 | 0.4730 | 39.8086 | 14.7674 | 36.1546 | 36.1739 |
| 0.3706 | 0.2 | 1000 | 11.5977 | 0.4739 | 39.8972 | 14.9064 | 36.1193 | 36.138 |
| 0.3709 | 0.3 | 1500 | 11.7103 | 0.4759 | 39.9874 | 14.8528 | 36.3155 | 36.3317 |
| 0.3721 | 0.4 | 2000 | 12.175 | 0.4678 | 39.7192 | 14.5844 | 35.8447 | 35.8728 |
| 0.3655 | 0.5 | 2500 | 11.9002 | 0.4684 | 40.3132 | 15.1157 | 36.5749 | 36.5823 |
| 0.3623 | 0.6 | 3000 | 12.025 | 0.4672 | 40.1643 | 14.978 | 36.3002 | 36.3232 |
| 0.3676 | 0.7 | 3500 | 11.815 | 0.4623 | 40.5036 | 15.3751 | 36.8369 | 36.867 |
| 0.3613 | 0.8 | 4000 | 12.054 | 0.4647 | 40.4078 | 15.3105 | 36.65 | 36.6732 |
| 0.3539 | 0.9 | 4500 | 11.904 | 0.4634 | 40.3794 | 15.233 | 36.7155 | 36.7435 |
| 0.3481 | 1.0 | 5000 | 11.738 | 0.4644 | 40.037 | 14.8477 | 36.3648 | 36.3903 |
| 0.2889 | 1.1 | 5500 | 11.55 | 0.4897 | 40.1394 | 14.7595 | 36.4428 | 36.4696 |
| 0.2908 | 1.2 | 6000 | 11.9823 | 0.4865 | 40.0479 | 14.8181 | 36.316 | 36.3519 |
| 0.2882 | 1.3 | 6500 | 11.7945 | 0.4863 | 40.5912 | 15.3128 | 36.7638 | 36.7755 |
| 0.2901 | 1.4 | 7000 | 11.87 | 0.4868 | 40.3138 | 14.9695 | 36.5032 | 36.5211 |
| 0.2857 | 1.5 | 7500 | 11.776 | 0.4834 | 40.2242 | 14.9881 | 36.5381 | 36.5607 |
| 0.279 | 1.6 | 8000 | 12.0132 | 0.4999 | 40.2751 | 15.0173 | 36.4172 | 36.4257 |
| 0.281 | 1.7 | 8500 | 11.7685 | 0.4951 | 40.1172 | 14.8119 | 36.2966 | 36.296 |
| 0.2831 | 1.8 | 9000 | 12.2293 | 0.4979 | 39.9913 | 14.7427 | 36.1539 | 36.1517 |
| 0.2799 | 1.9 | 9500 | 11.8718 | 0.4911 | 40.5123 | 15.09 | 36.7528 | 36.7622 |
| 0.2778 | 2.0 | 10000 | 12.0262 | 0.4929 | 40.5005 | 15.1027 | 36.6202 | 36.6327 |
| 0.2318 | 2.1 | 10500 | 12.133 | 0.5237 | 40.1565 | 14.8022 | 36.1946 | 36.2074 |
| 0.2279 | 2.2 | 11000 | 11.92 | 0.5278 | 40.5801 | 15.0843 | 36.7832 | 36.8021 |
| 0.2272 | 2.3 | 11500 | 11.8057 | 0.5284 | 40.2332 | 14.8728 | 36.4401 | 36.4343 |
| 0.2308 | 2.4 | 12000 | 11.9518 | 0.5263 | 39.9961 | 14.6475 | 36.035 | 36.0528 |
| 0.2262 | 2.5 | 12500 | 11.9347 | 0.5322 | 40.3373 | 14.9137 | 36.3692 | 36.3718 |
| 0.2233 | 2.6 | 13000 | 11.9147 | 0.5329 | 40.1924 | 14.776 | 36.1644 | 36.1593 |
| 0.223 | 2.7 | 13500 | 11.9927 | 0.5370 | 40.3211 | 14.9563 | 36.3211 | 36.3345 |
| 0.2241 | 2.8 | 14000 | 11.9367 | 0.5365 | 40.0897 | 14.6372 | 36.1484 | 36.1606 |
| 0.2257 | 2.9 | 14500 | 12.0407 | 0.5332 | 40.2316 | 14.741 | 36.1795 | 36.1866 |
| 0.2201 | 3.0 | 15000 | 12.0243 | 0.5354 | 40.114 | 14.6699 | 36.1001 | 36.1128 |
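
The ROUGE columns above are on a 0–100 scale, as typically produced by scaling the scores from the `evaluate` library's rouge metric by 100. A sketch of that computation follows; the prediction and reference strings are placeholders, not outputs of this model:

```python
# A sketch of the usual ROUGE computation for caption evaluation.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["a dog runs across a grassy field"],   # placeholder caption
    references=["a dog is running through the grass"],  # placeholder reference
)
# evaluate returns fractions in [0, 1]; the card reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```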

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1
Model size

239M parameters (safetensors, F32)
