
Vit-GPT2-COCO2017Flickr-85k-09

This model is a fine-tuned version of NourFakih/Vit-GPT2-COCO2017Flickr-85k-09 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6343
  • ROUGE-1: 38.8156
  • ROUGE-2: 13.6737
  • ROUGE-L: 34.9479
  • ROUGE-Lsum: 34.9604
  • Gen Len: 12.1285
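Since usage details are sparse, here is a minimal inference sketch, assuming the checkpoint follows the standard ViT-GPT2 VisionEncoderDecoder captioning setup and ships its own image processor and tokenizer; the image path is a placeholder.

```python
# Minimal captioning sketch. Assumptions: the repository bundles its own
# image processor and tokenizer; "example.jpg" is a placeholder path.
from PIL import Image
from transformers import AutoTokenizer, ViTImageProcessor, VisionEncoderDecoderModel

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-85k-09"
model = VisionEncoderDecoderModel.from_pretrained(model_id)
processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Beam search with a short max length, in line with the ~12-token Gen Len above.
output_ids = model.generate(pixel_values, max_length=16, num_beams=4)
caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```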

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
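
The values above map onto transformers' Seq2SeqTrainingArguments roughly as follows. This is a reconstruction from the bullet list, not the author's actual training script; output_dir is a placeholder, and the Adam settings listed are the Trainer's AdamW defaults.

```python
from transformers import Seq2SeqTrainingArguments

# Reconstructed from the hyperparameter list above; output_dir is a placeholder.
training_args = Seq2SeqTrainingArguments(
    output_dir="vit-gpt2-coco2017flickr",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,   # 4 x 4 = total train batch size of 16
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer's AdamW defaults.
)
```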

Training results

| Training Loss | Epoch  | Step  | Gen Len | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum |
|:-------------:|:------:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:----------:|
| 0.2429        | 0.0933 | 500   | 11.738  | 0.5351          | 39.4446 | 14.1599 | 35.6123 | 35.5846    |
| 0.2537        | 0.1866 | 1000  | 12.3488 | 0.5301          | 39.5332 | 14.4745 | 35.644  | 35.6159    |
| 0.2564        | 0.2799 | 1500  | 12.2455 | 0.5198          | 39.8297 | 14.555  | 35.8598 | 35.8344    |
| 0.2585        | 0.3732 | 2000  | 11.8575 | 0.5207          | 39.4558 | 14.0496 | 35.5597 | 35.526     |
| 0.2579        | 0.4665 | 2500  | 11.9733 | 0.5188          | 39.1359 | 14.125  | 35.4068 | 35.3709    |
| 0.2588        | 0.5599 | 3000  | 12.278  | 0.5196          | 39.0831 | 14.0658 | 35.4608 | 35.4283    |
| 0.2618        | 0.6532 | 3500  | 11.9942 | 0.5194          | 39.751  | 14.443  | 36.076  | 36.0475    |
| 0.2579        | 0.7465 | 4000  | 12.0512 | 0.5102          | 39.7601 | 14.5095 | 36.0252 | 35.9857    |
| 0.2569        | 0.8398 | 4500  | 11.6483 | 0.5199          | 39.398  | 13.8871 | 35.7218 | 35.6911    |
| 0.253         | 0.9331 | 5000  | 12.0198 | 0.5200          | 39.8951 | 14.4146 | 35.883  | 35.8507    |
| 0.2361        | 1.0264 | 5500  | 12.183  | 0.5605          | 39.3352 | 14.2234 | 35.3107 | 35.2772    |
| 0.2           | 1.1197 | 6000  | 11.8598 | 0.5702          | 39.2184 | 14.0096 | 35.5475 | 35.5042    |
| 0.2034        | 1.2130 | 6500  | 11.878  | 0.5543          | 39.7118 | 14.2757 | 35.7613 | 35.7316    |
| 0.1968        | 1.3063 | 7000  | 12.1725 | 0.5584          | 39.1847 | 13.9003 | 35.3962 | 35.3713    |
| 0.1986        | 1.3996 | 7500  | 11.8395 | 0.5572          | 39.4428 | 14.2672 | 35.7359 | 35.7093    |
| 0.1988        | 1.4930 | 8000  | 11.9932 | 0.5552          | 39.2719 | 14.0411 | 35.482  | 35.4833    |
| 0.1971        | 1.5864 | 8500  | 12.1003 | 0.5572          | 39.2681 | 14.1036 | 35.4466 | 35.4245    |
| 0.1978        | 1.6797 | 9000  | 12.1152 | 0.5667          | 39.2673 | 14.0918 | 35.4179 | 35.4169    |
| 0.1937        | 1.7730 | 9500  | 12.2208 | 0.5781          | 39.4115 | 14.1115 | 35.6952 | 35.6834    |
| 0.1897        | 1.8663 | 10000 | 11.8818 | 0.5754          | 39.2059 | 14.076  | 35.3392 | 35.3332    |
| 0.1898        | 1.9596 | 10500 | 11.8818 | 0.5720          | 39.4033 | 14.1447 | 35.598  | 35.5976    |
| 0.1685        | 2.0529 | 11000 | 12.0585 | 0.6186          | 38.4626 | 13.4695 | 34.7378 | 34.7294    |
| 0.1454        | 2.1462 | 11500 | 11.9448 | 0.6147          | 38.5335 | 13.5152 | 34.7075 | 34.7033    |
| 0.1434        | 2.2395 | 12000 | 12.1855 | 0.6229          | 39.0044 | 13.9276 | 35.2226 | 35.2116    |
| 0.1479        | 2.3328 | 12500 | 12.0273 | 0.6262          | 38.6281 | 13.5737 | 34.8247 | 34.8245    |
| 0.1452        | 2.4261 | 13000 | 12.0222 | 0.6243          | 38.9136 | 13.6727 | 35.0597 | 35.0643    |
| 0.1464        | 2.5195 | 13500 | 12.006  | 0.6309          | 38.9915 | 13.5041 | 34.9971 | 34.9991    |
| 0.1431        | 2.6128 | 14000 | 12.0602 | 0.6318          | 38.7595 | 13.5585 | 34.8308 | 34.834     |
| 0.1431        | 2.7061 | 14500 | 12.229  | 0.6277          | 38.8899 | 13.7343 | 34.9536 | 34.9513    |
| 0.1445        | 2.7995 | 15000 | 12.0343 | 0.6357          | 38.7681 | 13.5849 | 34.9764 | 34.9564    |
| 0.1379        | 2.8928 | 15500 | 12.1242 | 0.6340          | 38.9196 | 13.6285 | 34.9761 | 34.9855    |
| 0.1411        | 2.9861 | 16000 | 12.1285 | 0.6343          | 38.8156 | 13.6737 | 34.9479 | 34.9604    |
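
The ROUGE columns above can be reproduced with the evaluate library. A minimal sketch, assuming the predictions and references are lists of caption strings (the example captions below are hypothetical); note the card reports scores scaled by 100, while evaluate returns fractions in [0, 1].

```python
# pip install evaluate rouge_score
import evaluate

# Hypothetical caption lists; in practice, predictions come from
# model.generate() and references from the evaluation split.
predictions = ["a dog runs on the beach"]
references = ["a dog running along the beach"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # rouge1, rouge2, rougeL, rougeLsum as fractions in [0, 1]
```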

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1