---
license: apache-2.0
base_model: NourFakih/image-captioning-Vit-GPT2-Flickr8k
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: Vit-GPT2-COCO2017Sample-Flickr8k
    results: []
---

# Vit-GPT2-COCO2017Sample-Flickr8k

This model is a fine-tuned version of [NourFakih/image-captioning-Vit-GPT2-Flickr8k](https://huggingface.co/NourFakih/image-captioning-Vit-GPT2-Flickr8k) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.2344
- Rouge1: 41.2779
- Rouge2: 15.8081
- Rougel: 37.3177
- Rougelsum: 37.2772
- Gen Len: 11.568

## Model description

More information needed

## Intended uses & limitations

More information needed
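
The card gives no usage details. As a minimal inference sketch — assuming the repo id `NourFakih/Vit-GPT2-COCO2017Sample-Flickr8k` (taken from the model-index name above) and the standard ViT-encoder/GPT-2-decoder captioning setup of the base model — the model can be driven through the `image-to-text` pipeline:

```python
from transformers import pipeline  # requires transformers and Pillow


def caption_image(image_path: str) -> str:
    """Generate a caption for a local image file.

    The repo id below is assumed from the model-index name; adjust if the
    hosted repository uses a different id.
    """
    captioner = pipeline(
        "image-to-text",
        model="NourFakih/Vit-GPT2-COCO2017Sample-Flickr8k",
    )
    # The pipeline returns a list of dicts like [{"generated_text": "..."}].
    return captioner(image_path)[0]["generated_text"]


if __name__ == "__main__":
    print(caption_image("example.jpg"))
```

Note that, per the Gen Len figures above, generated captions average roughly 11 tokens.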

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
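
With a linear scheduler and no stated warmup, the learning rate decays from 5e-05 at step 0 to 0 at the final step. A small sketch of that decay — the total step count (~18,700) is an assumption inferred from the results table below, which logs ~6,230 steps per epoch over 3 epochs:

```python
BASE_LR = 5e-5        # learning_rate from the hyperparameters above
TOTAL_STEPS = 18_700  # assumed: ~6,230 steps/epoch x 3 epochs, inferred from the table


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linear decay from base_lr at step 0 to 0.0 at total_steps (zero warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


print(linear_lr(0))        # 5e-05 at the start of training
print(linear_lr(9_350))    # 2.5e-05 halfway through
print(linear_lr(18_700))   # 0.0 at the end
```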

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.2485        | 0.08  | 500   | 0.2394          | 40.2445 | 13.9525 | 36.1329 | 36.1261   | 11.382  |
| 0.2336        | 0.16  | 1000  | 0.2376          | 39.9631 | 14.4981 | 36.1164 | 36.1265   | 11.346  |
| 0.2282        | 0.24  | 1500  | 0.2367          | 40.3107 | 14.533  | 36.489  | 36.5082   | 11.07   |
| 0.2249        | 0.32  | 2000  | 0.2338          | 41.0525 | 15.3076 | 37.0189 | 37.0365   | 10.87   |
| 0.2302        | 0.4   | 2500  | 0.2329          | 40.7052 | 14.8288 | 36.9272 | 36.9197   | 11.31   |
| 0.2255        | 0.48  | 3000  | 0.2321          | 40.6896 | 15.2723 | 36.9654 | 36.9515   | 11.05   |
| 0.2225        | 0.56  | 3500  | 0.2305          | 40.705  | 15.4878 | 37.0456 | 37.0235   | 10.946  |
| 0.2233        | 0.64  | 4000  | 0.2303          | 41.0229 | 15.179  | 37.081  | 37.0924   | 11.25   |
| 0.2177        | 0.72  | 4500  | 0.2307          | 40.0156 | 14.2972 | 36.0288 | 36.043    | 11.08   |
| 0.2159        | 0.8   | 5000  | 0.2298          | 40.4042 | 15.2531 | 36.6967 | 36.7003   | 11.336  |
| 0.2189        | 0.88  | 5500  | 0.2282          | 40.167  | 14.4847 | 36.3855 | 36.3742   | 11.39   |
| 0.2171        | 0.96  | 6000  | 0.2269          | 40.8528 | 15.1811 | 37.0586 | 37.0403   | 11.002  |
| 0.1962        | 1.04  | 6500  | 0.2296          | 40.6676 | 14.9888 | 36.7796 | 36.7703   | 11.598  |
| 0.1835        | 1.12  | 7000  | 0.2311          | 40.6188 | 15.2743 | 36.8519 | 36.8263   | 11.022  |
| 0.1835        | 1.2   | 7500  | 0.2289          | 40.6466 | 15.1727 | 36.6626 | 36.6427   | 11.248  |
| 0.1864        | 1.28  | 8000  | 0.2298          | 40.2408 | 15.0179 | 36.5594 | 36.5756   | 11.408  |
| 0.1838        | 1.36  | 8500  | 0.2295          | 41.0772 | 15.2152 | 37.0647 | 37.0648   | 11.238  |
| 0.1827        | 1.44  | 9000  | 0.2299          | 40.3263 | 14.9976 | 36.6444 | 36.6292   | 11.28   |
| 0.1828        | 1.52  | 9500  | 0.2299          | 40.9308 | 15.181  | 36.9028 | 36.8909   | 11.132  |
| 0.179         | 1.61  | 10000 | 0.2287          | 40.7406 | 15.2746 | 36.85   | 36.8748   | 11.164  |
| 0.1849        | 1.69  | 10500 | 0.2281          | 40.931  | 15.6479 | 37.0222 | 37.0071   | 10.988  |
| 0.1794        | 1.77  | 11000 | 0.2281          | 41.5198 | 15.9659 | 37.3709 | 37.386    | 11.218  |
| 0.1787        | 1.85  | 11500 | 0.2278          | 40.4006 | 14.9496 | 36.4608 | 36.4675   | 11.274  |
| 0.1798        | 1.93  | 12000 | 0.2279          | 41.3118 | 15.4673 | 37.4917 | 37.5101   | 11.154  |
| 0.1803        | 2.01  | 12500 | 0.2282          | 40.5652 | 15.1467 | 36.7946 | 36.7809   | 11.23   |
| 0.1519        | 2.09  | 13000 | 0.2361          | 40.8978 | 15.0865 | 36.7157 | 36.728    | 11.498  |
| 0.1515        | 2.17  | 13500 | 0.2360          | 40.9809 | 15.5877 | 37.0104 | 36.9942   | 11.37   |
| 0.1519        | 2.25  | 14000 | 0.2359          | 40.7947 | 15.3254 | 36.9574 | 36.9431   | 11.504  |
| 0.1543        | 2.33  | 14500 | 0.2346          | 40.7724 | 15.1837 | 36.9003 | 36.848    | 11.586  |
| 0.1548        | 2.41  | 15000 | 0.2355          | 40.7237 | 15.2394 | 37.0767 | 37.0405   | 11.294  |
| 0.1507        | 2.49  | 15500 | 0.2353          | 41.2661 | 15.7703 | 37.3669 | 37.32     | 11.308  |
| 0.1512        | 2.57  | 16000 | 0.2351          | 40.8777 | 15.2821 | 36.9591 | 36.9201   | 11.43   |
| 0.1525        | 2.65  | 16500 | 0.2350          | 40.6184 | 15.1824 | 36.655  | 36.6117   | 11.402  |
| 0.1522        | 2.73  | 17000 | 0.2343          | 41.2818 | 15.7174 | 37.3059 | 37.2695   | 11.502  |
| 0.1544        | 2.81  | 17500 | 0.2349          | 41.0821 | 15.5164 | 37.2206 | 37.1663   | 11.542  |
| 0.1498        | 2.89  | 18000 | 0.2346          | 41.2128 | 15.6698 | 37.2279 | 37.1874   | 11.582  |
| 0.1497        | 2.97  | 18500 | 0.2344          | 41.2779 | 15.8081 | 37.3177 | 37.2772   | 11.568  |
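
Validation loss bottoms out at 0.2269 (step 6000, end of epoch 1) and drifts slightly upward through epoch 3 even as training loss keeps falling, which hints at mild overfitting. If intermediate checkpoints were kept, a simple way to choose among them is to minimize validation loss; a sketch over a few (step, loss) pairs transcribed from the table:

```python
# A few (step, validation_loss) pairs transcribed from the results table above.
eval_log = [
    (500, 0.2394),
    (6000, 0.2269),
    (12000, 0.2279),
    (18500, 0.2344),
]

# Select the checkpoint whose logged validation loss is lowest.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(best_step, best_loss)  # 6000 0.2269
```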

## Framework versions

- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.18.0
- Tokenizers 0.15.2