---
tags:
- generated_from_trainer
base_model: NourFakih/Vit-GPT2-COCO2017Flickr-85k-11
metrics:
- rouge
model-index:
- name: Vit-GPT2-COCO2017Flickr-85k-11
  results: []
---


# Vit-GPT2-COCO2017Flickr-85k-11

This model is a fine-tuned version of [NourFakih/Vit-GPT2-COCO2017Flickr-85k-11](https://huggingface.co/NourFakih/Vit-GPT2-COCO2017Flickr-85k-11) on an unknown dataset.
It achieves the following results on the evaluation set:
- Gen Len: 12.1495
- Loss: 0.5306
- Rouge1: 40.0349
- Rouge2: 14.6303
- Rougel: 36.2382
- Rougelsum: 36.2213
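
The checkpoint follows the standard ViT-encoder / GPT-2-decoder `VisionEncoderDecoderModel` layout, so it should load with the usual image-captioning recipe. A minimal inference sketch (the image path is a placeholder, and loading the processor from this repo assumes a bundled `preprocessor_config.json`):

```python
# Minimal captioning sketch for a ViT-GPT2 VisionEncoderDecoderModel checkpoint.
# The image path is a placeholder; generation settings are illustrative.
import torch
from PIL import Image
from transformers import VisionEncoderDecoderModel, ViTImageProcessor, AutoTokenizer

model_id = "NourFakih/Vit-GPT2-COCO2017Flickr-85k-11"
model = VisionEncoderDecoderModel.from_pretrained(model_id)
image_processor = ViTImageProcessor.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

image = Image.open("example.jpg").convert("RGB")  # placeholder image
pixel_values = image_processor(images=image, return_tensors="pt").pixel_values.to(device)

with torch.no_grad():
    output_ids = model.generate(pixel_values, max_length=16, num_beams=4)

caption = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```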

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3.0
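
As a rough reconstruction, these settings map onto a `Seq2SeqTrainingArguments` configuration along the following lines. This is a sketch only: the output directory and the 500-step evaluation cadence (inferred from the results table below) are assumptions, and the Adam betas/epsilon above are the library defaults, so they are not set explicitly.

```python
# Sketch of Seq2SeqTrainingArguments matching the listed hyperparameters
# (output_dir and eval cadence are assumed, not taken from the original run).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="Vit-GPT2-COCO2017Flickr-85k-11",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 4 * 4 = 16
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
    evaluation_strategy="steps",     # results table logs every 500 steps
    eval_steps=500,
    predict_with_generate=True,      # required for Gen Len / ROUGE at eval time
)
```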

### Training results

| Training Loss | Epoch  | Step  | Gen Len | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:------:|:-----:|:-------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 0.378         | 0.0933 | 500   | 11.7725 | 0.4693          | 40.2274 | 15.0119 | 36.4563 | 36.4656   |
| 0.3748        | 0.1866 | 1000  | 12.1668 | 0.4640          | 40.199  | 15.321  | 36.4279 | 36.4457   |
| 0.374         | 0.2799 | 1500  | 11.8    | 0.4669          | 39.9523 | 15.0587 | 36.3639 | 36.375    |
| 0.3721        | 0.3732 | 2000  | 11.2095 | 0.4645          | 40.3597 | 15.2173 | 36.6938 | 36.705    |
| 0.3673        | 0.4665 | 2500  | 11.9343 | 0.4632          | 40.3875 | 15.2532 | 36.5923 | 36.6182   |
| 0.365         | 0.5599 | 3000  | 12.2647 | 0.4623          | 39.9395 | 15.0315 | 36.1682 | 36.1781   |
| 0.3652        | 0.6532 | 3500  | 11.8965 | 0.4611          | 39.8792 | 14.9961 | 36.2488 | 36.2734   |
| 0.3601        | 0.7465 | 4000  | 12.0545 | 0.4625          | 40.57   | 15.2972 | 36.8012 | 36.8227   |
| 0.3574        | 0.8398 | 4500  | 11.7287 | 0.4608          | 40.3276 | 15.1742 | 36.7679 | 36.7575   |
| 0.351         | 0.9331 | 5000  | 11.7662 | 0.4650          | 40.7345 | 15.5295 | 37.0769 | 37.0911   |
| 0.3322        | 1.0264 | 5500  | 12.06   | 0.4831          | 40.5582 | 15.2954 | 36.6682 | 36.6694   |
| 0.2914        | 1.1197 | 6000  | 11.8405 | 0.4902          | 40.054  | 15.019  | 36.5476 | 36.556    |
| 0.2945        | 1.2130 | 6500  | 11.8422 | 0.4863          | 40.3126 | 15.3154 | 36.61   | 36.6146   |
| 0.2845        | 1.3063 | 7000  | 12.0445 | 0.4883          | 40.228  | 15.0904 | 36.3179 | 36.3086   |
| 0.2879        | 1.3996 | 7500  | 11.9358 | 0.4833          | 40.6501 | 15.5682 | 36.8945 | 36.8823   |
| 0.2859        | 1.4930 | 8000  | 12.1743 | 0.4833          | 40.3187 | 15.0418 | 36.3561 | 36.3582   |
| 0.2844        | 1.5863 | 8500  | 12.1702 | 0.4884          | 40.2896 | 15.1032 | 36.4039 | 36.3862   |
| 0.2838        | 1.6796 | 9000  | 11.9588 | 0.4902          | 40.3419 | 15.1863 | 36.4631 | 36.4728   |
| 0.2789        | 1.7729 | 9500  | 12.0567 | 0.4865          | 40.6284 | 15.3404 | 36.7035 | 36.6876   |
| 0.2758        | 1.8662 | 10000 | 11.823  | 0.4909          | 40.1138 | 14.9247 | 36.4884 | 36.4836   |
| 0.2741        | 1.9595 | 10500 | 11.9537 | 0.4892          | 40.3204 | 14.9594 | 36.539  | 36.5311   |
| 0.253         | 2.0529 | 11000 | 11.9712 | 0.5201          | 40.0224 | 14.9662 | 36.3433 | 36.3705   |
| 0.2261        | 2.1462 | 11500 | 11.8918 | 0.5248          | 39.698  | 14.3092 | 35.9144 | 35.9107   |
| 0.2245        | 2.2395 | 12000 | 12.0252 | 0.5204          | 40.136  | 14.8487 | 36.4154 | 36.3989   |
| 0.2293        | 2.3328 | 12500 | 11.8622 | 0.5261          | 39.9269 | 14.6665 | 36.2594 | 36.2517   |
| 0.2255        | 2.4261 | 13000 | 11.9165 | 0.5217          | 40.1403 | 14.7327 | 36.4161 | 36.4139   |
| 0.228         | 2.5195 | 13500 | 11.9477 | 0.5267          | 39.7979 | 14.4362 | 36.0457 | 36.0611   |
| 0.2233        | 2.6128 | 14000 | 12.0495 | 0.5299          | 39.8343 | 14.4579 | 36.0728 | 36.0824   |
| 0.2239        | 2.7062 | 14500 | 12.1308 | 0.5274          | 39.9561 | 14.5286 | 36.1101 | 36.1017   |
| 0.2254        | 2.7995 | 15000 | 12.0845 | 0.5292          | 39.9252 | 14.5215 | 36.1396 | 36.1203   |
| 0.2182        | 2.8928 | 15500 | 12.115  | 0.5297          | 39.9487 | 14.5406 | 36.1582 | 36.1321   |
| 0.221         | 2.9861 | 16000 | 12.1495 | 0.5306          | 40.0349 | 14.6303 | 36.2382 | 36.2213   |
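
The ROUGE and Gen Len columns are the standard text-generation evaluation metrics. A hedged sketch of how such a `compute_metrics` hook is typically wired up with the `evaluate` library (not necessarily the exact code used for this run; `tokenizer` is assumed to be the model's GPT-2 tokenizer):

```python
# Typical compute_metrics hook producing the ROUGE / Gen Len columns above
# (a common recipe, not necessarily the exact original code).
import evaluate
import numpy as np

rouge = evaluate.load("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    # Replace -100 (ignored label positions) with the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(
        predictions=decoded_preds, references=decoded_labels, use_stemmer=True
    )
    result = {k: round(v * 100, 4) for k, v in result.items()}
    # Average generated length in (non-pad) tokens.
    result["gen_len"] = float(
        np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    )
    return result
```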


### Framework versions

- Transformers 4.41.2
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.19.1