tarekziade committed on
Commit 6caea98
1 Parent(s): 6fafb82

Update README.md

Files changed (1)
  1. README.md +82 -66
README.md CHANGED
@@ -1,66 +1,82 @@
- ---
- license: apache-2.0
- base_model: mozilla/distilvit
- tags:
- - generated_from_trainer
- metrics:
- - rouge
- model-index:
- - name: distilvit
-   results: []
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # distilvit
-
- This model is a fine-tuned version of [mozilla/distilvit](https://huggingface.co/mozilla/distilvit) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Gen Len: 10.6487
- - Loss: 0.1739
- - Meteor: 0.4120
- - Rouge1: 50.0916
- - Rouge2: 24.7223
- - Rougel: 46.9416
- - Rougelsum: 46.9372
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 100
- - eval_batch_size: 100
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 1
-
- ### Training results
-
- | Training Loss | Epoch | Step | Gen Len | Validation Loss | Meteor | Rouge1 | Rouge2 | Rougel | Rougelsum |
- |:-------------:|:------:|:----:|:-------:|:---------------:|:------:|:-------:|:-------:|:-------:|:---------:|
- | No log | 0.3891 | 100 | 10.4163 | 0.1764 | 0.4117 | 50.0198 | 24.6331 | 46.9071 | 46.8907 |
- | No log | 0.7782 | 200 | 10.6487 | 0.1739 | 0.4120 | 50.0916 | 24.7223 | 46.9416 | 46.9372 |
-
-
- ### Framework versions
-
- - Transformers 4.40.2
- - Pytorch 2.3.0+cu121
- - Datasets 2.19.1
- - Tokenizers 0.19.1
+ ---
+ tags:
+ - image-to-text
+ - image-captioning
+ license: apache-2.0
+ metrics:
+ - rouge
+ datasets:
+ - nlphuji/flickr30k
+ widget:
+ - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg
+   example_title: Savanna
+ - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/football-match.jpg
+   example_title: Football Match
+ - src: https://huggingface.co/datasets/mishig/sample_images/resolve/main/airport.jpg
+   example_title: Airport
+ base_model:
+ - google/vit-base-patch16-224-in21k
+
+ model-index:
+ - name: mozilla/distilvit
+   results:
+   - task:
+       type: image-to-text
+       name: Image To Text
+     dataset:
+       name: nlphuji/flickr30k
+       type: nlphuji/flickr30k
+     metrics:
+     - name: ROUGE-1
+       type: rouge
+       value: 43.006
+       verified: true
+     - name: ROUGE-2
+       type: rouge
+       value: 16.9939
+       verified: true
+     - name: ROUGE-L
+       type: rouge
+       value: 38.8923
+       verified: true
+     - name: ROUGE-LSUM
+       type: rouge
+       value: 38.8877
+       verified: true
+     - name: loss
+       type: loss
+       value: 0.19939416646957397
+     - name: gen_len
+       type: gen_len
+       value: 11.327256736227712
+       verified: true
+ ---
+
+ # distilvit
+
+ This model is a work in progress. It is a fine-tuned version of these base models:
+
+ - a ViT model for the image encoder: https://huggingface.co/google/vit-base-patch16-224-in21k
+ - a distilled GPT-2 model for the text decoder: https://huggingface.co/distilbert/distilgpt2
+
+ This model was trained on:
+
+ - Flickr30k: https://huggingface.co/datasets/nlphuji/flickr30k
+ - COCO 2017: https://cocodataset.org
+
+ You can get that checkpoint from commit 3083a3cef6e3c8dd90df3f088074bbe836b0f403.
+
+ It was then further fine-tuned on:
+
+ - Flickr30k debiased: https://huggingface.co/datasets/Mozilla/flickr30k-transformed-captions
+ - DocOrNot: https://huggingface.co/datasets/Mozilla/docornot
+
+ You can find the code used to create the model here: https://github.com/mozilla/distilvit
+
+
+ ### Framework versions
+
+ - Transformers 4.40.2
+ - Pytorch 2.3.0+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
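
A minimal usage sketch for the checkpoint described in the updated card, assuming the published id `mozilla/distilvit` and the generic `image-to-text` pipeline from transformers (which matches the `image-to-text` tag and widget entries above):

```python
# Sketch: caption one of the widget images with the published checkpoint.
# Assumes the checkpoint id "mozilla/distilvit" and a recent transformers
# release (the card lists 4.40.2).
from transformers import pipeline

captioner = pipeline("image-to-text", model="mozilla/distilvit")

# One of the widget images declared in the card metadata.
url = "https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg"
print(captioner(url)[0]["generated_text"])
```

And, purely as an illustration of how the two base models listed in the card can be paired (the actual training code is in the linked mozilla/distilvit repository):

```python
# Sketch: pair the ViT encoder and DistilGPT2 decoder named in the card.
from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k",  # image encoder
    "distilbert/distilgpt2",              # text decoder; cross-attention weights are freshly initialized
)
# This pairing still has to be fine-tuned on caption data (as done in the
# distilvit repository) before it produces useful captions.
```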