VictorSanh committed on
Commit 7f05f40
2 Parent(s): 5608691 9d73c99

resolve conflicts

Files changed (1)
  1. README.md +36 -13
README.md CHANGED
@@ -160,20 +160,43 @@ As opposed to Flamingo, we did not train IDEFICS on video-text pairs datasets, a
 
  <img src="./assets/Figure_Evals_IDEFIX.png" width="55%">
 
- TODO: update this table
- | Model | Shots | VQAv2 (OE VQA acc) | OKVQA (OE VQA acc) | TextVQA (OE VQA acc) | VizWiz (OE VQA acc) | TextCaps (CIDEr) | Coco (CIDEr) | NoCaps (CIDEr) | Flickr (CIDEr) | ImageNet1k (accuracy) | VisDial (NDCG) | HatefulMemes (ROC AUC) | ScienceQA (accuracy) | RenderedSST2 (accuracy) | Winoground (group (text/image)) |
- |:-----------|--------:|---------------------:|---------------------:|-----------------------:|----------------------:|-------------------:|---------------:|-----------------:|-----------------:|------------------------:|-----------------:|-------------------------:|-----------------------:|--------------------------:|----------------------------------:|
- | IDEFIX 80B | 0 | 60.0 | 45.2 | 30.9 | 36.0 | 56.8 | 91.8 | 65.0 | 53.7 | 74.3 | 48.8 | 60.6 | 68.9 | 60.5 | 8.0 (18.8/22.5)|
- | | 4 | 63.4 | 52.3 | 34.7 | 45.8 | 77.9 | 109.3 | 101.1 | 68.9 | - | 48.6 | 58.7 | 66.3 | 63.9 | - |
- | | 8 | 64.5 | 55.2 | 35.4 | 49.3 | 82.5 | 113.9 | 104.7 | 74.3 | - | 48.1 | 57.8 | - | 64.3 | - |
- | | 16 | 65.4 | 56.8 | 36.3 | 51.5 | 85.2 | 116.6 | 105.6 | 76.8 | - | - | 56.0 | - | 66.9 | - |
- | | 32 | 66.0 | 58.0 | 37.0 | 52.6 | 86.1 | 116.5 | 106.3 | 78.9 | - | - | 54.3 | - | 68.0 | - |
+ | Model | Shots | VQAv2 (OE VQA acc) | OKVQA (OE VQA acc) | TextVQA (OE VQA acc) | VizWiz (OE VQA acc) | TextCaps (CIDEr) | Coco (CIDEr) | NoCaps (CIDEr) | Flickr (CIDEr) | VisDial (NDCG) | HatefulMemes (ROC AUC) | ScienceQA (accuracy) | RenderedSST2 (accuracy) | Winoground (group (text/image)) |
+ |:-----------|--------:|---------------------:|---------------------:|-----------------------:|----------------------:|-------------------:|---------------:|-----------------:|-----------------:|-----------------:|-------------------------:|-----------------------:|--------------------------:|----------------------------------:|
+ | IDEFIX 80B | 0 | 60.0 | 45.2 | 30.9 | 36.0 | 56.8 | 91.8 | 65.0 | 53.7 | 48.8 | 60.6 | 68.9 | 60.5 | 8.0 (18.8/22.5) |
+ | | 4 | 63.6 | 52.4 | 34.4 | 40.4 | 72.7 | 110.3 | 99.6 | 73.7 | 48.4 | 57.8 | 58.9 | 66.6 | - |
+ | | 8 | 64.8 | 55.1 | 35.7 | 46.1 | 77.6 | 114.3 | 105.7 | 76.6 | 47.9 | 58.2 | - | 67.8 | - |
+ | | 16 | 65.4 | 56.8 | 36.3 | 48.3 | 81.4 | 116.6 | 107.0 | 80.1 | - | 55.8 | - | 67.7 | - |
+ | | 32 | 65.9 | 57.8 | 36.7 | 50.0 | 82.7 | 116.6 | 107.5 | 81.1 | - | 52.5 | - | 67.3 | - |
  <br>
- | IDEFIX 9B | 0 | 50.9 | 38.4 | 25.9 | 35.5 | 25.4 | 46.0 | 36.8 | 27.3 | 70.7 | 48.7 | 51.7 | 44.2 | 61.8 | 5.0 (16.8/20.8)|
- | | 4 | 55.6 | 45.8 | 26.8 | 42.0 | 60.8 | 88.9 | 78.4 | 52.2 | - | 48.1 | 52.6 | 41.6 | 60.6 | - |
- | | 8 | 56.4 | 47.3 | 26.8 | 42.8 | 63.7 | 96.9 | 84.3 | 60.3 | - | 47.5 | 52.3 | - | 66.8 | - |
- | | 16 | 57.2 | 49.0 | 28.1 | 45.0 | 68.0 | 99.6 | 87.2 | 65.0 | - | - | 52.5 | - | 66.0 | - |
- | | 32 | 57.9 | 50.4 | 28.2 | 45.9 | 69.7 | 101.5 | 88.6 | 66.0 | - | - | 53.1 | - | 63.4 | - |
+ | IDEFIX 9B | 0 | 50.9 | 38.4 | 25.9 | 35.5 | 25.4 | 46.0 | 36.8 | 27.3 | 48.7 | 51.7 | 44.2 | 61.8 | 5.0 (16.8/20.8) |
+ | | 4 | 55.4 | 45.5 | 27.6 | 36.9 | 60.0 | 93.0 | 81.3 | 59.7 | 47.9 | 50.7 | 37.4 | 62.3 | - |
+ | | 8 | 56.4 | 47.7 | 27.5 | 40.4 | 63.2 | 97.0 | 86.8 | 61.9 | 47.6 | 51.0 | - | 66.3 | - |
+ | | 16 | 57.0 | 48.4 | 27.9 | 42.6 | 67.4 | 99.7 | 89.4 | 64.5 | - | 50.9 | - | 67.8 | - |
+ | | 32 | 57.9 | 49.6 | 28.3 | 43.7 | 68.1 | 98.0 | 90.5 | 64.4 | - | 49.8 | - | 67.0 | - |
+
+ Imagenet Evaluation:
+ | Model | Shots | Imagenet |
+ |:-----------|--------:|-----------:|
+ | IDEFIX 80B | 16, 1k support set | 65.4 |
+ | | 16, RICES 5k support set | 72.9 |
+ <br>
+ | IDEFIX 9B | 16, 1k support set | 53.5 |
+ | | 16, RICES 5k support set | 64.5 |
+
+ Fairness Evaluations:
+ | Model | Shots | FairFaceGender (accuracy) | FairFaceRace (accuracy) | FairFaceAge (accuracy) |
+ |:-----------|--------:|----------------------------:|--------------------------:|-------------------------:|
+ | IDEFIX 80B | 0 | 95.8 | 64.1 | 51.0 |
+ | | 4 | 95.2 | 48.8 | 50.6 |
+ | | 8 | 95.5 | 52.3 | 53.1 |
+ | | 16 | 95.7 | 47.6 | 52.8 |
+ | | 32 | 95.7 | 36.5 | 51.2 |
+ <br>
+ | IDEFIX 9B | 0 | 94.4 | 55.3 | 45.1 |
+ | | 4 | 93.9 | 35.3 | 44.3 |
+ | | 8 | 95.4 | 44.7 | 46.0 |
+ | | 16 | 95.8 | 43.0 | 46.1 |
+ | | 32 | 96.1 | 35.1 | 44.9 |
 
  We also report results where the priming samples are selected to be similar (i.e. close in a vector space) to the queried instance.
 
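
The final context line of the hunk mentions choosing priming samples that are close to the queried instance in a vector space. The README does not say which embedding model or similarity measure is used, so the following is only a minimal sketch of the general idea, assuming cosine similarity over precomputed embeddings. The names `cosine_similarity`, `select_similar_shots`, `support_set`, and `query_embedding` are hypothetical and do not come from the repository.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_similar_shots(
    query: Sequence[float], support: Sequence[Sequence[float]], k: int
) -> List[int]:
    """Return the indices of the k support embeddings closest to the query.

    Assumption: "close in a vector space" is modelled as cosine similarity.
    Results are ordered from most to least similar; ties fall back to index order.
    """
    scored = [(cosine_similarity(query, emb), idx) for idx, emb in enumerate(support)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy 2-D embeddings standing in for real image features (hypothetical values).
support_set = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
query_embedding = [1.0, 0.05]

# Pick the 2 support examples to use as in-context shots for this query.
shots = select_similar_shots(query_embedding, support_set, k=2)
```

In a real evaluation the selected indices would be used to build the few-shot prompt for the query; how that prompt is assembled is outside what this diff shows.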