Muennighoff committed on
Commit
68331cd
1 Parent(s): 538fcc1

Add evaluation (#8)

- Add evaluation (ee589c92eb5cf17c617fb3145214de77e2de8c23)
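
The diff view further down shows the added `model-index` front matter with its indentation flattened. As a rough sketch of the nesting those entries appear to follow (the `model-index` layout used for Hugging Face model card metadata), the helpers below rebuild one block from (dataset, metric, value) rows. The function names `model_index_entry` and `model_index` and the plain-string approach are illustrative assumptions, not part of this commit; the two sample rows reuse values that appear in the diff.

```python
def model_index_entry(dataset, metric, value,
                      task_type="text-generation",
                      task_name="text generation",
                      verified=False):
    """Return the YAML text for one item of `results:` (hypothetical helper).

    The dataset and metric strings are used for both `name` and `type`,
    mirroring the entries shown in the diff.
    """
    return "\n".join([
        "  - task:",
        f"      type: {task_type}",
        f"      name: {task_name}",
        "    dataset:",
        f"      name: {dataset}",
        f"      type: {dataset}",
        "    metrics:",
        f"    - name: {metric}",
        f"      type: {metric}",
        f"      value: {value}",
        # YAML booleans are lowercase, unlike Python's True/False.
        f"      verified: {str(verified).lower()}",
    ])


def model_index(model_name, rows):
    """Build a whole `model-index` block from (dataset, metric, value) rows."""
    header = ["model-index:", f"- name: {model_name}", "  results:"]
    body = [model_index_entry(d, m, v) for d, m, v in rows]
    return "\n".join(header + body)


if __name__ == "__main__":
    sample_rows = [
        ("arc_challenge", "acc", 0.27986348122866894),
        ("gsarti/flores_101_afr", "byte_perplexity", 6.500798737976343),
    ]
    print(model_index("bloom", sample_rows))
```

Building the text by hand keeps the sketch free of third-party dependencies; in practice a YAML library would be the safer way to emit or parse this metadata.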

Files changed (1)
  1. README.md +1690 -4
README.md CHANGED
@@ -50,6 +50,1546 @@ language:
50
  - zht
51
  - zu
52
  pipeline_tag: text-generation
53
  ---
54
 
55
  <h1 style='text-align: center '>BLOOM LM</h1>
@@ -453,7 +1993,7 @@ Includes:
453
  And multiple different metrics for specific tasks. _(More evaluation metrics forthcoming upon completion of evaluation protocol.)_
454
 
455
  ### Factors
456
- *This section lists some different aspects of what BLOOM models. Its focus is on those aspects that are likely to give rise to high variance in model behavior.*
457
 
458
  - Language, such as English or Yoruba
459
 
@@ -464,6 +2004,154 @@ And multiple different metrics for specific tasks. _(More evaluation metrics for
464
  ### Results
465
  *Results are based on the [Factors](#factors) and [Metrics](#metrics).*
466
 
467
  **Train-time Evaluation:**
468
 
469
  As of 25.May.2022, 15:00 PST:
@@ -474,8 +2162,6 @@ As of 25.May.2022, 15:00 PST:
474
 
475
  - Perplexity: 8.9
476
 
477
- (More evaluation scores forthcoming at the end of model training.)
478
-
479
  </details>
480
  <p>&nbsp;</p>
481
 
@@ -561,5 +2247,5 @@ Initial prompting experiments using interim checkpoints: https://huggingface.co/
561
  ## Model Card Authors
562
  *Ordered roughly chronologically and by amount of time spent.*
563
 
564
- Margaret Mitchell, Giada Pistilli, Yacine Jernite, Ezinwanne Ozoani, Marissa Gerchick, Nazneen Rajani, Sasha Luccioni, Irene Solaiman, Maraim Masoud, Somaieh Nikpoor, Carlos Muñoz Ferrandis, Stas Bekman, Christopher Akiki, Danish Contractor, David Lansky, Angelina McMillan-Major, Tristan Thrush, Suzana Ilić, Gérard Dupont, Shayne Longpre, Manan Dey, Stella Biderman, Douwe Kiela, Emi Baylor, Teven Le Scao, Aaron Gokaslan, Julien Launay
565
 
 
50
  - zht
51
  - zu
52
  pipeline_tag: text-generation
53
+ model-index:
54
+ - name: bloom
55
+ results:
56
+ - task:
57
+ type: text-generation
58
+ name: text generation
59
+ dataset:
60
+ name: arc_challenge
61
+ type: arc_challenge
62
+ metrics:
63
+ - name: acc
64
+ type: acc
65
+ value: 0.27986348122866894
66
+ verified: false
67
+ - task:
68
+ type: text-generation
69
+ name: text generation
70
+ dataset:
71
+ name: arc_easy
72
+ type: arc_easy
73
+ metrics:
74
+ - name: acc
75
+ type: acc
76
+ value: 0.5946969696969697
77
+ verified: false
78
+ - task:
79
+ type: text-generation
80
+ name: text generation
81
+ dataset:
82
+ name: axb
83
+ type: axb
84
+ metrics:
85
+ - name: acc
86
+ type: acc
87
+ value: 0.4433876811594203
88
+ verified: false
89
+ - task:
90
+ type: text-generation
91
+ name: text generation
92
+ dataset:
93
+ name: axg
94
+ type: axg
95
+ metrics:
96
+ - name: acc
97
+ type: acc
98
+ value: 0.5
99
+ verified: false
100
+ - task:
101
+ type: text-generation
102
+ name: text generation
103
+ dataset:
104
+ name: boolq
105
+ type: boolq
106
+ metrics:
107
+ - name: acc
108
+ type: acc
109
+ value: 0.6165137614678899
110
+ verified: false
111
+ - task:
112
+ type: text-generation
113
+ name: text generation
114
+ dataset:
115
+ name: cb
116
+ type: cb
117
+ metrics:
118
+ - name: acc
119
+ type: acc
120
+ value: 0.30357142857142855
121
+ verified: false
122
+ - task:
123
+ type: text-generation
124
+ name: text generation
125
+ dataset:
126
+ name: cola
127
+ type: cola
128
+ metrics:
129
+ - name: acc
130
+ type: acc
131
+ value: 0.610738255033557
132
+ verified: false
133
+ - task:
134
+ type: text-generation
135
+ name: text generation
136
+ dataset:
137
+ name: copa
138
+ type: copa
139
+ metrics:
140
+ - name: acc
141
+ type: acc
142
+ value: 0.63
143
+ verified: false
144
+ - task:
145
+ type: text-generation
146
+ name: text generation
147
+ dataset:
148
+ name: crows_pairs_english
149
+ type: crows_pairs_english
150
+ metrics:
151
+ - name: acc
152
+ type: acc
153
+ value: 0.4973166368515206
154
+ verified: false
155
+ - task:
156
+ type: text-generation
157
+ name: text generation
158
+ dataset:
159
+ name: crows_pairs_french
160
+ type: crows_pairs_french
161
+ metrics:
162
+ - name: acc
163
+ type: acc
164
+ value: 0.5032796660703638
165
+ verified: false
166
+ - task:
167
+ type: text-generation
168
+ name: text generation
169
+ dataset:
170
+ name: diabla
171
+ type: diabla
172
+ metrics:
173
+ - name: acc
174
+ type: acc
175
+ value: 0.28888308977035493
176
+ verified: false
177
+ - task:
178
+ type: text-generation
179
+ name: text generation
180
+ dataset:
181
+ name: gsarti/flores_101_afr
182
+ type: gsarti/flores_101_afr
183
+ metrics:
184
+ - name: byte_perplexity
185
+ type: byte_perplexity
186
+ value: 6.500798737976343
187
+ verified: false
188
+ - task:
189
+ type: text-generation
190
+ name: text generation
191
+ dataset:
192
+ name: gsarti/flores_101_amh
193
+ type: gsarti/flores_101_amh
194
+ metrics:
195
+ - name: byte_perplexity
196
+ type: byte_perplexity
197
+ value: 3.9726863338897145
198
+ verified: false
199
+ - task:
200
+ type: text-generation
201
+ name: text generation
202
+ dataset:
203
+ name: gsarti/flores_101_ara
204
+ type: gsarti/flores_101_ara
205
+ metrics:
206
+ - name: byte_perplexity
207
+ type: byte_perplexity
208
+ value: 1.8083841089875814
209
+ verified: false
210
+ - task:
211
+ type: text-generation
212
+ name: text generation
213
+ dataset:
214
+ name: gsarti/flores_101_asm
215
+ type: gsarti/flores_101_asm
216
+ metrics:
217
+ - name: byte_perplexity
218
+ type: byte_perplexity
219
+ value: 5.699102962086425
220
+ verified: false
221
+ - task:
222
+ type: text-generation
223
+ name: text generation
224
+ dataset:
225
+ name: gsarti/flores_101_ast
226
+ type: gsarti/flores_101_ast
227
+ metrics:
228
+ - name: byte_perplexity
229
+ type: byte_perplexity
230
+ value: 3.9252047073429384
231
+ verified: false
232
+ - task:
233
+ type: text-generation
234
+ name: text generation
235
+ dataset:
236
+ name: gsarti/flores_101_azj
237
+ type: gsarti/flores_101_azj
238
+ metrics:
239
+ - name: byte_perplexity
240
+ type: byte_perplexity
241
+ value: 6.942805054270002
242
+ verified: false
243
+ - task:
244
+ type: text-generation
245
+ name: text generation
246
+ dataset:
247
+ name: gsarti/flores_101_bel
248
+ type: gsarti/flores_101_bel
249
+ metrics:
250
+ - name: byte_perplexity
251
+ type: byte_perplexity
252
+ value: 3.614136245847082
253
+ verified: false
254
+ - task:
255
+ type: text-generation
256
+ name: text generation
257
+ dataset:
258
+ name: gsarti/flores_101_ben
259
+ type: gsarti/flores_101_ben
260
+ metrics:
261
+ - name: byte_perplexity
262
+ type: byte_perplexity
263
+ value: 5.121491534300969
264
+ verified: false
265
+ - task:
266
+ type: text-generation
267
+ name: text generation
268
+ dataset:
269
+ name: gsarti/flores_101_bos
270
+ type: gsarti/flores_101_bos
271
+ metrics:
272
+ - name: byte_perplexity
273
+ type: byte_perplexity
274
+ value: 5.653353469118798
275
+ verified: false
276
+ - task:
277
+ type: text-generation
278
+ name: text generation
279
+ dataset:
280
+ name: gsarti/flores_101_bul
281
+ type: gsarti/flores_101_bul
282
+ metrics:
283
+ - name: byte_perplexity
284
+ type: byte_perplexity
285
+ value: 2.7014693938055068
286
+ verified: false
287
+ - task:
288
+ type: text-generation
289
+ name: text generation
290
+ dataset:
291
+ name: gsarti/flores_101_cat
292
+ type: gsarti/flores_101_cat
293
+ metrics:
294
+ - name: byte_perplexity
295
+ type: byte_perplexity
296
+ value: 2.305190041967345
297
+ verified: false
298
+ - task:
299
+ type: text-generation
300
+ name: text generation
301
+ dataset:
302
+ name: gsarti/flores_101_ceb
303
+ type: gsarti/flores_101_ceb
304
+ metrics:
305
+ - name: byte_perplexity
306
+ type: byte_perplexity
307
+ value: 6.291000321323428
308
+ verified: false
309
+ - task:
310
+ type: text-generation
311
+ name: text generation
312
+ dataset:
313
+ name: gsarti/flores_101_ces
314
+ type: gsarti/flores_101_ces
315
+ metrics:
316
+ - name: byte_perplexity
317
+ type: byte_perplexity
318
+ value: 5.447322753586386
319
+ verified: false
320
+ - task:
321
+ type: text-generation
322
+ name: text generation
323
+ dataset:
324
+ name: gsarti/flores_101_ckb
325
+ type: gsarti/flores_101_ckb
326
+ metrics:
327
+ - name: byte_perplexity
328
+ type: byte_perplexity
329
+ value: 3.7255124939234765
330
+ verified: false
331
+ - task:
332
+ type: text-generation
333
+ name: text generation
334
+ dataset:
335
+ name: gsarti/flores_101_cym
336
+ type: gsarti/flores_101_cym
337
+ metrics:
338
+ - name: byte_perplexity
339
+ type: byte_perplexity
340
+ value: 12.539424151448149
341
+ verified: false
342
+ - task:
343
+ type: text-generation
344
+ name: text generation
345
+ dataset:
346
+ name: gsarti/flores_101_dan
347
+ type: gsarti/flores_101_dan
348
+ metrics:
349
+ - name: byte_perplexity
350
+ type: byte_perplexity
351
+ value: 5.183309001005672
352
+ verified: false
353
+ - task:
354
+ type: text-generation
355
+ name: text generation
356
+ dataset:
357
+ name: gsarti/flores_101_deu
358
+ type: gsarti/flores_101_deu
359
+ metrics:
360
+ - name: byte_perplexity
361
+ type: byte_perplexity
362
+ value: 3.1180422286591347
363
+ verified: false
364
+ - task:
365
+ type: text-generation
366
+ name: text generation
367
+ dataset:
368
+ name: gsarti/flores_101_ell
369
+ type: gsarti/flores_101_ell
370
+ metrics:
371
+ - name: byte_perplexity
372
+ type: byte_perplexity
373
+ value: 2.467943456164706
374
+ verified: false
375
+ - task:
376
+ type: text-generation
377
+ name: text generation
378
+ dataset:
379
+ name: gsarti/flores_101_eng
380
+ type: gsarti/flores_101_eng
381
+ metrics:
382
+ - name: byte_perplexity
383
+ type: byte_perplexity
384
+ value: 2.018740628193298
385
+ verified: false
386
+ - task:
387
+ type: text-generation
388
+ name: text generation
389
+ dataset:
390
+ name: gsarti/flores_101_est
391
+ type: gsarti/flores_101_est
392
+ metrics:
393
+ - name: byte_perplexity
394
+ type: byte_perplexity
395
+ value: 9.11654425176368
396
+ verified: false
397
+ - task:
398
+ type: text-generation
399
+ name: text generation
400
+ dataset:
401
+ name: gsarti/flores_101_fas
402
+ type: gsarti/flores_101_fas
403
+ metrics:
404
+ - name: byte_perplexity
405
+ type: byte_perplexity
406
+ value: 3.058009097116482
407
+ verified: false
408
+ - task:
409
+ type: text-generation
410
+ name: text generation
411
+ dataset:
412
+ name: gsarti/flores_101_fin
413
+ type: gsarti/flores_101_fin
414
+ metrics:
415
+ - name: byte_perplexity
416
+ type: byte_perplexity
417
+ value: 6.847047959628553
418
+ verified: false
419
+ - task:
420
+ type: text-generation
421
+ name: text generation
422
+ dataset:
423
+ name: gsarti/flores_101_fra
424
+ type: gsarti/flores_101_fra
425
+ metrics:
426
+ - name: byte_perplexity
427
+ type: byte_perplexity
428
+ value: 1.9975177011840075
429
+ verified: false
430
+ - task:
431
+ type: text-generation
432
+ name: text generation
433
+ dataset:
434
+ name: gsarti/flores_101_ful
435
+ type: gsarti/flores_101_ful
436
+ metrics:
437
+ - name: byte_perplexity
438
+ type: byte_perplexity
439
+ value: 11.465912731488828
440
+ verified: false
441
+ - task:
442
+ type: text-generation
443
+ name: text generation
444
+ dataset:
445
+ name: gsarti/flores_101_gle
446
+ type: gsarti/flores_101_gle
447
+ metrics:
448
+ - name: byte_perplexity
449
+ type: byte_perplexity
450
+ value: 8.681491663539422
451
+ verified: false
452
+ - task:
453
+ type: text-generation
454
+ name: text generation
455
+ dataset:
456
+ name: gsarti/flores_101_glg
457
+ type: gsarti/flores_101_glg
458
+ metrics:
459
+ - name: byte_perplexity
460
+ type: byte_perplexity
461
+ value: 3.029991089015508
462
+ verified: false
463
+ - task:
464
+ type: text-generation
465
+ name: text generation
466
+ dataset:
467
+ name: gsarti/flores_101_guj
468
+ type: gsarti/flores_101_guj
469
+ metrics:
470
+ - name: byte_perplexity
471
+ type: byte_perplexity
472
+ value: 4.955224230286231
473
+ verified: false
474
+ - task:
475
+ type: text-generation
476
+ name: text generation
477
+ dataset:
478
+ name: gsarti/flores_101_hau
479
+ type: gsarti/flores_101_hau
480
+ metrics:
481
+ - name: byte_perplexity
482
+ type: byte_perplexity
483
+ value: 10.758347356372159
484
+ verified: false
485
+ - task:
486
+ type: text-generation
487
+ name: text generation
488
+ dataset:
489
+ name: gsarti/flores_101_heb
490
+ type: gsarti/flores_101_heb
491
+ metrics:
492
+ - name: byte_perplexity
493
+ type: byte_perplexity
494
+ value: 3.6004478129801667
495
+ verified: false
496
+ - task:
497
+ type: text-generation
498
+ name: text generation
499
+ dataset:
500
+ name: gsarti/flores_101_hin
501
+ type: gsarti/flores_101_hin
502
+ metrics:
503
+ - name: byte_perplexity
504
+ type: byte_perplexity
505
+ value: 4.712530650588064
506
+ verified: false
507
+ - task:
508
+ type: text-generation
509
+ name: text generation
510
+ dataset:
511
+ name: gsarti/flores_101_hrv
512
+ type: gsarti/flores_101_hrv
513
+ metrics:
514
+ - name: byte_perplexity
515
+ type: byte_perplexity
516
+ value: 5.822418943372185
517
+ verified: false
518
+ - task:
519
+ type: text-generation
520
+ name: text generation
521
+ dataset:
522
+ name: gsarti/flores_101_hun
523
+ type: gsarti/flores_101_hun
524
+ metrics:
525
+ - name: byte_perplexity
526
+ type: byte_perplexity
527
+ value: 6.440482646965992
528
+ verified: false
529
+ - task:
530
+ type: text-generation
531
+ name: text generation
532
+ dataset:
533
+ name: gsarti/flores_101_hye
534
+ type: gsarti/flores_101_hye
535
+ metrics:
536
+ - name: byte_perplexity
537
+ type: byte_perplexity
538
+ value: 3.657718918347166
539
+ verified: false
540
+ - task:
541
+ type: text-generation
542
+ name: text generation
543
+ dataset:
544
+ name: gsarti/flores_101_ibo
545
+ type: gsarti/flores_101_ibo
546
+ metrics:
547
+ - name: byte_perplexity
548
+ type: byte_perplexity
549
+ value: 5.564814003872672
550
+ verified: false
551
+ - task:
552
+ type: text-generation
553
+ name: text generation
554
+ dataset:
555
+ name: gsarti/flores_101_ind
556
+ type: gsarti/flores_101_ind
557
+ metrics:
558
+ - name: byte_perplexity
559
+ type: byte_perplexity
560
+ value: 2.1597101468869373
561
+ verified: false
562
+ - task:
563
+ type: text-generation
564
+ name: text generation
565
+ dataset:
566
+ name: gsarti/flores_101_isl
567
+ type: gsarti/flores_101_isl
568
+ metrics:
569
+ - name: byte_perplexity
570
+ type: byte_perplexity
571
+ value: 8.082349269518136
572
+ verified: false
573
+ - task:
574
+ type: text-generation
575
+ name: text generation
576
+ dataset:
577
+ name: gsarti/flores_101_ita
578
+ type: gsarti/flores_101_ita
579
+ metrics:
580
+ - name: byte_perplexity
581
+ type: byte_perplexity
582
+ value: 2.9687591414176207
583
+ verified: false
584
+ - task:
585
+ type: text-generation
586
+ name: text generation
587
+ dataset:
588
+ name: gsarti/flores_101_jav
589
+ type: gsarti/flores_101_jav
590
+ metrics:
591
+ - name: byte_perplexity
592
+ type: byte_perplexity
593
+ value: 7.0573805415708994
594
+ verified: false
595
+ - task:
596
+ type: text-generation
597
+ name: text generation
598
+ dataset:
599
+ name: gsarti/flores_101_jpn
600
+ type: gsarti/flores_101_jpn
601
+ metrics:
602
+ - name: byte_perplexity
603
+ type: byte_perplexity
604
+ value: 2.7758864197116933
605
+ verified: false
606
+ - task:
607
+ type: text-generation
608
+ name: text generation
609
+ dataset:
610
+ name: gsarti/flores_101_kam
611
+ type: gsarti/flores_101_kam
612
+ metrics:
613
+ - name: byte_perplexity
614
+ type: byte_perplexity
615
+ value: 11.072949642861332
616
+ verified: false
617
+ - task:
618
+ type: text-generation
619
+ name: text generation
620
+ dataset:
621
+ name: gsarti/flores_101_kan
622
+ type: gsarti/flores_101_kan
623
+ metrics:
624
+ - name: byte_perplexity
625
+ type: byte_perplexity
626
+ value: 5.551730651007082
627
+ verified: false
628
+ - task:
629
+ type: text-generation
630
+ name: text generation
631
+ dataset:
632
+ name: gsarti/flores_101_kat
633
+ type: gsarti/flores_101_kat
634
+ metrics:
635
+ - name: byte_perplexity
636
+ type: byte_perplexity
637
+ value: 2.522630524283745
638
+ verified: false
639
+ - task:
640
+ type: text-generation
641
+ name: text generation
642
+ dataset:
643
+ name: gsarti/flores_101_kaz
644
+ type: gsarti/flores_101_kaz
645
+ metrics:
646
+ - name: byte_perplexity
647
+ type: byte_perplexity
648
+ value: 3.3901748516975574
649
+ verified: false
650
+ - task:
651
+ type: text-generation
652
+ name: text generation
653
+ dataset:
654
+ name: gsarti/flores_101_kea
655
+ type: gsarti/flores_101_kea
656
+ metrics:
657
+ - name: byte_perplexity
658
+ type: byte_perplexity
659
+ value: 8.918534182590863
660
+ verified: false
661
+ - task:
662
+ type: text-generation
663
+ name: text generation
664
+ dataset:
665
+ name: gsarti/flores_101_kir
666
+ type: gsarti/flores_101_kir
667
+ metrics:
668
+ - name: byte_perplexity
669
+ type: byte_perplexity
670
+ value: 3.729278369847201
671
+ verified: false
672
+ - task:
673
+ type: text-generation
674
+ name: text generation
675
+ dataset:
676
+ name: gsarti/flores_101_kor
677
+ type: gsarti/flores_101_kor
678
+ metrics:
679
+ - name: byte_perplexity
680
+ type: byte_perplexity
681
+ value: 3.932884847226212
682
+ verified: false
683
+ - task:
684
+ type: text-generation
685
+ name: text generation
686
+ dataset:
687
+ name: gsarti/flores_101_lao
688
+ type: gsarti/flores_101_lao
689
+ metrics:
690
+ - name: byte_perplexity
691
+ type: byte_perplexity
692
+ value: 2.9077314760849924
693
+ verified: false
694
+ - task:
695
+ type: text-generation
696
+ name: text generation
697
+ dataset:
698
+ name: gsarti/flores_101_lav
699
+ type: gsarti/flores_101_lav
700
+ metrics:
701
+ - name: byte_perplexity
702
+ type: byte_perplexity
703
+ value: 7.777221919194806
704
+ verified: false
705
+ - task:
706
+ type: text-generation
707
+ name: text generation
708
+ dataset:
709
+ name: gsarti/flores_101_lin
710
+ type: gsarti/flores_101_lin
711
+ metrics:
712
+ - name: byte_perplexity
713
+ type: byte_perplexity
714
+ value: 7.524842908050988
715
+ verified: false
716
+ - task:
717
+ type: text-generation
718
+ name: text generation
719
+ dataset:
720
+ name: gsarti/flores_101_lit
721
+ type: gsarti/flores_101_lit
722
+ metrics:
723
+ - name: byte_perplexity
724
+ type: byte_perplexity
725
+ value: 7.369179434621725
726
+ verified: false
727
+ - task:
728
+ type: text-generation
729
+ name: text generation
730
+ dataset:
731
+ name: gsarti/flores_101_ltz
732
+ type: gsarti/flores_101_ltz
733
+ metrics:
734
+ - name: byte_perplexity
735
+ type: byte_perplexity
736
+ value: 8.801059747949214
737
+ verified: false
738
+ - task:
739
+ type: text-generation
740
+ name: text generation
741
+ dataset:
742
+ name: gsarti/flores_101_lug
743
+ type: gsarti/flores_101_lug
744
+ metrics:
745
+ - name: byte_perplexity
746
+ type: byte_perplexity
747
+ value: 8.483203026364786
748
+ verified: false
749
+ - task:
750
+ type: text-generation
751
+ name: text generation
752
+ dataset:
753
+ name: gsarti/flores_101_luo
754
+ type: gsarti/flores_101_luo
755
+ metrics:
756
+ - name: byte_perplexity
757
+ type: byte_perplexity
758
+ value: 11.975963093623681
759
+ verified: false
760
+ - task:
761
+ type: text-generation
762
+ name: text generation
763
+ dataset:
764
+ name: gsarti/flores_101_mal
765
+ type: gsarti/flores_101_mal
766
+ metrics:
767
+ - name: byte_perplexity
768
+ type: byte_perplexity
769
+ value: 4.615948455160037
770
+ verified: false
771
+ - task:
772
+ type: text-generation
773
+ name: text generation
774
+ dataset:
775
+ name: gsarti/flores_101_mar
776
+ type: gsarti/flores_101_mar
777
+ metrics:
778
+ - name: byte_perplexity
779
+ type: byte_perplexity
780
+ value: 5.483253482821379
781
+ verified: false
782
+ - task:
783
+ type: text-generation
784
+ name: text generation
785
+ dataset:
786
+ name: gsarti/flores_101_mkd
787
+ type: gsarti/flores_101_mkd
788
+ metrics:
789
+ - name: byte_perplexity
790
+ type: byte_perplexity
791
+ value: 2.9656732291754087
792
+ verified: false
793
+ - task:
794
+ type: text-generation
795
+ name: text generation
796
+ dataset:
797
+ name: gsarti/flores_101_mlt
798
+ type: gsarti/flores_101_mlt
799
+ metrics:
800
+ - name: byte_perplexity
801
+ type: byte_perplexity
802
+ value: 15.004773437665275
803
+ verified: false
804
+ - task:
805
+ type: text-generation
806
+ name: text generation
807
+ dataset:
808
+ name: gsarti/flores_101_mon
809
+ type: gsarti/flores_101_mon
810
+ metrics:
811
+ - name: byte_perplexity
812
+ type: byte_perplexity
813
+ value: 3.410598542315402
814
+ verified: false
815
+ - task:
816
+ type: text-generation
817
+ name: text generation
818
+ dataset:
819
+ name: gsarti/flores_101_mri
820
+ type: gsarti/flores_101_mri
821
+ metrics:
822
+ - name: byte_perplexity
823
+ type: byte_perplexity
824
+ value: 7.474035895661322
825
+ verified: false
826
+ - task:
827
+ type: text-generation
828
+ name: text generation
829
+ dataset:
830
+ name: gsarti/flores_101_msa
831
+ type: gsarti/flores_101_msa
832
+ metrics:
833
+ - name: byte_perplexity
834
+ type: byte_perplexity
835
+ value: 2.5710001772665634
836
+ verified: false
837
+ - task:
838
+ type: text-generation
839
+ name: text generation
840
+ dataset:
841
+ name: gsarti/flores_101_mya
842
+ type: gsarti/flores_101_mya
843
+ metrics:
844
+ - name: byte_perplexity
845
+ type: byte_perplexity
846
+ value: 2.413577969878331
847
+ verified: false
848
+ - task:
849
+ type: text-generation
850
+ name: text generation
851
+ dataset:
852
+ name: gsarti/flores_101_nld
853
+ type: gsarti/flores_101_nld
854
+ metrics:
855
+ - name: byte_perplexity
856
+ type: byte_perplexity
857
+ value: 4.127831721885065
858
+ verified: false
859
+ - task:
860
+ type: text-generation
861
+ name: text generation
862
+ dataset:
863
+ name: gsarti/flores_101_nob
864
+ type: gsarti/flores_101_nob
865
+ metrics:
866
+ - name: byte_perplexity
867
+ type: byte_perplexity
868
+ value: 5.402763169129877
869
+ verified: false
870
+ - task:
871
+ type: text-generation
872
+ name: text generation
873
+ dataset:
874
+ name: gsarti/flores_101_npi
875
+ type: gsarti/flores_101_npi
876
+ metrics:
877
+ - name: byte_perplexity
878
+ type: byte_perplexity
879
+ value: 5.199342701937889
880
+ verified: false
881
+ - task:
882
+ type: text-generation
883
+ name: text generation
884
+ dataset:
885
+ name: gsarti/flores_101_nso
886
+ type: gsarti/flores_101_nso
887
+ metrics:
888
+ - name: byte_perplexity
889
+ type: byte_perplexity
890
+ value: 8.154626800955667
891
+ verified: false
892
+ - task:
893
+ type: text-generation
894
+ name: text generation
895
+ dataset:
896
+ name: gsarti/flores_101_nya
897
+ type: gsarti/flores_101_nya
898
+ metrics:
899
+ - name: byte_perplexity
900
+ type: byte_perplexity
901
+ value: 8.179860208369393
902
+ verified: false
903
+ - task:
904
+ type: text-generation
905
+ name: text generation
906
+ dataset:
907
+ name: gsarti/flores_101_oci
908
+ type: gsarti/flores_101_oci
909
+ metrics:
910
+ - name: byte_perplexity
911
+ type: byte_perplexity
912
+ value: 4.8617357393685845
913
+ verified: false
914
+ - task:
915
+ type: text-generation
916
+ name: text generation
917
+ dataset:
918
+ name: gsarti/flores_101_orm
919
+ type: gsarti/flores_101_orm
920
+ metrics:
921
+ - name: byte_perplexity
922
+ type: byte_perplexity
923
+ value: 12.911595421079408
924
+ verified: false
925
+ - task:
926
+ type: text-generation
927
+ name: text generation
928
+ dataset:
929
+ name: gsarti/flores_101_ory
930
+ type: gsarti/flores_101_ory
931
+ metrics:
932
+ - name: byte_perplexity
933
+ type: byte_perplexity
934
+ value: 5.189421861225964
935
+ verified: false
936
+ - task:
937
+ type: text-generation
938
+ name: text generation
939
+ dataset:
940
+ name: gsarti/flores_101_pan
941
+ type: gsarti/flores_101_pan
942
+ metrics:
943
+ - name: byte_perplexity
944
+ type: byte_perplexity
945
+ value: 4.698477289331806
946
+ verified: false
947
+ - task:
948
+ type: text-generation
949
+ name: text generation
950
+ dataset:
951
+ name: gsarti/flores_101_pol
952
+ type: gsarti/flores_101_pol
953
+ metrics:
954
+ - name: byte_perplexity
955
+ type: byte_perplexity
956
+ value: 4.625550458479643
957
+ verified: false
958
+ - task:
959
+ type: text-generation
960
+ name: text generation
961
+ dataset:
962
+ name: gsarti/flores_101_por
963
+ type: gsarti/flores_101_por
964
+ metrics:
965
+ - name: byte_perplexity
966
+ type: byte_perplexity
967
+ value: 1.9754515986213523
968
+ verified: false
969
+ - task:
970
+ type: text-generation
971
+ name: text generation
972
+ dataset:
973
+ name: gsarti/flores_101_pus
974
+ type: gsarti/flores_101_pus
975
+ metrics:
976
+ - name: byte_perplexity
977
+ type: byte_perplexity
978
+ value: 4.4963371422771585
979
+ verified: false
980
+ - task:
981
+ type: text-generation
982
+ name: text generation
983
+ dataset:
984
+ name: gsarti/flores_101_ron
985
+ type: gsarti/flores_101_ron
986
+ metrics:
987
+ - name: byte_perplexity
988
+ type: byte_perplexity
989
+ value: 4.965456830031304
990
+ verified: false
991
+ - task:
992
+ type: text-generation
993
+ name: text generation
994
+ dataset:
995
+ name: gsarti/flores_101_rus
996
+ type: gsarti/flores_101_rus
997
+ metrics:
998
+ - name: byte_perplexity
999
+ type: byte_perplexity
1000
+ value: 2.0498020542445303
1001
+ verified: false
1002
+ - task:
1003
+ type: text-generation
1004
+ name: text generation
1005
+ dataset:
1006
+ name: gsarti/flores_101_slk
1007
+ type: gsarti/flores_101_slk
1008
+ metrics:
1009
+ - name: byte_perplexity
1010
+ type: byte_perplexity
1011
+ value: 6.450822127057479
1012
+ verified: false
1013
+ - task:
1014
+ type: text-generation
1015
+ name: text generation
1016
+ dataset:
1017
+ name: gsarti/flores_101_slv
1018
+ type: gsarti/flores_101_slv
1019
+ metrics:
1020
+ - name: byte_perplexity
1021
+ type: byte_perplexity
1022
+ value: 6.620252120186232
1023
+ verified: false
1024
+ - task:
1025
+ type: text-generation
1026
+ name: text generation
1027
+ dataset:
1028
+ name: gsarti/flores_101_sna
1029
+ type: gsarti/flores_101_sna
1030
+ metrics:
1031
+ - name: byte_perplexity
1032
+ type: byte_perplexity
1033
+ value: 8.462166771382726
1034
+ verified: false
1035
+ - task:
1036
+ type: text-generation
1037
+ name: text generation
1038
+ dataset:
1039
+ name: gsarti/flores_101_snd
1040
+ type: gsarti/flores_101_snd
1041
+ metrics:
1042
+ - name: byte_perplexity
1043
+ type: byte_perplexity
1044
+ value: 5.466066951221973
1045
+ verified: false
1046
+ - task:
1047
+ type: text-generation
1048
+ name: text generation
1049
+ dataset:
1050
+ name: gsarti/flores_101_som
1051
+ type: gsarti/flores_101_som
1052
+ metrics:
1053
+ - name: byte_perplexity
1054
+ type: byte_perplexity
1055
+ value: 11.95918054093392
1056
+ verified: false
1057
+ - task:
1058
+ type: text-generation
1059
+ name: text generation
1060
+ dataset:
1061
+ name: gsarti/flores_101_spa
1062
+ type: gsarti/flores_101_spa
1063
+ metrics:
1064
+ - name: byte_perplexity
1065
+ type: byte_perplexity
1066
+ value: 1.8965140104323535
1067
+ verified: false
1068
+ - task:
1069
+ type: text-generation
1070
+ name: text generation
1071
+ dataset:
1072
+ name: gsarti/flores_101_srp
1073
+ type: gsarti/flores_101_srp
1074
+ metrics:
1075
+ - name: byte_perplexity
1076
+ type: byte_perplexity
1077
+ value: 2.871214785885079
1078
+ verified: false
1079
+ - task:
1080
+ type: text-generation
1081
+ name: text generation
1082
+ dataset:
1083
+ name: gsarti/flores_101_swe
1084
+ type: gsarti/flores_101_swe
1085
+ metrics:
1086
+ - name: byte_perplexity
1087
+ type: byte_perplexity
1088
+ value: 5.054972008155866
1089
+ verified: false
1090
+ - task:
1091
+ type: text-generation
1092
+ name: text generation
1093
+ dataset:
1094
+ name: gsarti/flores_101_swh
1095
+ type: gsarti/flores_101_swh
1096
+ metrics:
1097
+ - name: byte_perplexity
1098
+ type: byte_perplexity
1099
+ value: 3.6973091886730676
1100
+ verified: false
1101
+ - task:
1102
+ type: text-generation
1103
+ name: text generation
1104
+ dataset:
1105
+ name: gsarti/flores_101_tam
1106
+ type: gsarti/flores_101_tam
1107
+ metrics:
1108
+ - name: byte_perplexity
1109
+ type: byte_perplexity
1110
+ value: 4.539493400469833
1111
+ verified: false
1112
+ - task:
1113
+ type: text-generation
1114
+ name: text generation
1115
+ dataset:
1116
+ name: gsarti/flores_101_tel
1117
+ type: gsarti/flores_101_tel
1118
+ metrics:
1119
+ - name: byte_perplexity
1120
+ type: byte_perplexity
1121
+ value: 5.807499987508966
1122
+ verified: false
1123
+ - task:
1124
+ type: text-generation
1125
+ name: text generation
1126
+ dataset:
1127
+ name: gsarti/flores_101_tgk
1128
+ type: gsarti/flores_101_tgk
1129
+ metrics:
1130
+ - name: byte_perplexity
1131
+ type: byte_perplexity
1132
+ value: 3.5994818827380426
1133
+ verified: false
1134
+ - task:
1135
+ type: text-generation
1136
+ name: text generation
1137
+ dataset:
1138
+ name: gsarti/flores_101_tgl
1139
+ type: gsarti/flores_101_tgl
1140
+ metrics:
1141
+ - name: byte_perplexity
1142
+ type: byte_perplexity
1143
+ value: 5.667053833119858
1144
+ verified: false
1145
+ - task:
1146
+ type: text-generation
1147
+ name: text generation
1148
+ dataset:
1149
+ name: gsarti/flores_101_tha
1150
+ type: gsarti/flores_101_tha
1151
+ metrics:
1152
+ - name: byte_perplexity
1153
+ type: byte_perplexity
1154
+ value: 2.365940201944242
1155
+ verified: false
1156
+ - task:
1157
+ type: text-generation
1158
+ name: text generation
1159
+ dataset:
1160
+ name: gsarti/flores_101_tur
1161
+ type: gsarti/flores_101_tur
1162
+ metrics:
1163
+ - name: byte_perplexity
1164
+ type: byte_perplexity
1165
+ value: 4.885014749844601
1166
+ verified: false
1167
+ - task:
1168
+ type: text-generation
1169
+ name: text generation
1170
+ dataset:
1171
+ name: gsarti/flores_101_ukr
1172
+ type: gsarti/flores_101_ukr
1173
+ metrics:
1174
+ - name: byte_perplexity
1175
+ type: byte_perplexity
1176
+ value: 2.7240934990288483
1177
+ verified: false
1178
+ - task:
1179
+ type: text-generation
1180
+ name: text generation
1181
+ dataset:
1182
+ name: gsarti/flores_101_umb
1183
+ type: gsarti/flores_101_umb
1184
+ metrics:
1185
+ - name: byte_perplexity
1186
+ type: byte_perplexity
1187
+ value: 12.766915508610673
1188
+ verified: false
1189
+ - task:
1190
+ type: text-generation
1191
+ name: text generation
1192
+ dataset:
1193
+ name: gsarti/flores_101_urd
1194
+ type: gsarti/flores_101_urd
1195
+ metrics:
1196
+ - name: byte_perplexity
1197
+ type: byte_perplexity
1198
+ value: 1.9797467071381232
1199
+ verified: false
1200
+ - task:
1201
+ type: text-generation
1202
+ name: text generation
1203
+ dataset:
1204
+ name: gsarti/flores_101_uzb
1205
+ type: gsarti/flores_101_uzb
1206
+ metrics:
1207
+ - name: byte_perplexity
1208
+ type: byte_perplexity
1209
+ value: 12.002337637722146
1210
+ verified: false
1211
+ - task:
1212
+ type: text-generation
1213
+ name: text generation
1214
+ dataset:
1215
+ name: gsarti/flores_101_vie
1216
+ type: gsarti/flores_101_vie
1217
+ metrics:
1218
+ - name: byte_perplexity
1219
+ type: byte_perplexity
1220
+ value: 1.76578415476397
1221
+ verified: false
1222
+ - task:
1223
+ type: text-generation
1224
+ name: text generation
1225
+ dataset:
1226
+ name: gsarti/flores_101_wol
1227
+ type: gsarti/flores_101_wol
1228
+ metrics:
1229
+ - name: byte_perplexity
1230
+ type: byte_perplexity
1231
+ value: 9.144285650306488
1232
+ verified: false
1233
+ - task:
1234
+ type: text-generation
1235
+ name: text generation
1236
+ dataset:
1237
+ name: gsarti/flores_101_xho
1238
+ type: gsarti/flores_101_xho
1239
+ metrics:
1240
+ - name: byte_perplexity
1241
+ type: byte_perplexity
1242
+ value: 7.403240538286952
1243
+ verified: false
1244
+ - task:
1245
+ type: text-generation
1246
+ name: text generation
1247
+ dataset:
1248
+ name: gsarti/flores_101_yor
1249
+ type: gsarti/flores_101_yor
1250
+ metrics:
1251
+ - name: byte_perplexity
1252
+ type: byte_perplexity
1253
+ value: 5.91272037551173
1254
+ verified: false
1255
+ - task:
1256
+ type: text-generation
1257
+ name: text generation
1258
+ dataset:
1259
+ name: gsarti/flores_101_zho_simpl
1260
+ type: gsarti/flores_101_zho_simpl
1261
+ metrics:
1262
+ - name: byte_perplexity
1263
+ type: byte_perplexity
1264
+ value: 2.2769070822768533
1265
+ verified: false
1266
+ - task:
1267
+ type: text-generation
1268
+ name: text generation
1269
+ dataset:
1270
+ name: gsarti/flores_101_zho_trad
1271
+ type: gsarti/flores_101_zho_trad
1272
+ metrics:
1273
+ - name: byte_perplexity
1274
+ type: byte_perplexity
1275
+ value: 2.5180582198242383
1276
+ verified: false
1277
+ - task:
1278
+ type: text-generation
1279
+ name: text generation
1280
+ dataset:
1281
+ name: gsarti/flores_101_zul
1282
+ type: gsarti/flores_101_zul
1283
+ metrics:
1284
+ - name: byte_perplexity
1285
+ type: byte_perplexity
1286
+ value: 8.53353320693145
1287
+ verified: false
1288
+ - task:
1289
+ type: text-generation
1290
+ name: text generation
1291
+ dataset:
1292
+ name: headqa
1293
+ type: headqa
1294
+ metrics:
1295
+ - name: acc
1296
+ type: acc
1297
+ value: 0.26440554339897887
1298
+ verified: false
1299
+ - task:
1300
+ type: text-generation
1301
+ name: text generation
1302
+ dataset:
1303
+ name: hellaswag
1304
+ type: hellaswag
1305
+ metrics:
1306
+ - name: acc
1307
+ type: acc
1308
+ value: 0.41236805417247563
1309
+ verified: false
1310
+ - task:
1311
+ type: text-generation
1312
+ name: text generation
1313
+ dataset:
1314
+ name: logiqa
1315
+ type: logiqa
1316
+ metrics:
1317
+ - name: acc
1318
+ type: acc
1319
+ value: 0.2073732718894009
1320
+ verified: false
1321
+ - task:
1322
+ type: text-generation
1323
+ name: text generation
1324
+ dataset:
1325
+ name: mathqa
1326
+ type: mathqa
1327
+ metrics:
1328
+ - name: acc
1329
+ type: acc
1330
+ value: 0.24958123953098826
1331
+ verified: false
1332
+ - task:
1333
+ type: text-generation
1334
+ name: text generation
1335
+ dataset:
1336
+ name: mc_taco
1337
+ type: mc_taco
1338
+ metrics:
1339
+ - name: em
1340
+ type: em
1341
+ value: 0.11936936936936937
1342
+ verified: false
1343
+ - task:
1344
+ type: text-generation
1345
+ name: text generation
1346
+ dataset:
1347
+ name: mnli
1348
+ type: mnli
1349
+ metrics:
1350
+ - name: acc
1351
+ type: acc
1352
+ value: 0.35496688741721855
1353
+ verified: false
1354
+ - task:
1355
+ type: text-generation
1356
+ name: text generation
1357
+ dataset:
1358
+ name: mnli_mismatched
1359
+ type: mnli_mismatched
1360
+ metrics:
1361
+ - name: acc
1362
+ type: acc
1363
+ value: 0.35211554109031734
1364
+ verified: false
1365
+ - task:
1366
+ type: text-generation
1367
+ name: text generation
1368
+ dataset:
1369
+ name: mrpc
1370
+ type: mrpc
1371
+ metrics:
1372
+ - name: acc
1373
+ type: acc
1374
+ value: 0.5857843137254902
1375
+ verified: false
1376
+ - task:
1377
+ type: text-generation
1378
+ name: text generation
1379
+ dataset:
1380
+ name: multirc
1381
+ type: multirc
1382
+ metrics:
1383
+ - name: acc
1384
+ type: acc
1385
+ value: 0.5375412541254125
1386
+ verified: false
1387
+ - task:
1388
+ type: text-generation
1389
+ name: text generation
1390
+ dataset:
1391
+ name: openbookqa
1392
+ type: openbookqa
1393
+ metrics:
1394
+ - name: acc
1395
+ type: acc
1396
+ value: 0.216
1397
+ verified: false
1398
+ - task:
1399
+ type: text-generation
1400
+ name: text generation
1401
+ dataset:
1402
+ name: piqa
1403
+ type: piqa
1404
+ metrics:
1405
+ - name: acc
1406
+ type: acc
1407
+ value: 0.7078346028291621
1408
+ verified: false
1409
+ - task:
1410
+ type: text-generation
1411
+ name: text generation
1412
+ dataset:
1413
+ name: prost
1414
+ type: prost
1415
+ metrics:
1416
+ - name: acc
1417
+ type: acc
1418
+ value: 0.22683603757472245
1419
+ verified: false
1420
+ - task:
1421
+ type: text-generation
1422
+ name: text generation
1423
+ dataset:
1424
+ name: pubmedqa
1425
+ type: pubmedqa
1426
+ metrics:
1427
+ - name: acc
1428
+ type: acc
1429
+ value: 0.616
1430
+ verified: false
1431
+ - task:
1432
+ type: text-generation
1433
+ name: text generation
1434
+ dataset:
1435
+ name: qnli
1436
+ type: qnli
1437
+ metrics:
1438
+ - name: acc
1439
+ type: acc
1440
+ value: 0.5072304594545122
1441
+ verified: false
1442
+ - task:
1443
+ type: text-generation
1444
+ name: text generation
1445
+ dataset:
1446
+ name: qqp
1447
+ type: qqp
1448
+ metrics:
1449
+ - name: acc
1450
+ type: acc
1451
+ value: 0.3842443729903537
1452
+ verified: false
1453
+ - task:
1454
+ type: text-generation
1455
+ name: text generation
1456
+ dataset:
1457
+ name: race
1458
+ type: race
1459
+ metrics:
1460
+ - name: acc
1461
+ type: acc
1462
+ value: 0.3521531100478469
1463
+ verified: false
1464
+ - task:
1465
+ type: text-generation
1466
+ name: text generation
1467
+ dataset:
1468
+ name: rte
1469
+ type: rte
1470
+ metrics:
1471
+ - name: acc
1472
+ type: acc
1473
+ value: 0.47653429602888087
1474
+ verified: false
1475
+ - task:
1476
+ type: text-generation
1477
+ name: text generation
1478
+ dataset:
1479
+ name: sciq
1480
+ type: sciq
1481
+ metrics:
1482
+ - name: acc
1483
+ type: acc
1484
+ value: 0.892
1485
+ verified: false
1486
+ - task:
1487
+ type: text-generation
1488
+ name: text generation
1489
+ dataset:
1490
+ name: sst
1491
+ type: sst
1492
+ metrics:
1493
+ - name: acc
1494
+ type: acc
1495
+ value: 0.5177752293577982
1496
+ verified: false
1497
+ - task:
1498
+ type: text-generation
1499
+ name: text generation
1500
+ dataset:
1501
+ name: triviaqa
1502
+ type: triviaqa
1503
+ metrics:
1504
+ - name: acc
1505
+ type: acc
1506
+ value: 0.041633518960487934
1507
+ verified: false
1508
+ - task:
1509
+ type: text-generation
1510
+ name: text generation
1511
+ dataset:
1512
+ name: tydiqa_primary
1513
+ type: tydiqa_primary
1514
+ metrics:
1515
+ - name: acc
1516
+ type: acc
1517
+ value: 0.3011337608795236
1518
+ verified: false
1519
+ - task:
1520
+ type: text-generation
1521
+ name: text generation
1522
+ dataset:
1523
+ name: webqs
1524
+ type: webqs
1525
+ metrics:
1526
+ - name: acc
1527
+ type: acc
1528
+ value: 0.01673228346456693
1529
+ verified: false
1530
+ - task:
1531
+ type: text-generation
1532
+ name: text generation
1533
+ dataset:
1534
+ name: wic
1535
+ type: wic
1536
+ metrics:
1537
+ - name: acc
1538
+ type: acc
1539
+ value: 0.5015673981191222
1540
+ verified: false
1541
+ - task:
1542
+ type: text-generation
1543
+ name: text generation
1544
+ dataset:
1545
+ name: winogrande
1546
+ type: winogrande
1547
+ metrics:
1548
+ - name: acc
1549
+ type: acc
1550
+ value: 0.5864246250986582
1551
+ verified: false
1552
+ - task:
1553
+ type: text-generation
1554
+ name: text generation
1555
+ dataset:
1556
+ name: wnli
1557
+ type: wnli
1558
+ metrics:
1559
+ - name: acc
1560
+ type: acc
1561
+ value: 0.471830985915493
1562
+ verified: false
1563
+ - task:
1564
+ type: text-generation
1565
+ name: text generation
1566
+ dataset:
1567
+ name: wsc
1568
+ type: wsc
1569
+ metrics:
1570
+ - name: acc
1571
+ type: acc
1572
+ value: 0.4423076923076923
1573
+ verified: false
1574
+ - task:
1575
+ type: text-generation
1576
+ name: text generation
1577
+ dataset:
1578
+ name: humaneval
1579
+ type: humaneval
1580
+ metrics:
1581
+ - name: pass@1
1582
+ type: pass@1
1583
+ value: 0.15524390243902436
1584
+ verified: false
1585
+ - name: pass@10
1586
+ type: pass@10
1587
+ value: 0.3220367632383857
1588
+ verified: false
1589
+ - name: pass@100
1590
+ type: pass@100
1591
+ value: 0.5545431515723145
1592
+ verified: false
1593
  ---
1594
 
1595
  <h1 style='text-align: center '>BLOOM LM</h1>
 
1993
  And multiple different metrics for specific tasks. _(More evaluation metrics forthcoming upon completion of evaluation protocol.)_
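The per-language FLORES-101 results in this card are reported as `byte_perplexity`, which normalizes by bytes rather than tokens and so does not depend on the tokenizer. A minimal sketch, assuming the usual definition (the exponential of the total negative log-likelihood in nats divided by the number of UTF-8 bytes). The source does not spell out its exact formula, and the function name and inputs here are hypothetical:

```python
import math

def byte_perplexity(token_logprobs, text):
    """Assumed definition: exp(-sum(log p) / number of UTF-8 bytes in text)."""
    n_bytes = len(text.encode("utf-8"))
    if n_bytes == 0:
        raise ValueError("text must not be empty")
    # Total negative log-likelihood of the text, summed over its tokens (in nats).
    total_nll = -sum(token_logprobs)
    return math.exp(total_nll / n_bytes)
```

Lower is better: a value of 2.0 would mean the model is, on average, as uncertain as a fair coin flip per byte.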
1994
 
1995
  ### Factors
1996
+ *This section lists aspects of BLOOM models that are likely to cause high variance in model behavior.*
1997
 
1998
  - Language, such as English or Yoruba
1999
 
 
2004
  ### Results
2005
  *Results are based on the [Factors](#factors) and [Metrics](#metrics).*
2006
 
2007
+ **Zero-shot evaluations:**
2008
+
2009
+ The results are available as JSON files in this repository: https://github.com/bigscience-workshop/evaluation-results
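The table below reports each score to three decimal places, while the YAML front matter of the README keeps full precision. A small sketch of how the two appear to relate, using three values copied from the front matter; the helper name is hypothetical and the rounding rule is inferred from the numbers, not stated in the source:

```python
def table_cell(value: float) -> str:
    """Format a metric as the results table appears to: 3 decimals, trailing zeros dropped."""
    return str(round(value, 3))

# Values copied from the YAML front matter.
print(table_cell(1.9797467071381232))   # gsarti/flores_101_urd, byte_perplexity
print(table_cell(0.24958123953098826))  # mathqa, acc
print(table_cell(0.5545431515723145))   # humaneval, pass@100
```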
2010
+
2011
+ | Task | Language | Metric | BLOOM-2B5 |
2012
+ |:----|:----|:----|:----:|
2013
+ | arc_challenge | eng | acc ↑ | 0.28 |
2014
+ | arc_easy | eng | acc ↑ | 0.595 |
2015
+ | axb (Median of 10 prompts) | eng | acc ↑ | 0.443 |
2016
+ | axg (Median of 10 prompts) | eng | acc ↑ | 0.5 |
2017
+ | boolq (Median of 11 prompts) | eng | acc ↑ | 0.617 |
2018
+ | cb (Median of 15 prompts) | eng | acc ↑ | 0.304 |
2019
+ | cola (Median of 5 prompts) | eng | acc ↑ | 0.611 |
2020
+ | copa (Median of 9 prompts) | eng | acc ↑ | 0.63 |
2021
+ | crows_pairs_english (Median of 6 prompts) | eng | acc ↑ | 0.497 |
2022
+ | crows_pairs_french (Median of 7 prompts) | fra | acc ↑ | 0.503 |
2023
+ | diabla (Median of 2 prompts) | eng | acc ↑ | 0.289 |
2024
+ | gsarti/flores_101_afr | afr | byte_perplexity ↓ | 6.501 |
2025
+ | gsarti/flores_101_amh | amh | byte_perplexity ↓ | 3.973 |
2026
+ | gsarti/flores_101_ara | ara | byte_perplexity ↓ | 1.808 |
2027
+ | gsarti/flores_101_asm | asm | byte_perplexity ↓ | 5.699 |
2028
+ | gsarti/flores_101_ast | ast | byte_perplexity ↓ | 3.925 |
2029
+ | gsarti/flores_101_azj | azj | byte_perplexity ↓ | 6.943 |
2030
+ | gsarti/flores_101_bel | bel | byte_perplexity ↓ | 3.614 |
2031
+ | gsarti/flores_101_ben | ben | byte_perplexity ↓ | 5.121 |
2032
+ | gsarti/flores_101_bos | bos | byte_perplexity ↓ | 5.653 |
2033
+ | gsarti/flores_101_bul | bul | byte_perplexity ↓ | 2.701 |
2034
+ | gsarti/flores_101_cat | cat | byte_perplexity ↓ | 2.305 |
2035
+ | gsarti/flores_101_ceb | ceb | byte_perplexity ↓ | 6.291 |
2036
+ | gsarti/flores_101_ces | ces | byte_perplexity ↓ | 5.447 |
2037
+ | gsarti/flores_101_ckb | ckb | byte_perplexity ↓ | 3.726 |
2038
+ | gsarti/flores_101_cym | cym | byte_perplexity ↓ | 12.539 |
2039
+ | gsarti/flores_101_dan | dan | byte_perplexity ↓ | 5.183 |
2040
+ | gsarti/flores_101_deu | deu | byte_perplexity ↓ | 3.118 |
2041
+ | gsarti/flores_101_ell | ell | byte_perplexity ↓ | 2.468 |
2042
+ | gsarti/flores_101_eng | eng | byte_perplexity ↓ | 2.019 |
2043
+ | gsarti/flores_101_est | est | byte_perplexity ↓ | 9.117 |
2044
+ | gsarti/flores_101_fas | fas | byte_perplexity ↓ | 3.058 |
2045
+ | gsarti/flores_101_fin | fin | byte_perplexity ↓ | 6.847 |
2046
+ | gsarti/flores_101_fra | fra | byte_perplexity ↓ | 1.998 |
2047
+ | gsarti/flores_101_ful | ful | byte_perplexity ↓ | 11.466 |
2048
+ | gsarti/flores_101_gle | gle | byte_perplexity ↓ | 8.681 |
2049
+ | gsarti/flores_101_glg | glg | byte_perplexity ↓ | 3.03 |
2050
+ | gsarti/flores_101_guj | guj | byte_perplexity ↓ | 4.955 |
2051
+ | gsarti/flores_101_hau | hau | byte_perplexity ↓ | 10.758 |
2052
+ | gsarti/flores_101_heb | heb | byte_perplexity ↓ | 3.6 |
2053
+ | gsarti/flores_101_hin | hin | byte_perplexity ↓ | 4.713 |
2054
+ | gsarti/flores_101_hrv | hrv | byte_perplexity ↓ | 5.822 |
2055
+ | gsarti/flores_101_hun | hun | byte_perplexity ↓ | 6.44 |
2056
+ | gsarti/flores_101_hye | hye | byte_perplexity ↓ | 3.658 |
2057
+ | gsarti/flores_101_ibo | ibo | byte_perplexity ↓ | 5.565 |
2058
+ | gsarti/flores_101_ind | ind | byte_perplexity ↓ | 2.16 |
2059
+ | gsarti/flores_101_isl | isl | byte_perplexity ↓ | 8.082 |
2060
+ | gsarti/flores_101_ita | ita | byte_perplexity ↓ | 2.969 |
2061
+ | gsarti/flores_101_jav | jav | byte_perplexity ↓ | 7.057 |
2062
+ | gsarti/flores_101_jpn | jpn | byte_perplexity ↓ | 2.776 |
2063
+ | gsarti/flores_101_kam | kam | byte_perplexity ↓ | 11.073 |
2064
+ | gsarti/flores_101_kan | kan | byte_perplexity ↓ | 5.552 |
2065
+ | gsarti/flores_101_kat | kat | byte_perplexity ↓ | 2.523 |
2066
+ | gsarti/flores_101_kaz | kaz | byte_perplexity ↓ | 3.39 |
2067
+ | gsarti/flores_101_kea | kea | byte_perplexity ↓ | 8.919 |
2068
+ | gsarti/flores_101_kir | kir | byte_perplexity ↓ | 3.729 |
2069
+ | gsarti/flores_101_kor | kor | byte_perplexity ↓ | 3.933 |
2070
+ | gsarti/flores_101_lao | lao | byte_perplexity ↓ | 2.908 |
2071
+ | gsarti/flores_101_lav | lav | byte_perplexity ↓ | 7.777 |
2072
+ | gsarti/flores_101_lin | lin | byte_perplexity ↓ | 7.525 |
2073
+ | gsarti/flores_101_lit | lit | byte_perplexity ↓ | 7.369 |
2074
+ | gsarti/flores_101_ltz | ltz | byte_perplexity ↓ | 8.801 |
2075
+ | gsarti/flores_101_lug | lug | byte_perplexity ↓ | 8.483 |
2076
+ | gsarti/flores_101_luo | luo | byte_perplexity ↓ | 11.976 |
2077
+ | gsarti/flores_101_mal | mal | byte_perplexity ↓ | 4.616 |
2078
+ | gsarti/flores_101_mar | mar | byte_perplexity ↓ | 5.483 |
2079
+ | gsarti/flores_101_mkd | mkd | byte_perplexity ↓ | 2.966 |
2080
+ | gsarti/flores_101_mlt | mlt | byte_perplexity ↓ | 15.005 |
2081
+ | gsarti/flores_101_mon | mon | byte_perplexity ↓ | 3.411 |
2082
+ | gsarti/flores_101_mri | mri | byte_perplexity ↓ | 7.474 |
2083
+ | gsarti/flores_101_msa | msa | byte_perplexity ↓ | 2.571 |
2084
+ | gsarti/flores_101_mya | mya | byte_perplexity ↓ | 2.414 |
2085
+ | gsarti/flores_101_nld | nld | byte_perplexity ↓ | 4.128 |
2086
+ | gsarti/flores_101_nob | nob | byte_perplexity ↓ | 5.403 |
2087
+ | gsarti/flores_101_npi | npi | byte_perplexity ↓ | 5.199 |
2088
+ | gsarti/flores_101_nso | nso | byte_perplexity ↓ | 8.155 |
2089
+ | gsarti/flores_101_nya | nya | byte_perplexity ↓ | 8.18 |
2090
+ | gsarti/flores_101_oci | oci | byte_perplexity ↓ | 4.862 |
2091
+ | gsarti/flores_101_orm | orm | byte_perplexity ↓ | 12.912 |
2092
+ | gsarti/flores_101_ory | ory | byte_perplexity ↓ | 5.189 |
2093
+ | gsarti/flores_101_pan | pan | byte_perplexity ↓ | 4.698 |
2094
+ | gsarti/flores_101_pol | pol | byte_perplexity ↓ | 4.626 |
2095
+ | gsarti/flores_101_por | por | byte_perplexity ↓ | 1.975 |
2096
+ | gsarti/flores_101_pus | pus | byte_perplexity ↓ | 4.496 |
2097
+ | gsarti/flores_101_ron | ron | byte_perplexity ↓ | 4.965 |
2098
+ | gsarti/flores_101_rus | rus | byte_perplexity ↓ | 2.05 |
2099
+ | gsarti/flores_101_slk | slk | byte_perplexity ↓ | 6.451 |
2100
+ | gsarti/flores_101_slv | slv | byte_perplexity ↓ | 6.62 |
2101
+ | gsarti/flores_101_sna | sna | byte_perplexity ↓ | 8.462 |
2102
+ | gsarti/flores_101_snd | snd | byte_perplexity ↓ | 5.466 |
2103
+ | gsarti/flores_101_som | som | byte_perplexity ↓ | 11.959 |
2104
+ | gsarti/flores_101_spa | spa | byte_perplexity ↓ | 1.897 |
2105
+ | gsarti/flores_101_srp | srp | byte_perplexity ↓ | 2.871 |
2106
+ | gsarti/flores_101_swe | swe | byte_perplexity ↓ | 5.055 |
2107
+ | gsarti/flores_101_swh | swh | byte_perplexity ↓ | 3.697 |
2108
+ | gsarti/flores_101_tam | tam | byte_perplexity ↓ | 4.539 |
2109
+ | gsarti/flores_101_tel | tel | byte_perplexity ↓ | 5.807 |
2110
+ | gsarti/flores_101_tgk | tgk | byte_perplexity ↓ | 3.599 |
2111
+ | gsarti/flores_101_tgl | tgl | byte_perplexity ↓ | 5.667 |
2112
+ | gsarti/flores_101_tha | tha | byte_perplexity ↓ | 2.366 |
2113
+ | gsarti/flores_101_tur | tur | byte_perplexity ↓ | 4.885 |
2114
+ | gsarti/flores_101_ukr | ukr | byte_perplexity ↓ | 2.724 |
2115
+ | gsarti/flores_101_umb | umb | byte_perplexity ↓ | 12.767 |
2116
+ | gsarti/flores_101_urd | urd | byte_perplexity ↓ | 1.98 |
2117
+ | gsarti/flores_101_uzb | uzb | byte_perplexity ↓ | 12.002 |
2118
+ | gsarti/flores_101_vie | vie | byte_perplexity ↓ | 1.766 |
2119
+ | gsarti/flores_101_wol | wol | byte_perplexity ↓ | 9.144 |
2120
+ | gsarti/flores_101_xho | xho | byte_perplexity ↓ | 7.403 |
2121
+ | gsarti/flores_101_yor | yor | byte_perplexity ↓ | 5.913 |
2122
+ | gsarti/flores_101_zho_simpl | zho_simpl | byte_perplexity ↓ | 2.277 |
2123
+ | gsarti/flores_101_zho_trad | zho_trad | byte_perplexity ↓ | 2.518 |
2124
+ | gsarti/flores_101_zul | zul | byte_perplexity ↓ | 8.534 |
2125
+ | headqa | spa | acc ↑ | 0.264 |
2126
+ | hellaswag | eng | acc ↑ | 0.412 |
2127
+ | logiqa | eng | acc ↑ | 0.207 |
2128
+ | mathqa | eng | acc ↑ | 0.25 |
2129
+ | mc_taco | eng | em ↑ | 0.119 |
2130
+ | mnli (Median of 15 prompts) | eng | acc ↑ | 0.355 |
2131
+ | mnli_mismatched (Median of 15 prompts) | eng | acc ↑ | 0.352 |
2132
+ | mrpc | eng | acc ↑ | 0.586 |
2133
+ | multirc (Median of 11 prompts) | eng | acc ↑ | 0.538 |
2134
+ | openbookqa | eng | acc ↑ | 0.216 |
2135
+ | piqa | eng | acc ↑ | 0.708 |
2136
+ | prost | eng | acc ↑ | 0.227 |
2137
+ | pubmedqa | eng | acc ↑ | 0.616 |
2138
+ | qnli | eng | acc ↑ | 0.507 |
2139
+ | qqp (Median of 7 prompts) | eng | acc ↑ | 0.384 |
2140
+ | race | eng | acc ↑ | 0.352 |
2141
+ | rte (Median of 6 prompts) | eng | acc ↑ | 0.477 |
2142
+ | sciq | eng | acc ↑ | 0.892 |
2143
+ | sst (Median of 6 prompts) | eng | acc ↑ | 0.518 |
2144
+ | triviaqa | eng | acc ↑ | 0.042 |
2145
+ | tydiqa_primary (Median of 24 prompts) | eng | acc ↑ | 0.301 |
2146
+ | webqs | eng | acc ↑ | 0.017 |
2147
+ | wic (Median of 11 prompts) | eng | acc ↑ | 0.502 |
2148
+ | winogrande | eng | acc ↑ | 0.586 |
2149
+ | wnli (Median of 6 prompts) | eng | acc ↑ | 0.472 |
2150
+ | wsc (Median of 11 prompts) | eng | acc ↑ | 0.442 |
2151
+ | humaneval | python | pass@1 ↑ | 0.155 |
2152
+ | humaneval | python | pass@10 ↑ | 0.322 |
2153
+ | humaneval | python | pass@100 ↑ | 0.555 |
2154
+
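The `pass@k` rows for humaneval are commonly computed with the unbiased estimator introduced alongside the HumanEval benchmark: generate n samples per problem, count the c samples that pass the unit tests, and estimate the probability that at least one of k drawn samples passes. The source does not state the sampling setup, so the sketch below is an assumption about the method, and the function names are illustrative:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for one problem: 1 - C(n-c, k) / C(n, k)."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples fail, so the estimate is 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(results, k):
    """Average the per-problem estimate; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

Averaging the per-problem estimates over the benchmark gives a single score per k, which is why the reported values need not be multiples of one over the number of problems.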
2155
  **Train-time Evaluation:**
2156
 
2157
  As of 25.May.2022, 15:00 PST:
 
2162
 
2163
  - Perplexity: 8.9
2164
 
 
 
2165
  </details>
2166
  <p>&nbsp;</p>
2167
 
 
2247
  ## Model Card Authors
2248
  *Ordered roughly chronologically and by amount of time spent.*
2249
 
2250
+ Margaret Mitchell, Giada Pistilli, Yacine Jernite, Ezinwanne Ozoani, Marissa Gerchick, Nazneen Rajani, Sasha Luccioni, Irene Solaiman, Maraim Masoud, Somaieh Nikpoor, Carlos Muñoz Ferrandis, Stas Bekman, Christopher Akiki, Danish Contractor, David Lansky, Angelina McMillan-Major, Tristan Thrush, Suzana Ilić, Gérard Dupont, Shayne Longpre, Manan Dey, Stella Biderman, Douwe Kiela, Emi Baylor, Teven Le Scao, Aaron Gokaslan, Julien Launay, Niklas Muennighoff
2251