Commit a20bf92 by Antreas · verified · 1 parent: 8f7c11e

Initial upload: ogma-mini embedding model

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +927 -0
  2. config.json +37 -0
  3. config.py +161 -0
  4. config.yaml +19 -0
  5. embeddings.py +143 -0
  6. model.pt +3 -0
  7. model.safetensors +3 -0
  8. ogma_model.py +203 -0
  9. pooling.py +99 -0
  10. results/AmazonCounterfactualClassification.json +526 -0
  11. results/AmazonPolarityClassification.json +140 -0
  12. results/AmazonReviewsClassification.json +270 -0
  13. results/ArXivHierarchicalClusteringP2P.json +47 -0
  14. results/ArXivHierarchicalClusteringS2S.json +47 -0
  15. results/ArguAna.json +167 -0
  16. results/AskUbuntuDupQuestions.json +167 -0
  17. results/BIOSSES.json +27 -0
  18. results/Banking77Classification.json +140 -0
  19. results/BiorxivClusteringP2P.json +33 -0
  20. results/BiorxivClusteringS2S.json +33 -0
  21. results/CQADupstackAndroidRetrieval.json +167 -0
  22. results/CQADupstackEnglishRetrieval.json +167 -0
  23. results/CQADupstackGamingRetrieval.json +167 -0
  24. results/CQADupstackGisRetrieval.json +167 -0
  25. results/CQADupstackMathematicaRetrieval.json +167 -0
  26. results/CQADupstackPhysicsRetrieval.json +167 -0
  27. results/CQADupstackProgrammersRetrieval.json +167 -0
  28. results/CQADupstackRetrieval.json +20 -0
  29. results/CQADupstackStatsRetrieval.json +167 -0
  30. results/CQADupstackTexRetrieval.json +167 -0
  31. results/CQADupstackUnixRetrieval.json +167 -0
  32. results/CQADupstackWebmastersRetrieval.json +167 -0
  33. results/CQADupstackWordpressRetrieval.json +167 -0
  34. results/ClimateFEVER.json +167 -0
  35. results/DBPedia.json +167 -0
  36. results/EmotionClassification.json +140 -0
  37. results/FEVER.json +167 -0
  38. results/FiQA2018.json +167 -0
  39. results/HotpotQA.json +167 -0
  40. results/ImdbClassification.json +140 -0
  41. results/MSMARCO.json +167 -0
  42. results/MTOPDomainClassification.json +270 -0
  43. results/MTOPIntentClassification.json +270 -0
  44. results/MassiveIntentClassification.json +270 -0
  45. results/MassiveScenarioClassification.json +270 -0
  46. results/MedrxivClusteringP2P.json +33 -0
  47. results/MedrxivClusteringS2S.json +33 -0
  48. results/MindSmallReranking.json +252 -0
  49. results/NFCorpus.json +167 -0
  50. results/NQ.json +167 -0
README.md ADDED
@@ -0,0 +1,927 @@
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - mteb
+ - sentence-transformers
+ - embedding
+ - text-embedding
+ - ogma
+ - axiotic
+ - matryoshka
+ - small-model
+ model-index:
+ - name: ogma-mini
+   results:
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/AmazonCounterfactualClassification
+       name: MTEB AmazonCounterfactualClassification
+       config: default
+       split: test
+       revision: 1f7e6a9d6fa6e64c53d146e428565640410c0df1
+     metrics:
+     - type: accuracy
+       value: 66.6
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/AmazonPolarityClassification
+       name: MTEB AmazonPolarityClassification
+       config: default
+       split: test
+       revision: e2d317d38cd51312af73b3d32a06d1a08b442046
+     metrics:
+     - type: accuracy
+       value: 70.44
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/AmazonReviewsClassification
+       name: MTEB AmazonReviewsClassification
+       config: default
+       split: test
+       revision: 6b5d328eaae8ef408dd7d775040245cf86f92e9d
+     metrics:
+     - type: accuracy
+       value: 37.33
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/ArXivHierarchicalClusteringP2P
+       name: MTEB ArXivHierarchicalClusteringP2P
+       config: default
+       split: test
+       revision: 0bbdb47bcbe3a90093699aefeed338a0f28a7ee8
+     metrics:
+     - type: v_measure
+       value: 54.34
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/ArXivHierarchicalClusteringS2S
+       name: MTEB ArXivHierarchicalClusteringS2S
+       config: default
+       split: test
+       revision: b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3
+     metrics:
+     - type: v_measure
+       value: 49.88
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/ArguAna
+       name: MTEB ArguAna
+       config: default
+       split: test
+       revision: c22ab2a51041ffd869aaddef7af8d8215647e41a
+     metrics:
+     - type: ndcg_at_10
+       value: 40.72
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/AskUbuntuDupQuestions
+       name: MTEB AskUbuntuDupQuestions
+       config: default
+       split: test
+       revision: c5691e3c48741d5f83b5cc8e630653d7a8cfc048
+     metrics:
+     - type: map
+       value: 52.13
+   - task:
+       type: STS
+     dataset:
+       type: mteb/BIOSSES
+       name: MTEB BIOSSES
+       config: default
+       split: test
+       revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
+     metrics:
+     - type: cosine_spearman
+       value: 80.0
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/Banking77Classification
+       name: MTEB Banking77Classification
+       config: default
+       split: test
+       revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
+     metrics:
+     - type: accuracy
+       value: 72.93
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/BiorxivClusteringP2P
+       name: MTEB BiorxivClusteringP2P
+       config: default
+       split: test
+       revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
+     metrics:
+     - type: v_measure
+       value: 30.11
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/BiorxivClusteringS2S
+       name: MTEB BiorxivClusteringS2S
+       config: default
+       split: test
+       revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
+     metrics:
+     - type: v_measure
+       value: 20.36
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackAndroidRetrieval
+       name: MTEB CQADupstackAndroidRetrieval
+       config: default
+       split: test
+       revision: 9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3
+     metrics:
+     - type: ndcg_at_10
+       value: 31.75
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackEnglishRetrieval
+       name: MTEB CQADupstackEnglishRetrieval
+       config: default
+       split: test
+       revision: ad9991cb51e31e31e430383c75ffb2885547b5f0
+     metrics:
+     - type: ndcg_at_10
+       value: 21.11
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackGamingRetrieval
+       name: MTEB CQADupstackGamingRetrieval
+       config: default
+       split: test
+       revision: 4885aa143210c98657558c04aaf3dc47cfb54340
+     metrics:
+     - type: ndcg_at_10
+       value: 40.44
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackGisRetrieval
+       name: MTEB CQADupstackGisRetrieval
+       config: default
+       split: test
+       revision: 5003b3064772da1887988e05400cf3806fe491f2
+     metrics:
+     - type: ndcg_at_10
+       value: 23.96
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackMathematicaRetrieval
+       name: MTEB CQADupstackMathematicaRetrieval
+       config: default
+       split: test
+       revision: 90fceea13679c63fe563ded68f3b6f06e50061de
+     metrics:
+     - type: ndcg_at_10
+       value: 16.98
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackPhysicsRetrieval
+       name: MTEB CQADupstackPhysicsRetrieval
+       config: default
+       split: test
+       revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4
+     metrics:
+     - type: ndcg_at_10
+       value: 29.32
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackProgrammersRetrieval
+       name: MTEB CQADupstackProgrammersRetrieval
+       config: default
+       split: test
+       revision: 6184bc1440d2dbc7612be22b50686b8826d22b32
+     metrics:
+     - type: ndcg_at_10
+       value: 27.28
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackRetrieval
+       name: MTEB CQADupstackRetrieval
+       config: default
+       split: test
+       revision: '1'
+     metrics:
+     - type: ndcg_at_10
+       value: 24.82
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackStatsRetrieval
+       name: MTEB CQADupstackStatsRetrieval
+       config: default
+       split: test
+       revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a
+     metrics:
+     - type: ndcg_at_10
+       value: 22.59
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackTexRetrieval
+       name: MTEB CQADupstackTexRetrieval
+       config: default
+       split: test
+       revision: 46989137a86843e03a6195de44b09deda022eec7
+     metrics:
+     - type: ndcg_at_10
+       value: 17.05
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackUnixRetrieval
+       name: MTEB CQADupstackUnixRetrieval
+       config: default
+       split: test
+       revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53
+     metrics:
+     - type: ndcg_at_10
+       value: 23.14
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackWebmastersRetrieval
+       name: MTEB CQADupstackWebmastersRetrieval
+       config: default
+       split: test
+       revision: 160c094312a0e1facb97e55eeddb698c0abe3571
+     metrics:
+     - type: ndcg_at_10
+       value: 24.97
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/CQADupstackWordpressRetrieval
+       name: MTEB CQADupstackWordpressRetrieval
+       config: default
+       split: test
+       revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4
+     metrics:
+     - type: ndcg_at_10
+       value: 19.3
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/ClimateFEVER
+       name: MTEB ClimateFEVER
+       config: default
+       split: test
+       revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380
+     metrics:
+     - type: ndcg_at_10
+       value: 24.61
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/DBPedia
+       name: MTEB DBPedia
+       config: default
+       split: test
+       revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659
+     metrics:
+     - type: ndcg_at_10
+       value: 29.58
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/EmotionClassification
+       name: MTEB EmotionClassification
+       config: default
+       split: test
+       revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
+     metrics:
+     - type: accuracy
+       value: 39.07
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/FEVER
+       name: MTEB FEVER
+       config: default
+       split: test
+       revision: bea83ef9e8fb933d90a2f1d5515737465d613e12
+     metrics:
+     - type: ndcg_at_10
+       value: 69.83
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/FiQA2018
+       name: MTEB FiQA2018
+       config: default
+       split: test
+       revision: 27a168819829fe9bcd655c2df245fb19452e8e06
+     metrics:
+     - type: ndcg_at_10
+       value: 20.72
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/HotpotQA
+       name: MTEB HotpotQA
+       config: default
+       split: test
+       revision: ab518f4d6fcca38d87c25209f94beba119d02014
+     metrics:
+     - type: ndcg_at_10
+       value: 43.57
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/ImdbClassification
+       name: MTEB ImdbClassification
+       config: default
+       split: test
+       revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
+     metrics:
+     - type: accuracy
+       value: 67.25
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/MSMARCO
+       name: MTEB MSMARCO
+       config: default
+       split: test
+       revision: c5a29a104738b98a9e76336939199e264163d4a0
+     metrics:
+     - type: ndcg_at_10
+       value: 0
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MTOPDomainClassification
+       name: MTEB MTOPDomainClassification
+       config: default
+       split: test
+       revision: a76d16fae880597b9c73047b50159220a441cb54
+     metrics:
+     - type: accuracy
+       value: 85.02
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MTOPIntentClassification
+       name: MTEB MTOPIntentClassification
+       config: default
+       split: test
+       revision: 2992d820f31312593c49a4890430aadadb0f0039
+     metrics:
+     - type: accuracy
+       value: 54.29
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MassiveIntentClassification
+       name: MTEB MassiveIntentClassification
+       config: default
+       split: test
+       revision: 4672e20407010da34463acc759c162ca9734bca6
+     metrics:
+     - type: accuracy
+       value: 60.83
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/MassiveScenarioClassification
+       name: MTEB MassiveScenarioClassification
+       config: default
+       split: test
+       revision: fad2c6e8459f9e1c45d9315f4953d921437d70f8
+     metrics:
+     - type: accuracy
+       value: 69.43
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/MedrxivClusteringP2P
+       name: MTEB MedrxivClusteringP2P
+       config: default
+       split: test
+       revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
+     metrics:
+     - type: v_measure
+       value: 30.88
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/MedrxivClusteringS2S
+       name: MTEB MedrxivClusteringS2S
+       config: default
+       split: test
+       revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
+     metrics:
+     - type: v_measure
+       value: 25.35
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/MindSmallReranking
+       name: MTEB MindSmallReranking
+       config: default
+       split: test
+       revision: 227478e3235572039f4f7661840e059f31ef6eb1
+     metrics:
+     - type: map
+       value: 29.68
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/NFCorpus
+       name: MTEB NFCorpus
+       config: default
+       split: test
+       revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
+     metrics:
+     - type: ndcg_at_10
+       value: 25.54
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/NQ
+       name: MTEB NQ
+       config: default
+       split: test
+       revision: b774495ed302d8c44a3a7ea25c90dbce03968f31
+     metrics:
+     - type: ndcg_at_10
+       value: 33.93
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/QuoraRetrieval
+       name: MTEB QuoraRetrieval
+       config: default
+       split: test
+       revision: e4e08e0b7dbe3c8700f0daef558ff32256715259
+     metrics:
+     - type: ndcg_at_10
+       value: 51.77
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/RedditClustering
+       name: MTEB RedditClustering
+       config: default
+       split: test
+       revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
+     metrics:
+     - type: v_measure
+       value: 39.5
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/RedditClusteringP2P
+       name: MTEB RedditClusteringP2P
+       config: default
+       split: test
+       revision: 385e3cb46b4cfa89021f56c4380204149d0efe33
+     metrics:
+     - type: v_measure
+       value: 49.08
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/SCIDOCS
+       name: MTEB SCIDOCS
+       config: default
+       split: test
+       revision: f8c2fcf00f625baaa80f62ec5bd9e1fff3b8ae88
+     metrics:
+     - type: ndcg_at_10
+       value: 13.8
+   - task:
+       type: STS
+     dataset:
+       type: mteb/SICK-R
+       name: MTEB SICK-R
+       config: default
+       split: test
+       revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
+     metrics:
+     - type: cosine_spearman
+       value: 71.83
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS12
+       name: MTEB STS12
+       config: default
+       split: test
+       revision: a0d554a64d88156834ff5ae9920b964011b16384
+     metrics:
+     - type: cosine_spearman
+       value: 71.93
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS13
+       name: MTEB STS13
+       config: default
+       split: test
+       revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
+     metrics:
+     - type: cosine_spearman
+       value: 79.27
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS14
+       name: MTEB STS14
+       config: default
+       split: test
+       revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
+     metrics:
+     - type: cosine_spearman
+       value: 75.96
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS15
+       name: MTEB STS15
+       config: default
+       split: test
+       revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
+     metrics:
+     - type: cosine_spearman
+       value: 83.06
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STS16
+       name: MTEB STS16
+       config: default
+       split: test
+       revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
+     metrics:
+     - type: cosine_spearman
+       value: 79.04
+   - task:
+       type: STS
+     dataset:
+       type: mteb/STSBenchmark
+       name: MTEB STSBenchmark
+       config: default
+       split: test
+       revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
+     metrics:
+     - type: cosine_spearman
+       value: 80.57
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/SciDocsRR
+       name: MTEB SciDocsRR
+       config: default
+       split: test
+       revision: 39b8377811871075eed9de3b8a7e21aaa6acb3d8
+     metrics:
+     - type: map
+       value: 69.65
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/SciFact
+       name: MTEB SciFact
+       config: default
+       split: test
+       revision: d56462d0e63a25450459c4f213e49ffdb866f7f9
+     metrics:
+     - type: ndcg_at_10
+       value: 53.01
+   - task:
+       type: PairClassification
+     dataset:
+       type: mteb/SprintDuplicateQuestions
+       name: MTEB SprintDuplicateQuestions
+       config: default
+       split: test
+       revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
+     metrics:
+     - type: cosine_ap
+       value: 94.96
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/StackExchangeClustering
+       name: MTEB StackExchangeClustering
+       config: default
+       split: test
+       revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
+     metrics:
+     - type: v_measure
+       value: 44.64
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/StackExchangeClusteringP2P
+       name: MTEB StackExchangeClusteringP2P
+       config: default
+       split: test
+       revision: 815ca46b2622cec33ccafc3735d572c266efdb44
+     metrics:
+     - type: v_measure
+       value: 33.41
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/StackOverflowDupQuestions
+       name: MTEB StackOverflowDupQuestions
+       config: default
+       split: test
+       revision: 5debda000fe8e27ebb5c123d38081f92e1847a59
+     metrics:
+     - type: map
+       value: 38.13
+   - task:
+       type: Summarization
+     dataset:
+       type: mteb/SummEval
+       name: MTEB SummEval
+       config: default
+       split: test
+       revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
+     metrics:
+     - type: cosine_spearman
+       value: 31.33
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/TRECCOVID
+       name: MTEB TRECCOVID
+       config: default
+       split: test
+       revision: bb9466bac8153a0349341eb1b22e06409e78ef4e
+     metrics:
+     - type: ndcg_at_10
+       value: 61.42
+   - task:
+       type: Retrieval
+     dataset:
+       type: mteb/Touche2020
+       name: MTEB Touche2020
+       config: default
+       split: test
+       revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f
+     metrics:
+     - type: ndcg_at_10
+       value: 24.09
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/ToxicConversationsClassification
+       name: MTEB ToxicConversationsClassification
+       config: default
+       split: test
+       revision: edfaf9da55d3dd50d43143d90c1ac476895ae6de
+     metrics:
+     - type: accuracy
+       value: 61.23
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/TweetSentimentExtractionClassification
+       name: MTEB TweetSentimentExtractionClassification
+       config: default
+       split: test
+       revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
+     metrics:
+     - type: accuracy
+       value: 57.13
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/TwentyNewsgroupsClustering
+       name: MTEB TwentyNewsgroupsClustering
+       config: default
+       split: test
+       revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
+     metrics:
+     - type: v_measure
+       value: 33.64
+   - task:
+       type: PairClassification
+     dataset:
+       type: mteb/TwitterSemEval2015
+       name: MTEB TwitterSemEval2015
+       config: default
+       split: test
+       revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
+     metrics:
+     - type: cosine_ap
+       value: 60.68
+   - task:
+       type: PairClassification
+     dataset:
+       type: mteb/TwitterURLCorpus
+       name: MTEB TwitterURLCorpus
+       config: default
+       split: test
+       revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
+     metrics:
+     - type: cosine_ap
+       value: 83.33
+ ---
+
+ # ogma-mini
+
+ **3.5M parameter text embedding model** by [Axiotic AI](https://axiotic.ai), achieving **51.42 average** on MTEB English v1 (54/54 tasks).
+
+ An ultra-lightweight 2-layer transformer with a 256-dim hidden state and 64-dim token embeddings.
+
+ ## Highlights
+
+ - **3.5M parameters**: small enough for CPU inference, edge deployment, and resource-constrained environments
+ - **51.42 MTEB average**: outperforms Potion-32M (51.22) despite being significantly smaller
+ - **Matryoshka embeddings**: use dimensions [32, 64, 128, 256] for flexible storage/compute tradeoffs
+ - **Asymmetric encoding**: dedicated `[QRY]`, `[DOC]`, `[SYM]` task tokens for query-document and symmetric tasks
+ - **1024 token context**: handles longer passages than typical small models (Potion: 512)
+ - **Pure PyTorch**: no external transformer library dependencies
+
+ ## Architecture
+
+ | Component | Details |
+ |-----------|---------|
+ | Parameters | 3.5M |
+ | Layers | 2 |
+ | Hidden dim (d_model) | 256 |
+ | Embedding dim (d_embed) | 64 |
+ | Output dim (d_output) | 256 |
+ | Attention heads | 4 |
+ | Max sequence length | 1024 |
+ | Matryoshka dims | [32, 64, 128, 256] |
+ | Pooling | Mean (mask-aware) |
+ | Position encoding | RoPE |
+ | FFN | SwiGLU |
+ | Normalization | Pre-LayerNorm |
+ | Tokenizer | SentencePiece Unigram (30K vocab) |
+ | Training | Knowledge distillation from teacher model |
+
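The 3.5M figure in the table can be sanity-checked with a back-of-the-envelope count. The sketch below is an estimate only: it assumes bias-free linear layers, separate Q/K/V/O projections, a single linear projection from `d_embed` to `d_model`, and it ignores normalization and any output head. None of these details are confirmed by the README, and `estimate_parameters` is a hypothetical helper name.

```python
# Rough parameter estimate from the architecture table.
# Assumptions (not confirmed by the source): no bias terms, separate
# Q/K/V/O matrices, one d_embed -> d_model projection, norms ignored.
VOCAB_SIZE = 30_000
D_EMBED = 64
D_MODEL = 256
N_LAYERS = 2
FFN_MULT = 8 / 3  # SwiGLU hidden-size multiplier


def estimate_parameters() -> dict:
    ffn_hidden = round(FFN_MULT * D_MODEL)   # 683 for d_model = 256
    embedding = VOCAB_SIZE * D_EMBED         # token embedding table
    projection = D_EMBED * D_MODEL           # d_embed -> d_model
    attention = 4 * D_MODEL * D_MODEL        # Q, K, V, O matrices
    swiglu = 3 * D_MODEL * ffn_hidden        # gate, up, down matrices
    layers = N_LAYERS * (attention + swiglu)
    return {
        "embedding": embedding,
        "projection": projection,
        "layers": layers,
        "total": embedding + projection + layers,
    }


if __name__ == "__main__":
    for name, count in estimate_parameters().items():
        print(f"{name:<12}{count / 1e6:.2f}M")
```

Under these assumptions the embedding table accounts for roughly 1.9M parameters and the two transformer layers for roughly 1.6M, which lands close to the stated 3.5M.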
+ ## MTEB Results
+
+ ### Category-Level Scores
+
+ | Category | ogma-mini | Potion-32M | Potion-8M | vs Potion-32M |
+ |----------|-----------|------------|-----------|---------------|
+ | Classification | **61.74** | 66.01 | 64.46 | -4.27 |
+ | Clustering | **37.38** | 39.24 | 36.88 | -1.86 |
+ | PairClassification | **79.66** | 78.17 | 76.62 | +1.49 |
+ | Reranking | **47.40** | 50.92 | 49.73 | -3.52 |
+ | Retrieval | **36.21** | 32.21 | 30.43 | +4.00 |
+ | STS | **77.71** | 73.86 | 72.93 | +3.85 |
+ | Summarization | **31.33** | 29.77 | 29.26 | +1.56 |
+ | **Overall** | **51.42** | 51.22 | 49.58 | **+0.20** |
+
+ > **Potion scores are locally reproduced** using the same evaluation pipeline and hardware for fair head-to-head comparison. These are not self-reported numbers from the Potion model card.
+
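The "vs Potion-32M" column is simply the ogma-mini score minus the Potion-32M score. A minimal sketch that recomputes it from the numbers in the table (the helper names `deltas` and `categories_won` are illustrative, not part of the repository):

```python
# Category scores copied from the table: (ogma-mini, Potion-32M).
SCORES = {
    "Classification": (61.74, 66.01),
    "Clustering": (37.38, 39.24),
    "PairClassification": (79.66, 78.17),
    "Reranking": (47.40, 50.92),
    "Retrieval": (36.21, 32.21),
    "STS": (77.71, 73.86),
    "Summarization": (31.33, 29.77),
    "Overall": (51.42, 51.22),
}


def deltas(scores):
    """Return ogma-mini minus Potion-32M per category, rounded to 2 places."""
    return {name: round(ours - theirs, 2) for name, (ours, theirs) in scores.items()}


def categories_won(scores):
    """List the categories (excluding Overall) where ogma-mini scores higher."""
    return [n for n, d in deltas(scores).items() if n != "Overall" and d > 0]


if __name__ == "__main__":
    for name, delta in deltas(SCORES).items():
        print(f"{name:<20}{delta:+.2f}")
    print("ahead in:", ", ".join(categories_won(SCORES)))
```

Read this way, ogma-mini is ahead in four of the seven categories (PairClassification, Retrieval, STS, Summarization) and behind in three (Classification, Clustering, Reranking).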
+ ## Usage
+
+ ### Quick Start
+
+ ```python
+ import torch
+
+ from config import TaskToken
+ from ogma_model import OgmaModel
+ from tokenizer import OgmaTokenizer
+
+ # Load from checkpoint directory
+ model = OgmaModel.from_checkpoint("path/to/ogma-mini", device="cpu")
+ model.eval()
+
+ # Load tokenizer (uses the SentencePiece model embedded in tokenizer.json)
+ # The tokenizer needs the .model file; extract it from tokenizer.json or use:
+ tokenizer = OgmaTokenizer("path/to/tokenizer.model")
+
+ # Encode text
+ texts = ["This is a query", "This is a document"]
+ encoded = tokenizer.batch_encode(texts, max_length=1024)
+
+ token_ids = torch.tensor(encoded["input_ids"])
+ attention_mask = torch.tensor(encoded["attention_mask"])
+
+ # Use task tokens for asymmetric encoding
+ with torch.no_grad():
+     # For symmetric tasks (STS, clustering, classification)
+     embeddings = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+     # For retrieval: encode queries and documents separately
+     query_embs = model.encode(token_ids[:1], attention_mask[:1], task=TaskToken.QRY)
+     doc_embs = model.encode(token_ids[1:], attention_mask[1:], task=TaskToken.DOC)
+
+ print(f"Embedding shape: {embeddings.shape}")  # (2, 256)
+ ```
+
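Once queries and documents are encoded, retrieval typically comes down to ranking documents by similarity to the query. The README does not state whether `model.encode` returns unit-length vectors, so the sketch below uses cosine similarity, which is safe either way. It is a dependency-free illustration on toy 4-dimensional vectors; `cosine_similarity` and `rank_documents` are hypothetical helpers, not part of the Ogma code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 4-dimensional vectors stand in for real 256-dimensional embeddings.
query = [1.0, 0.0, 1.0, 0.0]
documents = [
    [0.0, 1.0, 0.0, 1.0],  # orthogonal to the query
    [1.0, 0.0, 0.9, 0.1],  # nearly parallel to the query
    [0.5, 0.5, 0.5, 0.5],  # partially aligned
]

if __name__ == "__main__":
    for index, score in rank_documents(query, documents):
        print(f"doc {index}: {score:.3f}")
```

With real embeddings the same ranking would usually be computed as one matrix product over tensors rather than a Python loop.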
+ ### Matryoshka Dimensionality Reduction
+
+ ```python
+ # Full embeddings: 256d
+ full_embs = model.encode(token_ids, attention_mask, task=TaskToken.SYM)
+
+ # Reduce to any Matryoshka dimension: [32, 64, 128, 256]
+ dim = 64
+ reduced_embs = torch.nn.functional.normalize(full_embs[:, :dim], p=2, dim=-1)
+ # These reduced embeddings are trained to be effective at lower dims
+ ```
+
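To make the storage side of that tradeoff concrete, the sketch below truncates and re-normalizes a vector in plain Python and estimates index size at each Matryoshka dimension. It assumes float32 storage (4 bytes per value) and ignores index overhead; both helpers are illustrative names, not part of the repository.

```python
import math

MATRYOSHKA_DIMS = [32, 64, 128, 256]
BYTES_PER_FLOAT32 = 4  # assumption: embeddings are stored as float32


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` values and rescale them to unit L2 norm.

    Assumes the truncated prefix is not all zeros.
    """
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]


def index_size_mib(num_vectors, dim):
    """Raw storage for `num_vectors` embeddings of width `dim`, in MiB."""
    return num_vectors * dim * BYTES_PER_FLOAT32 / 2**20


if __name__ == "__main__":
    for dim in MATRYOSHKA_DIMS:
        size = index_size_mib(1_000_000, dim)
        print(f"{dim:>3}d: {size:8.1f} MiB per million vectors")
```

Going from 256 to 32 dimensions cuts raw storage by a factor of eight; how much accuracy is lost at each dimension is not reported in this README.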
+ ### Loading with safetensors
+
+ ```python
+ import json
+
+ from safetensors.torch import load_file
+
+ from config import OgmaConfig
+ from ogma_model import OgmaModel
+
+ # Load config
+ with open("path/to/ogma-mini/config.json") as f:
+     config_dict = json.load(f)
+
+ config = OgmaConfig.from_dict(config_dict)
+ model = OgmaModel(config)
+
+ # Load weights from safetensors
+ state_dict = load_file("path/to/ogma-mini/model.safetensors")
+ model.load_state_dict(state_dict)
+ model.eval()
+ ```
+
+ ## Task Tokens
+
+ Ogma uses task-specific prefix tokens for asymmetric encoding:
+
+ | Token | ID | Use Case |
+ |-------|-----|----------|
+ | `[QRY]` | 4 | Query encoding for retrieval |
+ | `[DOC]` | 5 | Document/passage encoding for retrieval |
+ | `[SYM]` | 6 | Symmetric tasks (STS, classification, clustering) |
+
+ For retrieval tasks, encode queries with `[QRY]` and documents with `[DOC]`. For all other tasks, use `[SYM]`.
+
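As an illustration of what a prefix token means at the token-ID level, the sketch below prepends the task ID from the table to an already tokenized sequence. This is a guess at the mechanism, not the repository's implementation: it assumes the task token occupies position 0 and counts toward the 1024-token limit, and `prepend_task_token` is a hypothetical helper. In practice `model.encode(..., task=...)` presumably handles this internally.

```python
# Task token IDs as listed in the table above.
TASK_TOKEN_IDS = {"QRY": 4, "DOC": 5, "SYM": 6}


def prepend_task_token(token_ids, task, max_length=1024):
    """Put the task token first, then truncate to `max_length` tokens.

    Assumption: the task token counts toward the sequence-length limit.
    """
    if task not in TASK_TOKEN_IDS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TASK_TOKEN_IDS)}")
    return ([TASK_TOKEN_IDS[task]] + list(token_ids))[:max_length]


if __name__ == "__main__":
    print(prepend_task_token([10, 11, 12], "QRY"))
    print(prepend_task_token([10, 11, 12], "DOC"))
```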
+ ## Training
+
+ Ogma is trained via **knowledge distillation** from a larger teacher embedding model. The training pipeline:
+
+ 1. **Tokenizer**: SentencePiece Unigram model trained on the distillation corpus (30K vocab)
+ 2. **Token embeddings**: PCA-reduced embeddings from the teacher model, providing a strong initialization
+ 3. **Distillation**: MSE loss between student and teacher embeddings, with Matryoshka loss at multiple dimensions
+ 4. **Architecture**: Standard transformer encoder with RoPE positional encoding and SwiGLU FFN
+
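One plausible way to write the distillation objective from step 3 is a sum of per-dimension MSE terms over the Matryoshka prefixes. The README does not give the exact formula, so the form below is an assumption: the symbols $s$ (student embedding), $t$ (teacher embedding), equal weighting across dimensions, and truncating the teacher to the same prefix are all illustrative choices.

```latex
\mathcal{L}
  = \sum_{d \in \mathcal{D}} \frac{1}{d}
    \left\lVert s_{1:d} - t_{1:d} \right\rVert_2^2,
\qquad
\mathcal{D} = \{32, 64, 128, 256\}
```

Here $s_{1:d}$ denotes the first $d$ components of the student output, so each term asks the corresponding prefix to match the teacher on its own. The actual training code may weight the terms differently, normalize the vectors first, or project the teacher rather than truncate it.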
+ ## Files
+
+ | File | Description |
+ |------|-------------|
+ | `model.safetensors` | Model weights (safetensors format) |
+ | `model.pt` | Model weights (PyTorch format) |
+ | `config.json` | Model configuration |
+ | `config.yaml` | Original training config |
+ | `tokenizer.json` | HuggingFace tokenizer |
+ | `tokenizer_config.json` | Tokenizer configuration |
+ | `token_embeds_128d.npy` | Pre-computed token embeddings (30K × 128, float16) |
+ | `ogma_model.py` | OgmaModel class |
+ | `config.py` | OgmaConfig dataclass |
+ | `embeddings.py` | Token embedding + RoPE |
+ | `pooling.py` | Pooling strategies |
+ | `variants/transformer.py` | Transformer encoder variant |
+ | `tokenizer.py` | OgmaTokenizer wrapper |
+ | `results/` | MTEB result JSONs |
+
+ ## Citation
+
+ ```bibtex
+ @misc{ogma2026,
+   title={Ogma: Small High-Performance Text Embeddings},
+   author={Axiotic AI},
+   year={2026},
+   url={https://huggingface.co/axiotic/ogma-mini}
+ }
+ ```
+
+ ## License
+
+ MIT
config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "architectures": [
+     "OgmaModel"
+   ],
+   "model_type": "ogma",
+   "auto_map": {
+     "AutoModel": "ogma_model.OgmaModel"
+   },
+   "variant": "transformer",
+   "d_embed": 64,
+   "d_model": 256,
+   "d_output": 256,
+   "n_layers": 2,
+   "n_heads": 4,
+   "vocab_size": 30000,
+   "max_seq_len": 1024,
+   "matryoshka_dims": [
+     32,
+     64,
+     128,
+     256
+   ],
+   "pooling": "mean",
+   "ffn_mult": 2.6666666666666665,
+   "conv_kernel_size": 7,
+   "spatial_rank": 32,
+   "n_random_features": 128,
+   "dropout": 0.0,
+   "pad_id": 0,
+   "unk_id": 1,
+   "bos_id": 2,
+   "eos_id": 3,
+   "qry_id": 4,
+   "doc_id": 5,
+   "sym_id": 6,
+   "n_special_tokens": 7
+ }
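A few of the values in this config constrain each other, and a quick consistency check can catch a bad edit before a model is built from it. The sketch below is an illustration only: `check_config` and the specific rules it applies are assumptions about what a sensible config looks like, not checks taken from the Ogma code. The embedded JSON is a subset of the file above.

```python
import json

# Subset of config.json relevant to the checks below.
CONFIG_JSON = """
{
  "d_embed": 64, "d_model": 256, "d_output": 256,
  "n_layers": 2, "n_heads": 4, "max_seq_len": 1024,
  "matryoshka_dims": [32, 64, 128, 256],
  "qry_id": 4, "doc_id": 5, "sym_id": 6, "n_special_tokens": 7
}
"""


def check_config(cfg):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    if cfg["d_model"] % cfg["n_heads"] != 0:
        problems.append("d_model must be divisible by n_heads")
    dims = cfg["matryoshka_dims"]
    if dims != sorted(dims):
        problems.append("matryoshka_dims should be in increasing order")
    if max(dims) > cfg["d_output"]:
        problems.append("largest Matryoshka dim exceeds d_output")
    task_ids = [cfg["qry_id"], cfg["doc_id"], cfg["sym_id"]]
    if len(set(task_ids)) != 3 or max(task_ids) >= cfg["n_special_tokens"]:
        problems.append("task token IDs must be distinct special tokens")
    return problems


if __name__ == "__main__":
    config = json.loads(CONFIG_JSON)
    print(check_config(config) or "config looks consistent")
    print("head dim:", config["d_model"] // config["n_heads"])
```

For the shipped values, each of the 4 attention heads works on 64 dimensions and the largest Matryoshka dimension equals `d_output`.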
config.py ADDED
@@ -0,0 +1,161 @@
+ """Model configuration for Ogma."""
+
+ from __future__ import annotations
+
+ from dataclasses import dataclass, field
+ from enum import StrEnum
+ from typing import Any
+
+ __all__ = ["OgmaConfig", "VariantType", "PoolingType", "TaskToken"]
+
+
+ class VariantType(StrEnum):
+     """Architecture variant identifiers."""
+
+     TRANSFORMER = "transformer"
+     DEEP_NARROW = "deep_narrow"
+     CONV = "conv"
+     LINEAR_ATTENTION = "linear_attention"
+     MLP_MIXER = "mlp_mixer"
+     TRANSFORMER_RESA = "transformer_resa"
+     GLA = "gla"
+
+
+ class PoolingType(StrEnum):
+     """Pooling strategy identifiers."""
+
+     TASK_TOKEN = "task_token"
+     LATENT_ATTENTION = "latent_attention"
+     MEAN = "mean"
+
+
+ class TaskToken(StrEnum):
+     """Task token identifiers for asymmetric encoding."""
+
+     QRY = "QRY"
+     DOC = "DOC"
+     SYM = "SYM"
+
+
+ @dataclass
+ class OgmaConfig:
+     """Configuration for an Ogma model instance.
+
+     Args:
+         variant: Architecture variant to use.
+         d_embed: Token embedding dimension (from teacher PCA).
+         d_model: Internal model dimension after projection.
+         n_layers: Number of fusion layers/blocks.
+         n_heads: Number of attention heads (attention variants only).
+         vocab_size: Vocabulary size for embedding table.
+         max_seq_len: Maximum sequence length.
+         matryoshka_dims: Nested output dimensions for Matryoshka.
+         pooling: Pooling strategy.
+         d_output: Final output dimension.
+         ffn_mult: SwiGLU FFN hidden dimension multiplier.
+         conv_kernel_size: Kernel size for conv variant.
+         spatial_rank: Rank of spatial mixing in MLP mixer.
+         n_random_features: Random features for linear attention.
+         dropout: Dropout rate (0 for inference).
+     """
+
+     variant: VariantType = VariantType.TRANSFORMER
+     d_embed: int = 128
+     d_model: int = 256
+     n_layers: int = 1
+     n_heads: int = 4
+     vocab_size: int = 30_000
+     max_seq_len: int = 512
+     matryoshka_dims: list[int] = field(
+         default_factory=lambda: [32, 64, 128, 256]
+     )
+     pooling: PoolingType = PoolingType.TASK_TOKEN
+     d_output: int = 256
+     ffn_mult: float = 8 / 3  # SwiGLU: 8/3 * d_model ≈ 682.7 for d=256 (ffn_hidden truncates to 682)
+     conv_kernel_size: int = 7
+     spatial_rank: int = 32
+     n_random_features: int = 128
+     dropout: float = 0.0
+
+     # ReSA scorer settings
+     scorer_type: str = "dot"
+     scorer_alpha_init: float = 0.1
+     scorer_hidden: int = 0  # 0 defaults to d_head
+
+     # GLA (Gated Linear Attention) settings
+     gla_expand_k: float = 0.5  # key dim expansion (key_dim = d_model * expand_k)
+     gla_expand_v: float = 1.0  # value dim expansion (value_dim = d_model * expand_v)
+     gla_gate_low_rank_dim: int = 16  # low-rank dim for gating projection
+     gla_gate_logit_normalizer: int = 16  # normalizer for gate logits
+     gla_use_short_conv: bool = True  # whether to use short conv on Q,K,V
+     gla_conv_size: int = 4  # short conv kernel size
+
+     # Special token IDs
+     pad_id: int = 0
+     unk_id: int = 1
+     bos_id: int = 2
+     eos_id: int = 3
+     qry_id: int = 4
+     doc_id: int = 5
+     sym_id: int = 6
+     n_special_tokens: int = 7
+
+     @property
+     def d_head(self) -> int:
+         """Per-head dimension."""
+         return self.d_model // self.n_heads
+
+     @property
+     def ffn_hidden(self) -> int:
+         """SwiGLU FFN hidden dimension."""
+         return int(self.d_model * self.ffn_mult)
+
+     def task_token_id(self, task: TaskToken) -> int:
+         """Return token ID for a task token."""
+         mapping = {
+             TaskToken.QRY: self.qry_id,
+             TaskToken.DOC: self.doc_id,
+             TaskToken.SYM: self.sym_id,
+         }
+         return mapping[task]
+
+     def to_dict(self) -> dict[str, Any]:
+         """Serialize config to dictionary."""
+         return {
+             "variant": self.variant.value,
+             "d_embed": self.d_embed,
+             "d_model": self.d_model,
+             "n_layers": self.n_layers,
+             "n_heads": self.n_heads,
+             "vocab_size": self.vocab_size,
+             "max_seq_len": self.max_seq_len,
+             "matryoshka_dims": self.matryoshka_dims,
+             "pooling": self.pooling.value,
+             "d_output": self.d_output,
+             "ffn_mult": self.ffn_mult,
+             "conv_kernel_size": self.conv_kernel_size,
+             "spatial_rank": self.spatial_rank,
+             "n_random_features": self.n_random_features,
+             "dropout": self.dropout,
+             "scorer_type": self.scorer_type,
+             "scorer_alpha_init": self.scorer_alpha_init,
+             "scorer_hidden": self.scorer_hidden,
+             "gla_expand_k": self.gla_expand_k,
+             "gla_expand_v": self.gla_expand_v,
+             "gla_gate_low_rank_dim": self.gla_gate_low_rank_dim,
+             "gla_gate_logit_normalizer": self.gla_gate_logit_normalizer,
+             "gla_use_short_conv": self.gla_use_short_conv,
+             "gla_conv_size": self.gla_conv_size,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict[str, Any]) -> OgmaConfig:
+         """Deserialize config from dictionary."""
+         data = dict(data)
+         if "variant" in data:
+             data["variant"] = VariantType(data["variant"])
+         if "pooling" in data:
+             data["pooling"] = PoolingType(data["pooling"])
+         known = {f.name for f in cls.__dataclass_fields__.values()}
+         filtered = {k: v for k, v in data.items() if k in known}
+         return cls(**filtered)
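Reviewer note: `from_dict` silently drops keys that are not dataclass fields, which is what lets a `config.json` carrying extra entries such as `model_type` or `auto_map` load without error. The sketch below illustrates that behaviour with a hypothetical, trimmed-down `MiniConfig` and `Pooling` enum (both are stand-ins invented for this example, not part of the repository). It uses `str, Enum` instead of `StrEnum` only so that it also runs on Python versions before 3.11.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from enum import Enum
from typing import Any


class Pooling(str, Enum):
    """Hypothetical stand-in for PoolingType."""

    TASK_TOKEN = "task_token"
    MEAN = "mean"


@dataclass
class MiniConfig:
    """Hypothetical, trimmed-down stand-in for OgmaConfig."""

    d_model: int = 256
    n_layers: int = 1
    pooling: Pooling = Pooling.TASK_TOKEN
    ffn_mult: float = 8 / 3

    @property
    def ffn_hidden(self) -> int:
        # int() truncates toward zero, so 682.67 becomes 682.
        return int(self.d_model * self.ffn_mult)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> MiniConfig:
        data = dict(data)  # copy so the caller's dict is not mutated
        if "pooling" in data:
            data["pooling"] = Pooling(data["pooling"])  # coerce string to enum
        known = {f.name for f in fields(cls)}
        # Unknown keys (e.g. "model_type") are dropped rather than rejected.
        return cls(**{k: v for k, v in data.items() if k in known})


raw = {"d_model": 256, "n_layers": 2, "pooling": "mean", "model_type": "ogma", "pad_id": 0}
cfg = MiniConfig.from_dict(raw)
print(cfg.n_layers, cfg.pooling.value, cfg.ffn_hidden)
```

One consequence of this design is that a misspelled key is ignored instead of raising, so the default value is used without warning.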
config.yaml ADDED
@@ -0,0 +1,19 @@
+ conv_kernel_size: 7
+ d_embed: 64
+ d_model: 256
+ d_output: 256
+ dropout: 0.0
+ ffn_mult: 2.6666666666666665
+ matryoshka_dims:
+ - 32
+ - 64
+ - 128
+ - 256
+ max_seq_len: 1024
+ n_heads: 4
+ n_layers: 2
+ n_random_features: 128
+ pooling: mean
+ spatial_rank: 32
+ variant: transformer
+ vocab_size: 30000
embeddings.py ADDED
@@ -0,0 +1,143 @@
+ """Token embeddings, task token embeddings, and RoPE for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+
+ from ogma.model.config import OgmaConfig
+
+ __all__ = ["TokenEmbedding", "RotaryPositionalEncoding"]
+
+
+ class TokenEmbedding(nn.Module):
+     """Token embedding with optional linear projection.
+
+     Loads a vocab_size x d_embed embedding table and projects to d_model.
+     Includes 3 learnable task token embeddings ([QRY], [DOC], [SYM]).
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embed = nn.Embedding(
+             config.vocab_size + config.n_special_tokens,
+             config.d_embed,
+             padding_idx=config.pad_id,
+         )
+         if config.d_embed != config.d_model:
+             self.proj = nn.Linear(config.d_embed, config.d_model)
+         else:
+             self.proj = nn.Identity()  # type: ignore[assignment]
+
+         # Task token embeddings are learned separately at d_model
+         self.task_tokens = nn.Embedding(3, config.d_model)
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Embed tokens and prepend task token.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, S+1, d_model) embeddings with task token prepended.
+         """
+         # Embed and project regular tokens
+         x = self.embed(token_ids)  # (B, S, d_embed)
+         x = self.proj(x)  # (B, S, d_model)
+
+         # Get task token embeddings (map 4,5,6 -> 0,1,2)
+         task_idx = task_token_ids - self.config.qry_id  # (B,)
+         task_emb = self.task_tokens(task_idx)  # (B, d_model)
+         task_emb = task_emb.unsqueeze(1)  # (B, 1, d_model)
+
+         # Prepend task token
+         return torch.cat([task_emb, x], dim=1)  # (B, S+1, d_model)
+
+     def load_pretrained_embeddings(
+         self, embeddings: torch.Tensor
+     ) -> None:
+         """Load pre-computed token embeddings (e.g., from teacher PCA).
+
+         Args:
+             embeddings: (vocab_size, d_embed) tensor.
+         """
+         with torch.no_grad():
+             n = min(embeddings.shape[0], self.config.vocab_size)
+             start = self.config.n_special_tokens
+             self.embed.weight[start : n + start] = embeddings[:n]
+
+
+ class RotaryPositionalEncoding(nn.Module):
+     """Rotary Position Embedding (RoPE). Zero trainable parameters."""
+
+     def __init__(self, dim: int, max_seq_len: int = 512) -> None:
+         super().__init__()
+         inv_freq = 1.0 / (
+             10000.0 ** (torch.arange(0, dim, 2).float() / dim)
+         )
+         self.register_buffer("inv_freq", inv_freq)
+         self._build_cache(max_seq_len)
+
+     def _build_cache(self, seq_len: int) -> None:
+         inv_freq: torch.Tensor = self.inv_freq  # type: ignore[assignment]
+         t = torch.arange(seq_len, dtype=inv_freq.dtype, device=inv_freq.device)
+         freqs = torch.outer(t, inv_freq)
+         cos_cached = freqs.cos()
+         sin_cached = freqs.sin()
+         self.register_buffer("cos_cached", cos_cached, persistent=False)
+         self.register_buffer("sin_cached", sin_cached, persistent=False)
+
+     def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+         """Return cos and sin for sequence length of x.
+
+         Args:
+             x: (B, S, ...) tensor to determine sequence length.
+
+         Returns:
+             Tuple of (cos, sin) each of shape (S, d_head//2).
+         """
+         seq_len = x.shape[1]
+         cos: torch.Tensor = self.cos_cached  # type: ignore[assignment]
+         sin: torch.Tensor = self.sin_cached  # type: ignore[assignment]
+         if seq_len > cos.shape[0]:
+             self._build_cache(seq_len)
+             cos = self.cos_cached  # type: ignore[assignment]
+             sin = self.sin_cached  # type: ignore[assignment]
+         return cos[:seq_len], sin[:seq_len]
+
+
+ def apply_rope(
+     q: torch.Tensor,
+     k: torch.Tensor,
+     cos: torch.Tensor,
+     sin: torch.Tensor,
+ ) -> tuple[torch.Tensor, torch.Tensor]:
+     """Apply rotary embeddings to query and key tensors.
+
+     Args:
+         q: (B, n_heads, S, d_head) query tensor.
+         k: (B, n_heads, S, d_head) key tensor.
+         cos: (S, d_head//2) cosine cache.
+         sin: (S, d_head//2) sine cache.
+
+     Returns:
+         Rotated (q, k) tensors.
+     """
+
+     def _rotate(x: torch.Tensor) -> torch.Tensor:
+         x1 = x[..., : x.shape[-1] // 2]
+         x2 = x[..., x.shape[-1] // 2 :]
+         cos_exp = cos.unsqueeze(0).unsqueeze(0)  # (1, 1, S, d_head//2)
+         sin_exp = sin.unsqueeze(0).unsqueeze(0)
+         return torch.cat(
+             [x1 * cos_exp - x2 * sin_exp, x2 * cos_exp + x1 * sin_exp],
+             dim=-1,
+         )
+
+     return _rotate(q), _rotate(k)
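Reviewer note: the half-split rotation in `apply_rope` pairs element `i` of the first half with element `i` of the second half and rotates each pair by `pos * 10000^(-2i/dim)`. The plain-Python sketch below (no torch; the helper names `rope_angles`, `rotate`, and `dot` are made up for this illustration) checks the two properties that make RoPE useful: the rotation preserves vector norm, and the query-key dot product depends only on the relative offset between positions.

```python
import math


def rope_angles(pos: int, dim: int, base: float = 10000.0) -> list[float]:
    # One angle per pair, matching inv_freq = base ** (-(2 * i) / dim).
    return [pos * base ** (-(2 * i) / dim) for i in range(dim // 2)]


def rotate(x: list[float], pos: int) -> list[float]:
    # Half-split layout: x1 is the first half, x2 the second half.
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    angles = rope_angles(pos, len(x))
    first = [a * math.cos(t) - b * math.sin(t) for a, b, t in zip(x1, x2, angles)]
    second = [b * math.cos(t) + a * math.sin(t) for a, b, t in zip(x1, x2, angles)]
    return first + second


def dot(a: list[float], b: list[float]) -> float:
    return sum(u * v for u, v in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.1]

# Same relative offset (2) at two different absolute positions.
score_near = dot(rotate(q, 3), rotate(k, 1))
score_far = dot(rotate(q, 10), rotate(k, 8))
print(score_near, score_far)
```

The two printed scores agree up to floating-point rounding, which is the relative-position property attention relies on.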
model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:50760494a72d662b0bdd20b52d7f8ad949b2a1fae2947cdf8ccf2979c1d70a84
+ size 14056272
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:70cf039002c744bba94f62ad8acb6625b6060b9bced2d947bc4e09027c483b5f
+ size 14049984
ogma_model.py ADDED
@@ -0,0 +1,203 @@
+ """OgmaModel: top-level model wrapping any architecture variant."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, TaskToken, VariantType
+ from ogma.model.embeddings import TokenEmbedding
+ from ogma.model.pooling import create_pooling
+ from ogma.model.variants.conv import ConvVariant
+ from ogma.model.variants.deep_narrow import DeepNarrowVariant
+ from ogma.model.variants.linear_attention import LinearAttentionVariant
+ from ogma.model.variants.mlp_mixer import MLPMixerVariant
+ from ogma.model.variants.transformer import TransformerVariant
+ from ogma.model.variants.transformer_resa import TransformerReSAVariant
+ from ogma.model.variants.gla import GLAVariant
+
+ __all__ = ["OgmaModel"]
+
+ MAX_PARAMS = 10_000_000
+
+
+ def _build_variant(config: OgmaConfig) -> nn.Module:
+     """Instantiate the appropriate architecture variant."""
+     if config.variant == VariantType.TRANSFORMER:
+         return TransformerVariant(config)
+     elif config.variant == VariantType.DEEP_NARROW:
+         return DeepNarrowVariant(config)
+     elif config.variant == VariantType.CONV:
+         return ConvVariant(config)
+     elif config.variant == VariantType.LINEAR_ATTENTION:
+         return LinearAttentionVariant(config)
+     elif config.variant == VariantType.MLP_MIXER:
+         return MLPMixerVariant(config)
+     elif config.variant == VariantType.TRANSFORMER_RESA:
+         return TransformerReSAVariant(config)
+     elif config.variant == VariantType.GLA:
+         return GLAVariant(config)
+     raise ValueError(f"Unknown variant: {config.variant}")
+
+
+ class OgmaModel(nn.Module):
+     """Ogma embedding model.
+
+     Wraps any architecture variant with shared embedding, pooling, and
+     normalization. Produces L2-normalized embeddings at d_output dimensions,
+     Matryoshka-compatible at configured sub-dimensions.
+     """
+
+     def __init__(self, config: OgmaConfig) -> None:
+         super().__init__()
+         self.config = config
+         self.embedding = TokenEmbedding(config)
+         self.variant = _build_variant(config)
+         self.pooling = create_pooling(config)
+
+         # Output projection if variant output != d_output
+         needs_proj = (
+             config.variant == VariantType.DEEP_NARROW
+             and config.d_model != config.d_output
+         )
+         # DeepNarrowVariant already has output_proj, so no extra needed here
+         if not needs_proj and config.d_model != config.d_output:
+             self.output_proj: nn.Module = nn.Linear(
+                 config.d_model, config.d_output
+             )
+         else:
+             self.output_proj = nn.Identity()
+
+     def forward(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task_token_ids: torch.Tensor,
+     ) -> torch.Tensor:
+         """Forward pass producing L2-normalized embeddings.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask (1=valid, 0=pad).
+             task_token_ids: (B,) task token IDs (4=QRY, 5=DOC, 6=SYM).
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         # Embed tokens with task token prepended -> (B, S+1, d_model)
+         x = self.embedding(token_ids, task_token_ids)
+
+         # Extend attention mask for prepended task token
+         task_mask = torch.ones(
+             attention_mask.shape[0], 1,
+             device=attention_mask.device,
+             dtype=attention_mask.dtype,
+         )
+         extended_mask = torch.cat([task_mask, attention_mask], dim=1)
+
+         # Run through variant
+         x = self.variant(x, extended_mask)
+
+         # Pool
+         x = self.pooling(x, extended_mask)
+
+         # Project if needed
+         x = self.output_proj(x)
+
+         # L2 normalize
+         return F.normalize(x, p=2, dim=-1)
+
+     def encode(
+         self,
+         token_ids: torch.Tensor,
+         attention_mask: torch.Tensor,
+         task: TaskToken = TaskToken.SYM,
+     ) -> torch.Tensor:
+         """Encode tokens with a specified task mode.
+
+         Args:
+             token_ids: (B, S) token IDs.
+             attention_mask: (B, S) attention mask.
+             task: Task token to use.
+
+         Returns:
+             (B, d_output) L2-normalized embeddings.
+         """
+         task_ids = torch.full(
+             (token_ids.shape[0],),
+             self.config.task_token_id(task),
+             device=token_ids.device,
+             dtype=torch.long,
+         )
+         return self.forward(token_ids, attention_mask, task_ids)
+
+     def param_count(self) -> int:
+         """Count total trainable parameters."""
+         return sum(p.numel() for p in self.parameters() if p.requires_grad)
+
+     def assert_param_budget(self) -> None:
+         """Assert model is under the 10M parameter budget."""
+         count = self.param_count()
+         assert count < MAX_PARAMS, (
+             f"Model has {count:,} params, exceeds {MAX_PARAMS:,} budget"
+         )
+
+     @classmethod
+     def from_config(cls, config: OgmaConfig) -> OgmaModel:
+         """Factory method to build a model from config."""
+         model = cls(config)
+         model.assert_param_budget()
+         return model
+
+     @classmethod
+     def from_checkpoint(
+         cls,
+         path: str,
+         device: str = "cpu",
+     ) -> OgmaModel:
+         """Load model from a checkpoint directory.
+
+         Args:
+             path: Path to checkpoint directory containing config.yaml
+                 and model.pt.
+             device: Device to load model to.
+
+         Returns:
+             Loaded OgmaModel.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         with open(ckpt_path / "config.yaml") as f:
+             config_dict = yaml.safe_load(f)
+         config = OgmaConfig.from_dict(config_dict)
+
+         model = cls(config)
+         state_dict = torch.load(
+             ckpt_path / "model.pt",
+             map_location=device,
+             weights_only=True,
+         )
+         model.load_state_dict(state_dict)
+         model.to(device)
+         model.eval()
+         return model
+
+     def save_checkpoint(self, path: str) -> None:
+         """Save model checkpoint.
+
+         Args:
+             path: Directory to save config.yaml and model.pt.
+         """
+         from pathlib import Path
+
+         import yaml
+
+         ckpt_path = Path(path)
+         ckpt_path.mkdir(parents=True, exist_ok=True)
+         with open(ckpt_path / "config.yaml", "w") as f:
+             yaml.dump(self.config.to_dict(), f, default_flow_style=False)
+         torch.save(self.state_dict(), ckpt_path / "model.pt")
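Reviewer note: the class docstring says the output is Matryoshka-compatible at the configured sub-dimensions (32, 64, 128, 256), but this file exposes no helper for using a shorter embedding. A common way to consume such embeddings is to keep the leading `dim` components and L2-normalize again. The sketch below assumes that convention; `truncate_embedding` is a hypothetical helper, not something this commit provides, and the input vector is synthetic.

```python
import math

# Sub-dimensions listed under matryoshka_dims in config.json.
MATRYOSHKA_DIMS = (32, 64, 128, 256)


def truncate_embedding(vec: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and re-normalize to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}, got {dim}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = vec[:dim]
    norm = math.sqrt(sum(v * v for v in head))
    if norm == 0.0:
        return head  # nothing to normalize
    return [v / norm for v in head]


# Synthetic unit-length 256-d vector standing in for a model output.
raw = [math.sin(i + 1) for i in range(256)]
raw_norm = math.sqrt(sum(v * v for v in raw))
full = [v / raw_norm for v in raw]

small = truncate_embedding(full, 64)
print(len(small), math.sqrt(sum(v * v for v in small)))
```

Re-normalizing after truncation keeps dot products interpretable as cosine similarity at the smaller size.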
pooling.py ADDED
@@ -0,0 +1,99 @@
+ """Pooling strategies for Ogma."""
+
+ from __future__ import annotations
+
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+
+ from ogma.model.config import OgmaConfig, PoolingType
+
+ __all__ = [
+     "create_pooling",
+     "TaskTokenPooling",
+     "LatentAttentionPooling",
+     "MeanPooling",
+ ]
+
+
+ def create_pooling(config: OgmaConfig) -> nn.Module:
+     """Factory for pooling layers."""
+     if config.pooling == PoolingType.TASK_TOKEN:
+         return TaskTokenPooling()
+     elif config.pooling == PoolingType.LATENT_ATTENTION:
+         return LatentAttentionPooling(config.d_model)
+     elif config.pooling == PoolingType.MEAN:
+         return MeanPooling()
+     raise ValueError(f"Unknown pooling type: {config.pooling}")
+
+
+ class TaskTokenPooling(nn.Module):
+     """Use the output at position 0 (task token) as the sentence embedding."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Extract task token output.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: unused, for interface compatibility.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         return x[:, 0, :]
+
+
+ class LatentAttentionPooling(nn.Module):
+     """Learned query vector attends over all token outputs."""
+
+     def __init__(self, d_model: int) -> None:
+         super().__init__()
+         self.query = nn.Parameter(torch.randn(d_model))
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Attend over sequence with learned query.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         # (B, S)
+         scores = torch.matmul(x, self.query) / (x.shape[-1] ** 0.5)
+         if attention_mask is not None:
+             scores = scores.masked_fill(attention_mask == 0, float("-inf"))
+         weights = F.softmax(scores, dim=-1)  # (B, S)
+         return torch.bmm(weights.unsqueeze(1), x).squeeze(1)  # (B, D)
+
+
+ class MeanPooling(nn.Module):
+     """Average all token outputs (excluding padding)."""
+
+     def forward(
+         self,
+         x: torch.Tensor,
+         attention_mask: torch.Tensor | None = None,
+     ) -> torch.Tensor:
+         """Mean pool over valid tokens.
+
+         Args:
+             x: (B, S, D) sequence outputs.
+             attention_mask: (B, S) mask where 1=valid, 0=pad.
+
+         Returns:
+             (B, D) pooled output.
+         """
+         if attention_mask is None:
+             return x.mean(dim=1)
+         mask = attention_mask.unsqueeze(-1).float()  # (B, S, 1)
+         return (x * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
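Reviewer note: `MeanPooling` multiplies each token vector by its mask value, sums over the sequence, and divides by the number of valid tokens, with the denominator clamped at `1e-9` so an all-padding row yields zeros instead of a division by zero. The plain-Python sketch below mirrors that arithmetic on nested lists; `masked_mean_pool` is an illustrative name, not a function from this repository.

```python
def masked_mean_pool(
    x: list[list[list[float]]],
    mask: list[list[int]],
    eps: float = 1e-9,
) -> list[list[float]]:
    """Mean over valid tokens. x is (B, S, D); mask is (B, S) with 1=valid, 0=pad."""
    pooled = []
    for seq, seq_mask in zip(x, mask):
        dim = len(seq[0])
        total = [0.0] * dim
        for vec, m in zip(seq, seq_mask):
            for j in range(dim):
                total[j] += vec[j] * m  # padded positions contribute 0
        denom = max(float(sum(seq_mask)), eps)  # same role as clamp(min=1e-9)
        pooled.append([t / denom for t in total])
    return pooled


batch = [[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]]  # third token is padding
result = masked_mean_pool(batch, [[1, 1, 0]])
all_pad = masked_mean_pool(batch, [[0, 0, 0]])
print(result, all_pad)
```

Note that in `OgmaModel.forward` the mask passed to pooling is the extended mask, so with `pooling: mean` the prepended task token is averaged in along with the regular tokens.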
results/AmazonCounterfactualClassification.json ADDED
@@ -0,0 +1,526 @@
+ {
+   "dataset_revision": "1f7e6a9d6fa6e64c53d146e428565640410c0df1",
+   "task_name": "AmazonCounterfactualClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "validation": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.660661,
+             "f1": 0.539131,
+             "f1_weighted": 0.728177,
+             "precision": 0.573326,
+             "precision_weighted": 0.878998,
+             "recall": 0.692049,
+             "recall_weighted": 0.660661,
+             "ap": 0.166466,
+             "ap_weighted": 0.166466
+           },
+           {
+             "accuracy": 0.645646,
+             "f1": 0.520741,
+             "f1_weighted": 0.71618,
+             "precision": 0.559846,
+             "precision_weighted": 0.868262,
+             "recall": 0.65719,
+             "recall_weighted": 0.645646,
+             "ap": 0.149728,
+             "ap_weighted": 0.149728
+           },
+           {
+             "accuracy": 0.668168,
+             "f1": 0.557558,
+             "f1_weighted": 0.734269,
+             "precision": 0.591513,
+             "precision_weighted": 0.895845,
+             "recall": 0.742618,
+             "recall_weighted": 0.668168,
+             "ap": 0.192479,
+             "ap_weighted": 0.192479
+           },
+           {
+             "accuracy": 0.611111,
+             "f1": 0.509608,
+             "f1_weighted": 0.687825,
+             "precision": 0.567568,
+             "precision_weighted": 0.88061,
+             "recall": 0.684387,
+             "recall_weighted": 0.611111,
+             "ap": 0.158868,
+             "ap_weighted": 0.158868
+           },
+           {
+             "accuracy": 0.678679,
+             "f1": 0.554128,
+             "f1_weighted": 0.74237,
+             "precision": 0.580928,
+             "precision_weighted": 0.883273,
+             "recall": 0.708694,
+             "recall_weighted": 0.678679,
+             "ap": 0.176592,
+             "ap_weighted": 0.176592
+           },
+           {
+             "accuracy": 0.60961,
+             "f1": 0.498029,
+             "f1_weighted": 0.687077,
+             "precision": 0.553143,
+             "precision_weighted": 0.866063,
+             "recall": 0.643784,
+             "recall_weighted": 0.60961,
+             "ap": 0.142346,
+             "ap_weighted": 0.142346
+           },
+           {
+             "accuracy": 0.593093,
+             "f1": 0.490376,
+             "f1_weighted": 0.673137,
+             "precision": 0.554044,
+             "precision_weighted": 0.868805,
+             "recall": 0.647858,
+             "recall_weighted": 0.593093,
+             "ap": 0.143155,
+             "ap_weighted": 0.143155
+           },
+           {
+             "accuracy": 0.65015,
+             "f1": 0.525848,
+             "f1_weighted": 0.719774,
+             "precision": 0.563412,
+             "precision_weighted": 0.871043,
+             "recall": 0.666322,
+             "recall_weighted": 0.65015,
+             "ap": 0.153943,
+             "ap_weighted": 0.153943
+           },
+           {
+             "accuracy": 0.668168,
+             "f1": 0.555774,
+             "f1_weighted": 0.734263,
+             "precision": 0.589241,
+             "precision_weighted": 0.89351,
+             "recall": 0.73599,
+             "recall_weighted": 0.668168,
+             "ap": 0.189038,
+             "ap_weighted": 0.189038
+           },
+           {
+             "accuracy": 0.638138,
+             "f1": 0.517518,
+             "f1_weighted": 0.710221,
+             "precision": 0.560216,
+             "precision_weighted": 0.869578,
+             "recall": 0.659644,
+             "recall_weighted": 0.638138,
+             "ap": 0.150261,
+             "ap_weighted": 0.150261
+           }
+         ],
+         "accuracy": 0.642342,
+         "f1": 0.526871,
+         "f1_weighted": 0.713329,
+         "precision": 0.569324,
+         "precision_weighted": 0.877599,
+         "recall": 0.683854,
+         "recall_weighted": 0.642342,
+         "ap": 0.162288,
+         "ap_weighted": 0.162288,
+         "main_score": 0.642342,
+         "hf_subset": "en-ext",
+         "languages": [
+           "eng-Latn"
+         ]
+       },
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.570149,
+             "f1": 0.519254,
+             "f1_weighted": 0.621512,
+             "precision": 0.571126,
+             "precision_weighted": 0.784947,
+             "recall": 0.624206,
+             "recall_weighted": 0.570149,
+             "ap": 0.223263,
+             "ap_weighted": 0.223263
+           },
+           {
+             "accuracy": 0.641791,
+             "f1": 0.577127,
+             "f1_weighted": 0.685229,
+             "precision": 0.601508,
+             "precision_weighted": 0.806672,
+             "recall": 0.674343,
+             "recall_weighted": 0.641791,
+             "ap": 0.256075,
+             "ap_weighted": 0.256075
+           },
+           {
+             "accuracy": 0.58209,
+             "f1": 0.511946,
+             "f1_weighted": 0.632902,
+             "precision": 0.548468,
+             "precision_weighted": 0.75884,
+             "recall": 0.583717,
+             "recall_weighted": 0.58209,
+             "ap": 0.204515,
+             "ap_weighted": 0.204515
+           },
+           {
+             "accuracy": 0.638806,
+             "f1": 0.57242,
+             "f1_weighted": 0.68256,
+             "precision": 0.596642,
+             "precision_weighted": 0.801838,
+             "recall": 0.665723,
+             "recall_weighted": 0.638806,
+             "ap": 0.250627,
+             "ap_weighted": 0.250627
+           },
+           {
+             "accuracy": 0.626866,
+             "f1": 0.567626,
+             "f1_weighted": 0.672251,
+             "precision": 0.599314,
+             "precision_weighted": 0.807376,
+             "recall": 0.672134,
+             "recall_weighted": 0.626866,
+             "ap": 0.253138,
+             "ap_weighted": 0.253138
+           },
+           {
+             "accuracy": 0.662687,
+             "f1": 0.556364,
+             "f1_weighted": 0.698343,
+             "precision": 0.563003,
+             "precision_weighted": 0.763284,
+             "recall": 0.598375,
+             "recall_weighted": 0.662687,
+             "ap": 0.214886,
+             "ap_weighted": 0.214886
+           },
+           {
+             "accuracy": 0.653731,
+             "f1": 0.571157,
+             "f1_weighted": 0.694176,
+             "precision": 0.584803,
+             "precision_weighted": 0.786074,
+             "recall": 0.64067,
+             "recall_weighted": 0.653731,
+             "ap": 0.237555,
+             "ap_weighted": 0.237555
+           },
+           {
+             "accuracy": 0.683582,
+             "f1": 0.591576,
+             "f1_weighted": 0.718301,
+             "precision": 0.59531,
+             "precision_weighted": 0.790579,
+             "recall": 0.651905,
+             "recall_weighted": 0.683582,
+             "ap": 0.247646,
+             "ap_weighted": 0.247646
+           },
+           {
+             "accuracy": 0.647761,
+             "f1": 0.579468,
+             "f1_weighted": 0.690255,
+             "precision": 0.600325,
+             "precision_weighted": 0.804022,
+             "recall": 0.671138,
+             "recall_weighted": 0.647761,
+             "ap": 0.25485,
+             "ap_weighted": 0.25485
+           },
+           {
+             "accuracy": 0.620896,
+             "f1": 0.532005,
+             "f1_weighted": 0.665341,
+             "precision": 0.552144,
+             "precision_weighted": 0.758398,
+             "recall": 0.586736,
+             "recall_weighted": 0.620896,
+             "ap": 0.207078,
+             "ap_weighted": 0.207078
+           }
+         ],
+         "accuracy": 0.632836,
+         "f1": 0.557894,
+         "f1_weighted": 0.676087,
+         "precision": 0.581264,
+         "precision_weighted": 0.786203,
+         "recall": 0.636895,
+         "recall_weighted": 0.632836,
+         "ap": 0.234963,
+         "ap_weighted": 0.234963,
+         "main_score": 0.632836,
+         "hf_subset": "en",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ],
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.696402,
+             "f1": 0.574988,
+             "f1_weighted": 0.75481,
+             "precision": 0.593841,
+             "precision_weighted": 0.886556,
+             "recall": 0.732004,
+             "recall_weighted": 0.696402,
+             "ap": 0.197333,
+             "ap_weighted": 0.197333
+           },
+           {
+             "accuracy": 0.666417,
+             "f1": 0.54214,
+             "f1_weighted": 0.730969,
+             "precision": 0.57182,
+             "precision_weighted": 0.870904,
+             "recall": 0.680302,
+             "recall_weighted": 0.666417,
+             "ap": 0.166866,
+             "ap_weighted": 0.166866
+           },
+           {
+             "accuracy": 0.670915,
+             "f1": 0.557661,
+             "f1_weighted": 0.73484,
+             "precision": 0.587956,
+             "precision_weighted": 0.88593,
+             "recall": 0.724135,
+             "recall_weighted": 0.670915,
+             "ap": 0.189144,
+             "ap_weighted": 0.189144
+           },
+           {
+             "accuracy": 0.598951,
+             "f1": 0.498513,
+             "f1_weighted": 0.676171,
+             "precision": 0.558633,
+             "precision_weighted": 0.86715,
+             "recall": 0.65536,
+             "recall_weighted": 0.598951,
+             "ap": 0.151209,
+             "ap_weighted": 0.151209
+           },
+           {
+             "accuracy": 0.682159,
+             "f1": 0.562336,
+             "f1_weighted": 0.743615,
+             "precision": 0.586927,
+             "precision_weighted": 0.882707,
+             "recall": 0.717697,
+             "recall_weighted": 0.682159,
+             "ap": 0.187383,
+             "ap_weighted": 0.187383
+           },
+           {
+             "accuracy": 0.632684,
+             "f1": 0.526379,
+             "f1_weighted": 0.704002,
+             "precision": 0.572738,
+             "precision_weighted": 0.877078,
+             "recall": 0.690082,
+             "recall_weighted": 0.632684,
+             "ap": 0.168316,
+             "ap_weighted": 0.168316
+           },
+           {
+             "accuracy": 0.642429,
+             "f1": 0.533331,
+             "f1_weighted": 0.711947,
+             "precision": 0.575318,
+             "precision_weighted": 0.878178,
+             "recall": 0.695521,
+             "recall_weighted": 0.642429,
+             "ap": 0.17171,
+             "ap_weighted": 0.17171
+           },
+           {
+             "accuracy": 0.681409,
+             "f1": 0.558913,
+             "f1_weighted": 0.742919,
+             "precision": 0.583297,
+             "precision_weighted": 0.879316,
+             "recall": 0.707742,
+             "recall_weighted": 0.681409,
+             "ap": 0.182116,
+             "ap_weighted": 0.182116
+           },
+           {
+             "accuracy": 0.708396,
+             "f1": 0.590859,
+             "f1_weighted": 0.764452,
+             "precision": 0.60574,
+             "precision_weighted": 0.895591,
+             "recall": 0.760949,
+             "recall_weighted": 0.708396,
+             "ap": 0.216207,
+             "ap_weighted": 0.216207
+           },
+           {
+             "accuracy": 0.68066,
+             "f1": 0.560272,
+             "f1_weighted": 0.742406,
+             "precision": 0.585323,
+             "precision_weighted": 0.881427,
+             "recall": 0.713681,
+             "recall_weighted": 0.68066,
+             "ap": 0.185078,
+             "ap_weighted": 0.185078
+           }
+         ],
+         "accuracy": 0.666042,
+         "f1": 0.550539,
+         "f1_weighted": 0.730613,
+         "precision": 0.582159,
+         "precision_weighted": 0.880484,
+         "recall": 0.707747,
+         "recall_weighted": 0.666042,
+         "ap": 0.181536,
+         "ap_weighted": 0.181536,
+         "main_score": 0.666042,
+         "hf_subset": "en-ext",
+         "languages": [
+           "eng-Latn"
+         ]
+       },
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.616418,
+             "f1": 0.571465,
+             "f1_weighted": 0.655985,
+             "precision": 0.60476,
+             "precision_weighted": 0.787698,
+             "recall": 0.666254,
+             "recall_weighted": 0.616418,
+             "ap": 0.276934,
+             "ap_weighted": 0.276934
+           },
+           {
+             "accuracy": 0.689552,
+             "f1": 0.629362,
+             "f1_weighted": 0.720316,
+             "precision": 0.634798,
+             "precision_weighted": 0.804039,
+             "recall": 0.70593,
+             "recall_weighted": 0.689552,
+             "ap": 0.313767,
+             "ap_weighted": 0.313767
+           },
+           {
+             "accuracy": 0.643284,
+             "f1": 0.588222,
+             "f1_weighted": 0.679916,
+             "precision": 0.607631,
+             "precision_weighted": 0.785023,
+             "recall": 0.668505,
+             "recall_weighted": 0.643284,
+             "ap": 0.281284,
+             "ap_weighted": 0.281284
+           },
+           {
+             "accuracy": 0.662687,
+             "f1": 0.609134,
+             "f1_weighted": 0.697237,
+             "precision": 0.624849,
+             "precision_weighted": 0.800277,
+             "recall": 0.695011,
+             "recall_weighted": 0.662687,
+             "ap": 0.301188,
+             "ap_weighted": 0.301188
+           },
+           {
+             "accuracy": 0.646269,
+             "f1": 0.596647,
+             "f1_weighted": 0.682799,
+             "precision": 0.619113,
+             "precision_weighted": 0.797814,
+             "recall": 0.687696,
+             "recall_weighted": 0.646269,
+             "ap": 0.293869,
+             "ap_weighted": 0.293869
+           },
+           {
+             "accuracy": 0.695522,
+             "f1": 0.631093,
+             "f1_weighted": 0.724976,
+             "precision": 0.633308,
+             "precision_weighted": 0.800305,
+             "recall": 0.700973,
+             "recall_weighted": 0.695522,
+             "ap": 0.311631,
+             "ap_weighted": 0.311631
+           },
+           {
+             "accuracy": 0.720896,
+             "f1": 0.637743,
+             "f1_weighted": 0.743432,
+             "precision": 0.63047,
+             "precision_weighted": 0.788135,
+             "recall": 0.682073,
+             "recall_weighted": 0.720896,
+             "ap": 0.30437,
+             "ap_weighted": 0.30437
+           },
+           {
+             "accuracy": 0.698507,
+             "f1": 0.624156,
+             "f1_weighted": 0.725953,
+             "precision": 0.622735,
+             "precision_weighted": 0.787177,
+             "recall": 0.679715,
+             "recall_weighted": 0.698507,
+             "ap": 0.297506,
+             "ap_weighted": 0.297506
+           },
+           {
+             "accuracy": 0.610448,
+             "f1": 0.55892,
+             "f1_weighted": 0.650725,
+             "precision": 0.588217,
+             "precision_weighted": 0.769621,
+             "recall": 0.63943,
+             "recall_weighted": 0.610448,
+             "ap": 0.260652,
+             "ap_weighted": 0.260652
+           },
+           {
+             "accuracy": 0.608955,
+             "f1": 0.551393,
+             "f1_weighted": 0.649249,
+             "precision": 0.577126,
+             "precision_weighted": 0.757492,
+             "recall": 0.621167,
501
+ "recall_weighted": 0.608955,
502
+ "ap": 0.250292,
503
+ "ap_weighted": 0.250292
504
+ }
505
+ ],
506
+ "accuracy": 0.659254,
507
+ "f1": 0.599813,
508
+ "f1_weighted": 0.693059,
509
+ "precision": 0.614301,
510
+ "precision_weighted": 0.787758,
511
+ "recall": 0.674675,
512
+ "recall_weighted": 0.659254,
513
+ "ap": 0.289149,
514
+ "ap_weighted": 0.289149,
515
+ "main_score": 0.659254,
516
+ "hf_subset": "en",
517
+ "languages": [
518
+ "eng-Latn"
519
+ ]
520
+ }
521
+ ]
522
+ },
523
+ "evaluation_time": 32.70637917518616,
524
+ "kg_co2_emissions": null,
525
+ "date": null
526
+ }
results/AmazonPolarityClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "e2d317d38cd51312af73b3d32a06d1a08b442046",
3
+ "task_name": "AmazonPolarityClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.763077,
11
+ "f1": 0.762679,
12
+ "f1_weighted": 0.762679,
13
+ "precision": 0.764854,
14
+ "precision_weighted": 0.764854,
15
+ "recall": 0.763078,
16
+ "recall_weighted": 0.763077,
17
+ "ap": 0.706923,
18
+ "ap_weighted": 0.706923
19
+ },
20
+ {
21
+ "accuracy": 0.64422,
22
+ "f1": 0.643354,
23
+ "f1_weighted": 0.643354,
24
+ "precision": 0.645634,
25
+ "precision_weighted": 0.645634,
26
+ "recall": 0.64422,
27
+ "recall_weighted": 0.64422,
28
+ "ap": 0.591044,
29
+ "ap_weighted": 0.591044
30
+ },
31
+ {
32
+ "accuracy": 0.731215,
33
+ "f1": 0.730616,
34
+ "f1_weighted": 0.730616,
35
+ "precision": 0.733289,
36
+ "precision_weighted": 0.733289,
37
+ "recall": 0.731215,
38
+ "recall_weighted": 0.731215,
39
+ "ap": 0.664462,
40
+ "ap_weighted": 0.664462
41
+ },
42
+ {
43
+ "accuracy": 0.68851,
44
+ "f1": 0.677693,
45
+ "f1_weighted": 0.677693,
46
+ "precision": 0.71774,
47
+ "precision_weighted": 0.71774,
48
+ "recall": 0.68851,
49
+ "recall_weighted": 0.68851,
50
+ "ap": 0.65034,
51
+ "ap_weighted": 0.65034
52
+ },
53
+ {
54
+ "accuracy": 0.752705,
55
+ "f1": 0.75231,
56
+ "f1_weighted": 0.75231,
57
+ "precision": 0.754326,
58
+ "precision_weighted": 0.754326,
59
+ "recall": 0.752705,
60
+ "recall_weighted": 0.752705,
61
+ "ap": 0.695753,
62
+ "ap_weighted": 0.695753
63
+ },
64
+ {
65
+ "accuracy": 0.668285,
66
+ "f1": 0.667893,
67
+ "f1_weighted": 0.667893,
68
+ "precision": 0.669083,
69
+ "precision_weighted": 0.669083,
70
+ "recall": 0.668285,
71
+ "recall_weighted": 0.668285,
72
+ "ap": 0.614551,
73
+ "ap_weighted": 0.614551
74
+ },
75
+ {
76
+ "accuracy": 0.69632,
77
+ "f1": 0.695857,
78
+ "f1_weighted": 0.695857,
79
+ "precision": 0.697522,
80
+ "precision_weighted": 0.697522,
81
+ "recall": 0.69632,
82
+ "recall_weighted": 0.69632,
83
+ "ap": 0.633913,
84
+ "ap_weighted": 0.633913
85
+ },
86
+ {
87
+ "accuracy": 0.743217,
88
+ "f1": 0.739118,
89
+ "f1_weighted": 0.739118,
90
+ "precision": 0.75953,
91
+ "precision_weighted": 0.75953,
92
+ "recall": 0.743217,
93
+ "recall_weighted": 0.743217,
94
+ "ap": 0.700556,
95
+ "ap_weighted": 0.700556
96
+ },
97
+ {
98
+ "accuracy": 0.6574,
99
+ "f1": 0.652781,
100
+ "f1_weighted": 0.652781,
101
+ "precision": 0.666247,
102
+ "precision_weighted": 0.666247,
103
+ "recall": 0.6574,
104
+ "recall_weighted": 0.6574,
105
+ "ap": 0.598831,
106
+ "ap_weighted": 0.598831
107
+ },
108
+ {
109
+ "accuracy": 0.699477,
110
+ "f1": 0.699372,
111
+ "f1_weighted": 0.699372,
112
+ "precision": 0.699757,
113
+ "precision_weighted": 0.699757,
114
+ "recall": 0.699477,
115
+ "recall_weighted": 0.699477,
116
+ "ap": 0.638095,
117
+ "ap_weighted": 0.638095
118
+ }
119
+ ],
120
+ "accuracy": 0.704443,
121
+ "f1": 0.702168,
122
+ "f1_weighted": 0.702168,
123
+ "precision": 0.710798,
124
+ "precision_weighted": 0.710798,
125
+ "recall": 0.704443,
126
+ "recall_weighted": 0.704443,
127
+ "ap": 0.649447,
128
+ "ap_weighted": 0.649447,
129
+ "main_score": 0.704443,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 5498.067716121674,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/AmazonReviewsClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "6b5d328eaae8ef408dd7d775040245cf86f92e9d",
3
+ "task_name": "AmazonReviewsClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.3768,
11
+ "f1": 0.366297,
12
+ "f1_weighted": 0.366297,
13
+ "precision": 0.37297,
14
+ "precision_weighted": 0.37297,
15
+ "recall": 0.3768,
16
+ "recall_weighted": 0.3768,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.3854,
22
+ "f1": 0.366427,
23
+ "f1_weighted": 0.366427,
24
+ "precision": 0.369938,
25
+ "precision_weighted": 0.369938,
26
+ "recall": 0.3854,
27
+ "recall_weighted": 0.3854,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.3892,
33
+ "f1": 0.383804,
34
+ "f1_weighted": 0.383804,
35
+ "precision": 0.382414,
36
+ "precision_weighted": 0.382414,
37
+ "recall": 0.3892,
38
+ "recall_weighted": 0.3892,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.3474,
44
+ "f1": 0.340145,
45
+ "f1_weighted": 0.340145,
46
+ "precision": 0.335727,
47
+ "precision_weighted": 0.335727,
48
+ "recall": 0.3474,
49
+ "recall_weighted": 0.3474,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.3868,
55
+ "f1": 0.380865,
56
+ "f1_weighted": 0.380865,
57
+ "precision": 0.393436,
58
+ "precision_weighted": 0.393436,
59
+ "recall": 0.3868,
60
+ "recall_weighted": 0.3868,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.35,
66
+ "f1": 0.33815,
67
+ "f1_weighted": 0.33815,
68
+ "precision": 0.351434,
69
+ "precision_weighted": 0.351434,
70
+ "recall": 0.35,
71
+ "recall_weighted": 0.35,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.3516,
77
+ "f1": 0.347711,
78
+ "f1_weighted": 0.347711,
79
+ "precision": 0.349063,
80
+ "precision_weighted": 0.349063,
81
+ "recall": 0.3516,
82
+ "recall_weighted": 0.3516,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.4044,
88
+ "f1": 0.39977,
89
+ "f1_weighted": 0.39977,
90
+ "precision": 0.400172,
91
+ "precision_weighted": 0.400172,
92
+ "recall": 0.4044,
93
+ "recall_weighted": 0.4044,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.3684,
99
+ "f1": 0.357162,
100
+ "f1_weighted": 0.357162,
101
+ "precision": 0.366983,
102
+ "precision_weighted": 0.366983,
103
+ "recall": 0.3684,
104
+ "recall_weighted": 0.3684,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.3504,
110
+ "f1": 0.342682,
111
+ "f1_weighted": 0.342682,
112
+ "precision": 0.345879,
113
+ "precision_weighted": 0.345879,
114
+ "recall": 0.3504,
115
+ "recall_weighted": 0.3504,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.37104,
121
+ "f1": 0.362301,
122
+ "f1_weighted": 0.362301,
123
+ "precision": 0.366802,
124
+ "precision_weighted": 0.366802,
125
+ "recall": 0.37104,
126
+ "recall_weighted": 0.37104,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.37104,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.3848,
141
+ "f1": 0.371192,
142
+ "f1_weighted": 0.371192,
143
+ "precision": 0.374745,
144
+ "precision_weighted": 0.374745,
145
+ "recall": 0.3848,
146
+ "recall_weighted": 0.3848,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.3922,
152
+ "f1": 0.373858,
153
+ "f1_weighted": 0.373858,
154
+ "precision": 0.378077,
155
+ "precision_weighted": 0.378077,
156
+ "recall": 0.3922,
157
+ "recall_weighted": 0.3922,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.3836,
163
+ "f1": 0.377258,
164
+ "f1_weighted": 0.377258,
165
+ "precision": 0.375583,
166
+ "precision_weighted": 0.375583,
167
+ "recall": 0.3836,
168
+ "recall_weighted": 0.3836,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.363,
174
+ "f1": 0.356521,
175
+ "f1_weighted": 0.356521,
176
+ "precision": 0.352648,
177
+ "precision_weighted": 0.352648,
178
+ "recall": 0.363,
179
+ "recall_weighted": 0.363,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.397,
185
+ "f1": 0.389545,
186
+ "f1_weighted": 0.389545,
187
+ "precision": 0.400318,
188
+ "precision_weighted": 0.400318,
189
+ "recall": 0.397,
190
+ "recall_weighted": 0.397,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.3486,
196
+ "f1": 0.333875,
197
+ "f1_weighted": 0.333875,
198
+ "precision": 0.349267,
199
+ "precision_weighted": 0.349267,
200
+ "recall": 0.3486,
201
+ "recall_weighted": 0.3486,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.3398,
207
+ "f1": 0.336773,
208
+ "f1_weighted": 0.336773,
209
+ "precision": 0.337682,
210
+ "precision_weighted": 0.337682,
211
+ "recall": 0.3398,
212
+ "recall_weighted": 0.3398,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.4048,
218
+ "f1": 0.399556,
219
+ "f1_weighted": 0.399556,
220
+ "precision": 0.399706,
221
+ "precision_weighted": 0.399706,
222
+ "recall": 0.4048,
223
+ "recall_weighted": 0.4048,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.365,
229
+ "f1": 0.353462,
230
+ "f1_weighted": 0.353462,
231
+ "precision": 0.366623,
232
+ "precision_weighted": 0.366623,
233
+ "recall": 0.365,
234
+ "recall_weighted": 0.365,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.3546,
240
+ "f1": 0.346895,
241
+ "f1_weighted": 0.346895,
242
+ "precision": 0.350111,
243
+ "precision_weighted": 0.350111,
244
+ "recall": 0.3546,
245
+ "recall_weighted": 0.3546,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.37334,
251
+ "f1": 0.363894,
252
+ "f1_weighted": 0.363894,
253
+ "precision": 0.368476,
254
+ "precision_weighted": 0.368476,
255
+ "recall": 0.37334,
256
+ "recall_weighted": 0.37334,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.37334,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 159.81794357299805,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/ArXivHierarchicalClusteringP2P.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "dataset_revision": "0bbdb47bcbe3a90093699aefeed338a0f28a7ee8",
3
+ "task_name": "ArXivHierarchicalClusteringP2P",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measures": {
9
+ "Level 0": [
10
+ 0.512613,
11
+ 0.540229,
12
+ 0.510996,
13
+ 0.527727,
14
+ 0.479675,
15
+ 0.526184,
16
+ 0.489894,
17
+ 0.509987,
18
+ 0.53079,
19
+ 0.539218
20
+ ],
21
+ "Level 1": [
22
+ 0.565661,
23
+ 0.583839,
24
+ 0.541221,
25
+ 0.561374,
26
+ 0.597025,
27
+ 0.559242,
28
+ 0.588538,
29
+ 0.558389,
30
+ 0.564837,
31
+ 0.580209
32
+ ]
33
+ },
34
+ "v_measure": 0.543382,
35
+ "v_measure_std": 0.03196,
36
+ "main_score": 0.543382,
37
+ "hf_subset": "default",
38
+ "languages": [
39
+ "eng-Latn"
40
+ ]
41
+ }
42
+ ]
43
+ },
44
+ "evaluation_time": 2.419024705886841,
45
+ "kg_co2_emissions": null,
46
+ "date": null
47
+ }
results/ArXivHierarchicalClusteringS2S.json ADDED
@@ -0,0 +1,47 @@
1
+ {
2
+ "dataset_revision": "b73bd54100e5abfa6e3a23dcafb46fe4d2438dc3",
3
+ "task_name": "ArXivHierarchicalClusteringS2S",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measures": {
9
+ "Level 0": [
10
+ 0.468766,
11
+ 0.451474,
12
+ 0.469421,
13
+ 0.464106,
14
+ 0.472935,
15
+ 0.429447,
16
+ 0.418673,
17
+ 0.412622,
18
+ 0.415958,
19
+ 0.420996
20
+ ],
21
+ "Level 1": [
22
+ 0.544882,
23
+ 0.557055,
24
+ 0.558996,
25
+ 0.526886,
26
+ 0.595451,
27
+ 0.561838,
28
+ 0.528565,
29
+ 0.563742,
30
+ 0.534648,
31
+ 0.579583
32
+ ]
33
+ },
34
+ "v_measure": 0.498802,
35
+ "v_measure_std": 0.060667,
36
+ "main_score": 0.498802,
37
+ "hf_subset": "default",
38
+ "languages": [
39
+ "eng-Latn"
40
+ ]
41
+ }
42
+ ]
43
+ },
44
+ "evaluation_time": 2.2445287704467773,
45
+ "kg_co2_emissions": null,
46
+ "date": null
47
+ }
results/ArguAna.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c22ab2a51041ffd869aaddef7af8d8215647e41a",
3
+ "task_name": "ArguAna",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.17354,
9
+ "ndcg_at_3": 0.28785,
10
+ "ndcg_at_5": 0.34553,
11
+ "ndcg_at_10": 0.40718,
12
+ "ndcg_at_20": 0.44012,
13
+ "ndcg_at_100": 0.46442,
14
+ "ndcg_at_1000": 0.46886,
15
+ "map_at_1": 0.17354,
16
+ "map_at_3": 0.25889,
17
+ "map_at_5": 0.29072,
18
+ "map_at_10": 0.31617,
19
+ "map_at_20": 0.32532,
20
+ "map_at_100": 0.32888,
21
+ "map_at_1000": 0.32906,
22
+ "recall_at_1": 0.17354,
23
+ "recall_at_3": 0.37198,
24
+ "recall_at_5": 0.5128,
25
+ "recall_at_10": 0.70341,
26
+ "recall_at_20": 0.83286,
27
+ "recall_at_100": 0.96088,
28
+ "recall_at_1000": 0.99502,
29
+ "accuracy": 0.17354,
30
+ "precision_at_1": 0.17354,
31
+ "precision_at_3": 0.12399,
32
+ "precision_at_5": 0.10256,
33
+ "precision_at_10": 0.07034,
34
+ "precision_at_20": 0.04164,
35
+ "precision_at_100": 0.00961,
36
+ "precision_at_1000": 0.001,
37
+ "mrr_at_1": 0.179232,
38
+ "mrr_at_3": 0.260313,
39
+ "mrr_at_5": 0.29271,
40
+ "mrr_at_10": 0.318369,
41
+ "mrr_at_20": 0.327558,
42
+ "mrr_at_100": 0.331084,
43
+ "mrr_at_1000": 0.331265,
44
+ "nauc_ndcg_at_1_max": -0.013551,
45
+ "nauc_ndcg_at_1_std": -0.073633,
46
+ "nauc_ndcg_at_1_diff1": 0.169343,
47
+ "nauc_ndcg_at_3_max": 0.01689,
48
+ "nauc_ndcg_at_3_std": -0.073368,
49
+ "nauc_ndcg_at_3_diff1": 0.116024,
50
+ "nauc_ndcg_at_5_max": 0.047505,
51
+ "nauc_ndcg_at_5_std": -0.047867,
52
+ "nauc_ndcg_at_5_diff1": 0.114828,
53
+ "nauc_ndcg_at_10_max": 0.037439,
54
+ "nauc_ndcg_at_10_std": -0.062884,
55
+ "nauc_ndcg_at_10_diff1": 0.120355,
56
+ "nauc_ndcg_at_20_max": 0.053057,
57
+ "nauc_ndcg_at_20_std": -0.044995,
58
+ "nauc_ndcg_at_20_diff1": 0.121774,
59
+ "nauc_ndcg_at_100_max": 0.046061,
60
+ "nauc_ndcg_at_100_std": -0.03742,
61
+ "nauc_ndcg_at_100_diff1": 0.127361,
62
+ "nauc_ndcg_at_1000_max": 0.037884,
63
+ "nauc_ndcg_at_1000_std": -0.049334,
64
+ "nauc_ndcg_at_1000_diff1": 0.127548,
65
+ "nauc_map_at_1_max": -0.013551,
66
+ "nauc_map_at_1_std": -0.073633,
67
+ "nauc_map_at_1_diff1": 0.169343,
68
+ "nauc_map_at_3_max": 0.011268,
69
+ "nauc_map_at_3_std": -0.071715,
70
+ "nauc_map_at_3_diff1": 0.12683,
71
+ "nauc_map_at_5_max": 0.028663,
72
+ "nauc_map_at_5_std": -0.057268,
73
+ "nauc_map_at_5_diff1": 0.126096,
74
+ "nauc_map_at_10_max": 0.023778,
75
+ "nauc_map_at_10_std": -0.062768,
76
+ "nauc_map_at_10_diff1": 0.128495,
77
+ "nauc_map_at_20_max": 0.027804,
78
+ "nauc_map_at_20_std": -0.058239,
79
+ "nauc_map_at_20_diff1": 0.129442,
80
+ "nauc_map_at_100_max": 0.026659,
81
+ "nauc_map_at_100_std": -0.057574,
82
+ "nauc_map_at_100_diff1": 0.130433,
83
+ "nauc_map_at_1000_max": 0.026377,
84
+ "nauc_map_at_1000_std": -0.057975,
85
+ "nauc_map_at_1000_diff1": 0.130476,
86
+ "nauc_recall_at_1_max": -0.013551,
87
+ "nauc_recall_at_1_std": -0.073633,
88
+ "nauc_recall_at_1_diff1": 0.169343,
89
+ "nauc_recall_at_3_max": 0.030515,
90
+ "nauc_recall_at_3_std": -0.078037,
91
+ "nauc_recall_at_3_diff1": 0.089463,
92
+ "nauc_recall_at_5_max": 0.099199,
93
+ "nauc_recall_at_5_std": -0.021047,
94
+ "nauc_recall_at_5_diff1": 0.086376,
95
+ "nauc_recall_at_10_max": 0.082779,
96
+ "nauc_recall_at_10_std": -0.069152,
97
+ "nauc_recall_at_10_diff1": 0.097125,
98
+ "nauc_recall_at_20_max": 0.20685,
99
+ "nauc_recall_at_20_std": 0.043333,
100
+ "nauc_recall_at_20_diff1": 0.086727,
101
+ "nauc_recall_at_100_max": 0.445572,
102
+ "nauc_recall_at_100_std": 0.565729,
103
+ "nauc_recall_at_100_diff1": 0.134238,
104
+ "nauc_recall_at_1000_max": 0.195658,
105
+ "nauc_recall_at_1000_std": 0.343883,
106
+ "nauc_recall_at_1000_diff1": 0.159035,
107
+ "nauc_precision_at_1_max": -0.013551,
108
+ "nauc_precision_at_1_std": -0.073633,
109
+ "nauc_precision_at_1_diff1": 0.169343,
110
+ "nauc_precision_at_3_max": 0.030515,
111
+ "nauc_precision_at_3_std": -0.078037,
112
+ "nauc_precision_at_3_diff1": 0.089463,
113
+ "nauc_precision_at_5_max": 0.099199,
114
+ "nauc_precision_at_5_std": -0.021047,
115
+ "nauc_precision_at_5_diff1": 0.086376,
116
+ "nauc_precision_at_10_max": 0.082779,
117
+ "nauc_precision_at_10_std": -0.069152,
118
+ "nauc_precision_at_10_diff1": 0.097125,
119
+ "nauc_precision_at_20_max": 0.20685,
120
+ "nauc_precision_at_20_std": 0.043333,
121
+ "nauc_precision_at_20_diff1": 0.086727,
122
+ "nauc_precision_at_100_max": 0.445572,
123
+ "nauc_precision_at_100_std": 0.565729,
124
+ "nauc_precision_at_100_diff1": 0.134238,
125
+ "nauc_precision_at_1000_max": 0.195658,
126
+ "nauc_precision_at_1000_std": 0.343883,
127
+ "nauc_precision_at_1000_diff1": 0.159035,
128
+ "nauc_mrr_at_1_max": -0.008785,
129
+ "nauc_mrr_at_1_std": -0.074216,
130
+ "nauc_mrr_at_1_diff1": 0.143714,
131
+ "nauc_mrr_at_3_max": 0.003046,
132
+ "nauc_mrr_at_3_std": -0.072639,
133
+ "nauc_mrr_at_3_diff1": 0.106587,
134
+ "nauc_mrr_at_5_max": 0.021046,
135
+ "nauc_mrr_at_5_std": -0.058279,
136
+ "nauc_mrr_at_5_diff1": 0.105512,
137
+ "nauc_mrr_at_10_max": 0.018121,
138
+ "nauc_mrr_at_10_std": -0.063311,
139
+ "nauc_mrr_at_10_diff1": 0.109908,
140
+ "nauc_mrr_at_20_max": 0.021849,
141
+ "nauc_mrr_at_20_std": -0.058931,
142
+ "nauc_mrr_at_20_diff1": 0.110445,
143
+ "nauc_mrr_at_100_max": 0.020649,
144
+ "nauc_mrr_at_100_std": -0.058184,
145
+ "nauc_mrr_at_100_diff1": 0.111046,
146
+ "nauc_mrr_at_1000_max": 0.020366,
147
+ "nauc_mrr_at_1000_std": -0.058582,
148
+ "nauc_mrr_at_1000_diff1": 0.111075,
149
+ "hit_rate_at_1": 0.17354,
150
+ "hit_rate_at_3": 0.37198,
151
+ "hit_rate_at_5": 0.5128,
152
+ "hit_rate_at_10": 0.70341,
153
+ "hit_rate_at_20": 0.83286,
154
+ "hit_rate_at_100": 0.96088,
155
+ "hit_rate_at_1000": 0.99502,
156
+ "main_score": 0.40718,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 14.623368501663208,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/AskUbuntuDupQuestions.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c5691e3c48741d5f83b5cc8e630653d7a8cfc048",
3
+ "task_name": "AskUbuntuDupQuestions",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.48199,
9
+ "ndcg_at_3": 0.48315,
10
+ "ndcg_at_5": 0.49927,
11
+ "ndcg_at_10": 0.57062,
12
+ "ndcg_at_20": 0.69815,
13
+ "ndcg_at_100": 0.69815,
14
+ "ndcg_at_1000": 0.69815,
15
+ "map_at_1": 0.11388,
16
+ "map_at_3": 0.23449,
17
+ "map_at_5": 0.29979,
18
+ "map_at_10": 0.40644,
19
+ "map_at_20": 0.52129,
20
+ "map_at_100": 0.52129,
21
+ "map_at_1000": 0.52129,
22
+ "recall_at_1": 0.11388,
23
+ "recall_at_3": 0.29423,
24
+ "recall_at_5": 0.42158,
25
+ "recall_at_10": 0.67167,
26
+ "recall_at_20": 1.0,
27
+ "recall_at_100": 1.0,
28
+ "recall_at_1000": 1.0,
29
+ "accuracy": 0.11388,
30
+ "precision_at_1": 0.48199,
31
+ "precision_at_3": 0.43767,
32
+ "precision_at_5": 0.40111,
33
+ "precision_at_10": 0.34432,
34
+ "precision_at_20": 0.27355,
35
+ "precision_at_100": 0.05471,
36
+ "precision_at_1000": 0.00547,
37
+ "mrr_at_1": 0.481994,
38
+ "mrr_at_3": 0.5988,
39
+ "mrr_at_5": 0.625808,
40
+ "mrr_at_10": 0.637189,
41
+ "mrr_at_20": 0.640963,
42
+ "mrr_at_100": 0.640963,
43
+ "mrr_at_1000": 0.640963,
44
+ "nauc_ndcg_at_1_max": 0.011189,
45
+ "nauc_ndcg_at_1_std": 0.113973,
46
+ "nauc_ndcg_at_1_diff1": 0.031937,
47
+ "nauc_ndcg_at_3_max": 0.011681,
48
+ "nauc_ndcg_at_3_std": 0.082992,
49
+ "nauc_ndcg_at_3_diff1": 0.0351,
50
+ "nauc_ndcg_at_5_max": -0.025624,
51
+ "nauc_ndcg_at_5_std": 0.08281,
52
+ "nauc_ndcg_at_5_diff1": 0.043169,
53
+ "nauc_ndcg_at_10_max": -0.017931,
54
+ "nauc_ndcg_at_10_std": 0.096634,
55
+ "nauc_ndcg_at_10_diff1": 0.038631,
56
+ "nauc_ndcg_at_20_max": 0.051651,
57
+ "nauc_ndcg_at_20_std": 0.110323,
58
+ "nauc_ndcg_at_20_diff1": 0.034889,
59
+ "nauc_ndcg_at_100_max": 0.051651,
60
+ "nauc_ndcg_at_100_std": 0.110323,
61
+ "nauc_ndcg_at_100_diff1": 0.034889,
62
+ "nauc_ndcg_at_1000_max": 0.051651,
63
+ "nauc_ndcg_at_1000_std": 0.110323,
64
+ "nauc_ndcg_at_1000_diff1": 0.034889,
65
+ "nauc_map_at_1_max": -0.117219,
66
+ "nauc_map_at_1_std": 0.031894,
67
+ "nauc_map_at_1_diff1": 0.130679,
68
+ "nauc_map_at_3_max": -0.099675,
69
+ "nauc_map_at_3_std": 0.063959,
70
+ "nauc_map_at_3_diff1": 0.12248,
71
+ "nauc_map_at_5_max": -0.108276,
72
+ "nauc_map_at_5_std": 0.07835,
73
+ "nauc_map_at_5_diff1": 0.099291,
74
+ "nauc_map_at_10_max": -0.050403,
75
+ "nauc_map_at_10_std": 0.096678,
76
+ "nauc_map_at_10_diff1": 0.058421,
77
+ "nauc_map_at_20_max": 0.021846,
78
+ "nauc_map_at_20_std": 0.100847,
79
+ "nauc_map_at_20_diff1": 0.036513,
80
+ "nauc_map_at_100_max": 0.021846,
81
+ "nauc_map_at_100_std": 0.100847,
82
+ "nauc_map_at_100_diff1": 0.036513,
83
+ "nauc_map_at_1000_max": 0.021846,
84
+ "nauc_map_at_1000_std": 0.100847,
85
+ "nauc_map_at_1000_diff1": 0.036513,
86
+ "nauc_recall_at_1_max": -0.117219,
87
+ "nauc_recall_at_1_std": 0.031894,
88
+ "nauc_recall_at_1_diff1": 0.130679,
89
+ "nauc_recall_at_3_max": -0.095026,
90
+ "nauc_recall_at_3_std": 0.015458,
91
+ "nauc_recall_at_3_diff1": 0.10148,
92
+ "nauc_recall_at_5_max": -0.142827,
93
+ "nauc_recall_at_5_std": 0.016437,
94
+ "nauc_recall_at_5_diff1": 0.084308,
95
+ "nauc_recall_at_10_max": -0.131389,
96
+ "nauc_recall_at_10_std": 0.021357,
97
+ "nauc_recall_at_10_diff1": 0.044322,
98
+ "nauc_recall_at_20_max": NaN,
99
+ "nauc_recall_at_20_std": NaN,
100
+ "nauc_recall_at_20_diff1": NaN,
101
+ "nauc_recall_at_100_max": NaN,
102
+ "nauc_recall_at_100_std": NaN,
103
+ "nauc_recall_at_100_diff1": NaN,
104
+ "nauc_recall_at_1000_max": NaN,
105
+ "nauc_recall_at_1000_std": NaN,
106
+ "nauc_recall_at_1000_diff1": NaN,
107
+ "nauc_precision_at_1_max": 0.011189,
108
+ "nauc_precision_at_1_std": 0.113973,
109
+ "nauc_precision_at_1_diff1": 0.031937,
110
+ "nauc_precision_at_3_max": 0.045653,
111
+ "nauc_precision_at_3_std": 0.099416,
112
+ "nauc_precision_at_3_diff1": -0.010982,
113
+ "nauc_precision_at_5_max": 0.059473,
114
+ "nauc_precision_at_5_std": 0.112497,
115
+ "nauc_precision_at_5_diff1": -0.043335,
116
+ "nauc_precision_at_10_max": 0.14866,
117
+ "nauc_precision_at_10_std": 0.079641,
118
+ "nauc_precision_at_10_diff1": -0.070542,
119
+ "nauc_precision_at_20_max": 0.172396,
120
+ "nauc_precision_at_20_std": 0.061313,
121
+ "nauc_precision_at_20_diff1": -0.066816,
122
+ "nauc_precision_at_100_max": 0.172396,
123
+ "nauc_precision_at_100_std": 0.061313,
124
+ "nauc_precision_at_100_diff1": -0.066816,
125
+ "nauc_precision_at_1000_max": 0.172396,
126
+ "nauc_precision_at_1000_std": 0.061313,
127
+ "nauc_precision_at_1000_diff1": -0.066816,
128
+ "nauc_mrr_at_1_max": 0.011189,
129
+ "nauc_mrr_at_1_std": 0.113973,
130
+ "nauc_mrr_at_1_diff1": 0.031937,
131
+ "nauc_mrr_at_3_max": 0.04895,
132
+ "nauc_mrr_at_3_std": 0.107965,
133
+ "nauc_mrr_at_3_diff1": 0.040056,
134
+ "nauc_mrr_at_5_max": 0.041345,
135
+ "nauc_mrr_at_5_std": 0.115595,
136
+ "nauc_mrr_at_5_diff1": 0.038347,
137
+ "nauc_mrr_at_10_max": 0.038473,
138
+ "nauc_mrr_at_10_std": 0.112927,
139
+ "nauc_mrr_at_10_diff1": 0.046707,
140
+ "nauc_mrr_at_20_max": 0.039162,
141
+ "nauc_mrr_at_20_std": 0.113984,
142
+ "nauc_mrr_at_20_diff1": 0.043099,
143
+ "nauc_mrr_at_100_max": 0.039162,
144
+ "nauc_mrr_at_100_std": 0.113984,
145
+ "nauc_mrr_at_100_diff1": 0.043099,
146
+ "nauc_mrr_at_1000_max": 0.039162,
147
+ "nauc_mrr_at_1000_std": 0.113984,
148
+ "nauc_mrr_at_1000_diff1": 0.043099,
149
+ "hit_rate_at_1": 0.48199,
150
+ "hit_rate_at_3": 0.74515,
151
+ "hit_rate_at_5": 0.8615,
152
+ "hit_rate_at_10": 0.94737,
153
+ "hit_rate_at_20": 1.0,
154
+ "hit_rate_at_100": 1.0,
155
+ "hit_rate_at_1000": 1.0,
156
+ "main_score": 0.52129,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 8.538385152816772,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/BIOSSES.json ADDED
@@ -0,0 +1,27 @@
1
+ {
2
+ "dataset_revision": "d3fb88f8f02e40887cd149695127462bbcf29b4a",
3
+ "task_name": "BIOSSES",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "pearson": 0.816215,
9
+ "spearman": 0.800018,
10
+ "cosine_pearson": 0.816215,
11
+ "cosine_spearman": 0.800018,
12
+ "manhattan_pearson": 0.805439,
13
+ "manhattan_spearman": 0.803434,
14
+ "euclidean_pearson": 0.804642,
15
+ "euclidean_spearman": 0.800018,
16
+ "main_score": 0.800018,
17
+ "hf_subset": "default",
18
+ "languages": [
19
+ "eng-Latn"
20
+ ]
21
+ }
22
+ ]
23
+ },
24
+ "evaluation_time": 0.25973010063171387,
25
+ "kg_co2_emissions": null,
26
+ "date": null
27
+ }
results/Banking77Classification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "0fd18e25b25c072e09e0d92ab615fda904d66300",
3
+ "task_name": "Banking77Classification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.730519,
11
+ "f1": 0.721853,
12
+ "f1_weighted": 0.721853,
13
+ "precision": 0.750012,
14
+ "precision_weighted": 0.750012,
15
+ "recall": 0.730519,
16
+ "recall_weighted": 0.730519,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.736039,
22
+ "f1": 0.726787,
23
+ "f1_weighted": 0.726787,
24
+ "precision": 0.759163,
25
+ "precision_weighted": 0.759163,
26
+ "recall": 0.736039,
27
+ "recall_weighted": 0.736039,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.737338,
33
+ "f1": 0.729163,
34
+ "f1_weighted": 0.729163,
35
+ "precision": 0.760371,
36
+ "precision_weighted": 0.760371,
37
+ "recall": 0.737338,
38
+ "recall_weighted": 0.737338,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.735065,
44
+ "f1": 0.728565,
45
+ "f1_weighted": 0.728565,
46
+ "precision": 0.757244,
47
+ "precision_weighted": 0.757244,
48
+ "recall": 0.735065,
49
+ "recall_weighted": 0.735065,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.723701,
55
+ "f1": 0.715221,
56
+ "f1_weighted": 0.715221,
57
+ "precision": 0.744897,
58
+ "precision_weighted": 0.744897,
59
+ "recall": 0.723701,
60
+ "recall_weighted": 0.723701,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.729221,
66
+ "f1": 0.725092,
67
+ "f1_weighted": 0.725092,
68
+ "precision": 0.751563,
69
+ "precision_weighted": 0.751563,
70
+ "recall": 0.729221,
71
+ "recall_weighted": 0.729221,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.721104,
77
+ "f1": 0.71354,
78
+ "f1_weighted": 0.71354,
79
+ "precision": 0.735443,
80
+ "precision_weighted": 0.735443,
81
+ "recall": 0.721104,
82
+ "recall_weighted": 0.721104,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.728571,
88
+ "f1": 0.719806,
89
+ "f1_weighted": 0.719806,
90
+ "precision": 0.746487,
91
+ "precision_weighted": 0.746487,
92
+ "recall": 0.728571,
93
+ "recall_weighted": 0.728571,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.724351,
99
+ "f1": 0.715261,
100
+ "f1_weighted": 0.715261,
101
+ "precision": 0.753678,
102
+ "precision_weighted": 0.753678,
103
+ "recall": 0.724351,
104
+ "recall_weighted": 0.724351,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.726948,
110
+ "f1": 0.718588,
111
+ "f1_weighted": 0.718588,
112
+ "precision": 0.745881,
113
+ "precision_weighted": 0.745881,
114
+ "recall": 0.726948,
115
+ "recall_weighted": 0.726948,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.729286,
121
+ "f1": 0.721388,
122
+ "f1_weighted": 0.721388,
123
+ "precision": 0.750474,
124
+ "precision_weighted": 0.750474,
125
+ "recall": 0.729286,
126
+ "recall_weighted": 0.729286,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.729286,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 49.54488945007324,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/BiorxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "dataset_revision": "65b79d1d13f80053f67aca9498d9402c2d9f1f40",
3
+ "task_name": "BiorxivClusteringP2P",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measure": 0.301077,
9
+ "v_measure_std": 0.011567,
10
+ "v_measures": [
11
+ 0.30247,
12
+ 0.302035,
13
+ 0.307166,
14
+ 0.29129,
15
+ 0.298936,
16
+ 0.312408,
17
+ 0.275809,
18
+ 0.312749,
19
+ 0.292164,
20
+ 0.315744
21
+ ],
22
+ "main_score": 0.301077,
23
+ "hf_subset": "default",
24
+ "languages": [
25
+ "eng-Latn"
26
+ ]
27
+ }
28
+ ]
29
+ },
30
+ "evaluation_time": 94.12546920776367,
31
+ "kg_co2_emissions": null,
32
+ "date": null
33
+ }
results/BiorxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
1
+ {
2
+ "dataset_revision": "258694dd0231531bc1fd9de6ceb52a0853c6d908",
3
+ "task_name": "BiorxivClusteringS2S",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "v_measure": 0.2036,
9
+ "v_measure_std": 0.007179,
10
+ "v_measures": [
11
+ 0.199789,
12
+ 0.204072,
13
+ 0.199858,
14
+ 0.19545,
15
+ 0.190448,
16
+ 0.208943,
17
+ 0.210999,
18
+ 0.214226,
19
+ 0.201717,
20
+ 0.210495
21
+ ],
22
+ "main_score": 0.2036,
23
+ "hf_subset": "default",
24
+ "languages": [
25
+ "eng-Latn"
26
+ ]
27
+ }
28
+ ]
29
+ },
30
+ "evaluation_time": 77.59684014320374,
31
+ "kg_co2_emissions": null,
32
+ "date": null
33
+ }
results/CQADupstackAndroidRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "9be4c0e46342e8e3aff577a89b9a1ec9bc6b4af3",
3
+ "task_name": "CQADupstackAndroidRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.24177,
9
+ "ndcg_at_3": 0.27007,
10
+ "ndcg_at_5": 0.29229,
11
+ "ndcg_at_10": 0.31749,
12
+ "ndcg_at_20": 0.33556,
13
+ "ndcg_at_100": 0.36885,
14
+ "ndcg_at_1000": 0.39898,
15
+ "map_at_1": 0.19633,
16
+ "map_at_3": 0.23985,
17
+ "map_at_5": 0.2557,
18
+ "map_at_10": 0.2678,
19
+ "map_at_20": 0.27381,
20
+ "map_at_100": 0.27926,
21
+ "map_at_1000": 0.28062,
22
+ "recall_at_1": 0.19633,
23
+ "recall_at_3": 0.28687,
24
+ "recall_at_5": 0.34486,
25
+ "recall_at_10": 0.42342,
26
+ "recall_at_20": 0.4882,
27
+ "recall_at_100": 0.64633,
28
+ "recall_at_1000": 0.85253,
29
+ "accuracy": 0.19633,
30
+ "precision_at_1": 0.24177,
31
+ "precision_at_3": 0.12971,
32
+ "precision_at_5": 0.097,
33
+ "precision_at_10": 0.06109,
34
+ "precision_at_20": 0.03691,
35
+ "precision_at_100": 0.0107,
36
+ "precision_at_1000": 0.0016,
37
+ "mrr_at_1": 0.241774,
38
+ "mrr_at_3": 0.285646,
39
+ "mrr_at_5": 0.299237,
40
+ "mrr_at_10": 0.311264,
41
+ "mrr_at_20": 0.316313,
42
+ "mrr_at_100": 0.320365,
43
+ "mrr_at_1000": 0.321067,
44
+ "nauc_ndcg_at_1_max": 0.195839,
45
+ "nauc_ndcg_at_1_std": -0.053038,
46
+ "nauc_ndcg_at_1_diff1": 0.44851,
47
+ "nauc_ndcg_at_3_max": 0.199817,
48
+ "nauc_ndcg_at_3_std": -0.044398,
49
+ "nauc_ndcg_at_3_diff1": 0.419928,
50
+ "nauc_ndcg_at_5_max": 0.200822,
51
+ "nauc_ndcg_at_5_std": -0.056291,
52
+ "nauc_ndcg_at_5_diff1": 0.418192,
53
+ "nauc_ndcg_at_10_max": 0.200914,
54
+ "nauc_ndcg_at_10_std": -0.037602,
55
+ "nauc_ndcg_at_10_diff1": 0.408634,
56
+ "nauc_ndcg_at_20_max": 0.20641,
57
+ "nauc_ndcg_at_20_std": -0.026072,
58
+ "nauc_ndcg_at_20_diff1": 0.408286,
59
+ "nauc_ndcg_at_100_max": 0.225324,
60
+ "nauc_ndcg_at_100_std": -0.006199,
61
+ "nauc_ndcg_at_100_diff1": 0.410283,
62
+ "nauc_ndcg_at_1000_max": 0.223946,
63
+ "nauc_ndcg_at_1000_std": -0.004229,
64
+ "nauc_ndcg_at_1000_diff1": 0.414098,
65
+ "nauc_map_at_1_max": 0.188354,
66
+ "nauc_map_at_1_std": -0.044449,
67
+ "nauc_map_at_1_diff1": 0.471441,
68
+ "nauc_map_at_3_max": 0.204356,
69
+ "nauc_map_at_3_std": -0.045069,
70
+ "nauc_map_at_3_diff1": 0.443895,
71
+ "nauc_map_at_5_max": 0.206575,
72
+ "nauc_map_at_5_std": -0.050182,
73
+ "nauc_map_at_5_diff1": 0.440774,
74
+ "nauc_map_at_10_max": 0.206471,
75
+ "nauc_map_at_10_std": -0.04273,
76
+ "nauc_map_at_10_diff1": 0.434219,
77
+ "nauc_map_at_20_max": 0.208793,
78
+ "nauc_map_at_20_std": -0.040031,
79
+ "nauc_map_at_20_diff1": 0.433721,
80
+ "nauc_map_at_100_max": 0.210727,
81
+ "nauc_map_at_100_std": -0.037665,
82
+ "nauc_map_at_100_diff1": 0.432985,
83
+ "nauc_map_at_1000_max": 0.210536,
84
+ "nauc_map_at_1000_std": -0.037284,
85
+ "nauc_map_at_1000_diff1": 0.432972,
86
+ "nauc_recall_at_1_max": 0.188354,
87
+ "nauc_recall_at_1_std": -0.044449,
88
+ "nauc_recall_at_1_diff1": 0.471441,
89
+ "nauc_recall_at_3_max": 0.197539,
90
+ "nauc_recall_at_3_std": -0.03045,
91
+ "nauc_recall_at_3_diff1": 0.401321,
92
+ "nauc_recall_at_5_max": 0.197823,
93
+ "nauc_recall_at_5_std": -0.05243,
94
+ "nauc_recall_at_5_diff1": 0.38462,
95
+ "nauc_recall_at_10_max": 0.189637,
96
+ "nauc_recall_at_10_std": 0.004294,
97
+ "nauc_recall_at_10_diff1": 0.343022,
98
+ "nauc_recall_at_20_max": 0.201524,
99
+ "nauc_recall_at_20_std": 0.043873,
100
+ "nauc_recall_at_20_diff1": 0.336553,
101
+ "nauc_recall_at_100_max": 0.300518,
102
+ "nauc_recall_at_100_std": 0.17212,
103
+ "nauc_recall_at_100_diff1": 0.338388,
104
+ "nauc_recall_at_1000_max": 0.398995,
105
+ "nauc_recall_at_1000_std": 0.389981,
106
+ "nauc_recall_at_1000_diff1": 0.374695,
107
+ "nauc_precision_at_1_max": 0.195839,
108
+ "nauc_precision_at_1_std": -0.053038,
109
+ "nauc_precision_at_1_diff1": 0.44851,
110
+ "nauc_precision_at_3_max": 0.228784,
111
+ "nauc_precision_at_3_std": -0.045029,
112
+ "nauc_precision_at_3_diff1": 0.341057,
113
+ "nauc_precision_at_5_max": 0.218662,
114
+ "nauc_precision_at_5_std": -0.064408,
115
+ "nauc_precision_at_5_diff1": 0.302519,
116
+ "nauc_precision_at_10_max": 0.184561,
117
+ "nauc_precision_at_10_std": -0.041402,
118
+ "nauc_precision_at_10_diff1": 0.247017,
119
+ "nauc_precision_at_20_max": 0.155689,
120
+ "nauc_precision_at_20_std": -0.022984,
121
+ "nauc_precision_at_20_diff1": 0.203272,
122
+ "nauc_precision_at_100_max": 0.130254,
123
+ "nauc_precision_at_100_std": 0.009003,
124
+ "nauc_precision_at_100_diff1": 0.108801,
125
+ "nauc_precision_at_1000_max": -0.027462,
126
+ "nauc_precision_at_1000_std": -0.045777,
127
+ "nauc_precision_at_1000_diff1": -0.006891,
128
+ "nauc_mrr_at_1_max": 0.195839,
129
+ "nauc_mrr_at_1_std": -0.053038,
130
+ "nauc_mrr_at_1_diff1": 0.44851,
131
+ "nauc_mrr_at_3_max": 0.188248,
132
+ "nauc_mrr_at_3_std": -0.052723,
133
+ "nauc_mrr_at_3_diff1": 0.412726,
134
+ "nauc_mrr_at_5_max": 0.189203,
135
+ "nauc_mrr_at_5_std": -0.062892,
136
+ "nauc_mrr_at_5_diff1": 0.409278,
137
+ "nauc_mrr_at_10_max": 0.187292,
138
+ "nauc_mrr_at_10_std": -0.057095,
139
+ "nauc_mrr_at_10_diff1": 0.405154,
140
+ "nauc_mrr_at_20_max": 0.188573,
141
+ "nauc_mrr_at_20_std": -0.053584,
142
+ "nauc_mrr_at_20_diff1": 0.405267,
143
+ "nauc_mrr_at_100_max": 0.19088,
144
+ "nauc_mrr_at_100_std": -0.050981,
145
+ "nauc_mrr_at_100_diff1": 0.405688,
146
+ "nauc_mrr_at_1000_max": 0.190851,
147
+ "nauc_mrr_at_1000_std": -0.050818,
148
+ "nauc_mrr_at_1000_diff1": 0.405775,
149
+ "hit_rate_at_1": 0.24177,
150
+ "hit_rate_at_3": 0.34621,
151
+ "hit_rate_at_5": 0.40629,
152
+ "hit_rate_at_10": 0.49785,
153
+ "hit_rate_at_20": 0.56795,
154
+ "hit_rate_at_100": 0.72675,
155
+ "hit_rate_at_1000": 0.89986,
156
+ "main_score": 0.31749,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 27.703421354293823,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackEnglishRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ad9991cb51e31e31e430383c75ffb2885547b5f0",
3
+ "task_name": "CQADupstackEnglishRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.17197,
9
+ "ndcg_at_3": 0.18533,
10
+ "ndcg_at_5": 0.19793,
11
+ "ndcg_at_10": 0.21105,
12
+ "ndcg_at_20": 0.22484,
13
+ "ndcg_at_100": 0.25245,
14
+ "ndcg_at_1000": 0.2824,
15
+ "map_at_1": 0.1352,
16
+ "map_at_3": 0.16356,
17
+ "map_at_5": 0.17246,
18
+ "map_at_10": 0.17919,
19
+ "map_at_20": 0.18368,
20
+ "map_at_100": 0.18797,
21
+ "map_at_1000": 0.1892,
22
+ "recall_at_1": 0.1352,
23
+ "recall_at_3": 0.19181,
24
+ "recall_at_5": 0.22535,
25
+ "recall_at_10": 0.26705,
26
+ "recall_at_20": 0.31788,
27
+ "recall_at_100": 0.45274,
28
+ "recall_at_1000": 0.65913,
29
+ "accuracy": 0.1352,
30
+ "precision_at_1": 0.17197,
31
+ "precision_at_3": 0.08875,
32
+ "precision_at_5": 0.06484,
33
+ "precision_at_10": 0.04025,
34
+ "precision_at_20": 0.02529,
35
+ "precision_at_100": 0.00773,
36
+ "precision_at_1000": 0.0013,
37
+ "mrr_at_1": 0.171975,
38
+ "mrr_at_3": 0.202442,
39
+ "mrr_at_5": 0.21104,
40
+ "mrr_at_10": 0.217702,
41
+ "mrr_at_20": 0.221492,
42
+ "mrr_at_100": 0.225171,
43
+ "mrr_at_1000": 0.225945,
44
+ "nauc_ndcg_at_1_max": 0.220457,
45
+ "nauc_ndcg_at_1_std": 0.038087,
46
+ "nauc_ndcg_at_1_diff1": 0.518149,
47
+ "nauc_ndcg_at_3_max": 0.193045,
48
+ "nauc_ndcg_at_3_std": 0.040276,
49
+ "nauc_ndcg_at_3_diff1": 0.463651,
50
+ "nauc_ndcg_at_5_max": 0.183746,
51
+ "nauc_ndcg_at_5_std": 0.04454,
52
+ "nauc_ndcg_at_5_diff1": 0.449786,
53
+ "nauc_ndcg_at_10_max": 0.174879,
54
+ "nauc_ndcg_at_10_std": 0.054758,
55
+ "nauc_ndcg_at_10_diff1": 0.437006,
56
+ "nauc_ndcg_at_20_max": 0.173972,
57
+ "nauc_ndcg_at_20_std": 0.060192,
58
+ "nauc_ndcg_at_20_diff1": 0.431274,
59
+ "nauc_ndcg_at_100_max": 0.173972,
60
+ "nauc_ndcg_at_100_std": 0.073803,
61
+ "nauc_ndcg_at_100_diff1": 0.422346,
62
+ "nauc_ndcg_at_1000_max": 0.175449,
63
+ "nauc_ndcg_at_1000_std": 0.09006,
64
+ "nauc_ndcg_at_1000_diff1": 0.417797,
65
+ "nauc_map_at_1_max": 0.206365,
66
+ "nauc_map_at_1_std": 0.007875,
67
+ "nauc_map_at_1_diff1": 0.574731,
68
+ "nauc_map_at_3_max": 0.191027,
69
+ "nauc_map_at_3_std": 0.019023,
70
+ "nauc_map_at_3_diff1": 0.496357,
71
+ "nauc_map_at_5_max": 0.185638,
72
+ "nauc_map_at_5_std": 0.02535,
73
+ "nauc_map_at_5_diff1": 0.485762,
74
+ "nauc_map_at_10_max": 0.182315,
75
+ "nauc_map_at_10_std": 0.033462,
76
+ "nauc_map_at_10_diff1": 0.478341,
77
+ "nauc_map_at_20_max": 0.182969,
78
+ "nauc_map_at_20_std": 0.037814,
79
+ "nauc_map_at_20_diff1": 0.474898,
80
+ "nauc_map_at_100_max": 0.183873,
81
+ "nauc_map_at_100_std": 0.041603,
82
+ "nauc_map_at_100_diff1": 0.4725,
83
+ "nauc_map_at_1000_max": 0.183905,
84
+ "nauc_map_at_1000_std": 0.042787,
85
+ "nauc_map_at_1000_diff1": 0.47215,
86
+ "nauc_recall_at_1_max": 0.206365,
87
+ "nauc_recall_at_1_std": 0.007875,
88
+ "nauc_recall_at_1_diff1": 0.574731,
89
+ "nauc_recall_at_3_max": 0.169907,
90
+ "nauc_recall_at_3_std": 0.029821,
91
+ "nauc_recall_at_3_diff1": 0.428615,
92
+ "nauc_recall_at_5_max": 0.149712,
93
+ "nauc_recall_at_5_std": 0.045404,
94
+ "nauc_recall_at_5_diff1": 0.389461,
95
+ "nauc_recall_at_10_max": 0.134145,
96
+ "nauc_recall_at_10_std": 0.06875,
97
+ "nauc_recall_at_10_diff1": 0.35561,
98
+ "nauc_recall_at_20_max": 0.13486,
99
+ "nauc_recall_at_20_std": 0.085326,
100
+ "nauc_recall_at_20_diff1": 0.333335,
101
+ "nauc_recall_at_100_max": 0.126145,
102
+ "nauc_recall_at_100_std": 0.129497,
103
+ "nauc_recall_at_100_diff1": 0.289066,
104
+ "nauc_recall_at_1000_max": 0.126224,
105
+ "nauc_recall_at_1000_std": 0.238126,
106
+ "nauc_recall_at_1000_diff1": 0.23495,
107
+ "nauc_precision_at_1_max": 0.220457,
108
+ "nauc_precision_at_1_std": 0.038087,
109
+ "nauc_precision_at_1_diff1": 0.518149,
110
+ "nauc_precision_at_3_max": 0.189457,
111
+ "nauc_precision_at_3_std": 0.083035,
112
+ "nauc_precision_at_3_diff1": 0.359567,
113
+ "nauc_precision_at_5_max": 0.170769,
114
+ "nauc_precision_at_5_std": 0.104967,
115
+ "nauc_precision_at_5_diff1": 0.306482,
116
+ "nauc_precision_at_10_max": 0.156906,
117
+ "nauc_precision_at_10_std": 0.154544,
118
+ "nauc_precision_at_10_diff1": 0.255606,
119
+ "nauc_precision_at_20_max": 0.149875,
120
+ "nauc_precision_at_20_std": 0.186361,
121
+ "nauc_precision_at_20_diff1": 0.204917,
122
+ "nauc_precision_at_100_max": 0.131163,
123
+ "nauc_precision_at_100_std": 0.228751,
124
+ "nauc_precision_at_100_diff1": 0.112776,
125
+ "nauc_precision_at_1000_max": 0.090683,
126
+ "nauc_precision_at_1000_std": 0.24169,
127
+ "nauc_precision_at_1000_diff1": -0.002154,
128
+ "nauc_mrr_at_1_max": 0.220457,
129
+ "nauc_mrr_at_1_std": 0.038087,
130
+ "nauc_mrr_at_1_diff1": 0.518149,
131
+ "nauc_mrr_at_3_max": 0.19921,
132
+ "nauc_mrr_at_3_std": 0.050134,
133
+ "nauc_mrr_at_3_diff1": 0.459729,
134
+ "nauc_mrr_at_5_max": 0.195815,
135
+ "nauc_mrr_at_5_std": 0.055181,
136
+ "nauc_mrr_at_5_diff1": 0.451334,
137
+ "nauc_mrr_at_10_max": 0.192805,
138
+ "nauc_mrr_at_10_std": 0.059829,
139
+ "nauc_mrr_at_10_diff1": 0.445685,
140
+ "nauc_mrr_at_20_max": 0.191783,
141
+ "nauc_mrr_at_20_std": 0.059979,
142
+ "nauc_mrr_at_20_diff1": 0.443809,
143
+ "nauc_mrr_at_100_max": 0.191955,
144
+ "nauc_mrr_at_100_std": 0.060764,
145
+ "nauc_mrr_at_100_diff1": 0.442767,
146
+ "nauc_mrr_at_1000_max": 0.191959,
147
+ "nauc_mrr_at_1000_std": 0.060952,
148
+ "nauc_mrr_at_1000_diff1": 0.442842,
149
+ "hit_rate_at_1": 0.17197,
150
+ "hit_rate_at_3": 0.24204,
151
+ "hit_rate_at_5": 0.27962,
152
+ "hit_rate_at_10": 0.32994,
153
+ "hit_rate_at_20": 0.38535,
154
+ "hit_rate_at_100": 0.53503,
155
+ "hit_rate_at_1000": 0.73567,
156
+ "main_score": 0.21105,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 47.88984394073486,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackGamingRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4885aa143210c98657558c04aaf3dc47cfb54340",
3
+ "task_name": "CQADupstackGamingRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.3047,
9
+ "ndcg_at_3": 0.36112,
10
+ "ndcg_at_5": 0.38151,
11
+ "ndcg_at_10": 0.40438,
12
+ "ndcg_at_20": 0.42239,
13
+ "ndcg_at_100": 0.45077,
14
+ "ndcg_at_1000": 0.47212,
15
+ "map_at_1": 0.26423,
16
+ "map_at_3": 0.33189,
17
+ "map_at_5": 0.34563,
18
+ "map_at_10": 0.35625,
19
+ "map_at_20": 0.36196,
20
+ "map_at_100": 0.36645,
21
+ "map_at_1000": 0.36732,
22
+ "recall_at_1": 0.26423,
23
+ "recall_at_3": 0.39973,
24
+ "recall_at_5": 0.45062,
25
+ "recall_at_10": 0.51831,
26
+ "recall_at_20": 0.58551,
27
+ "recall_at_100": 0.72608,
28
+ "recall_at_1000": 0.88322,
29
+ "accuracy": 0.26423,
30
+ "precision_at_1": 0.3047,
31
+ "precision_at_3": 0.16301,
32
+ "precision_at_5": 0.11235,
33
+ "precision_at_10": 0.06627,
34
+ "precision_at_20": 0.03828,
35
+ "precision_at_100": 0.00984,
36
+ "precision_at_1000": 0.00123,
37
+ "mrr_at_1": 0.304702,
38
+ "mrr_at_3": 0.364472,
39
+ "mrr_at_5": 0.376729,
40
+ "mrr_at_10": 0.386286,
41
+ "mrr_at_20": 0.390946,
42
+ "mrr_at_100": 0.394341,
43
+ "mrr_at_1000": 0.394917,
44
+ "nauc_ndcg_at_1_max": 0.319435,
45
+ "nauc_ndcg_at_1_std": -0.078732,
46
+ "nauc_ndcg_at_1_diff1": 0.512685,
47
+ "nauc_ndcg_at_3_max": 0.286542,
48
+ "nauc_ndcg_at_3_std": -0.091507,
49
+ "nauc_ndcg_at_3_diff1": 0.456919,
50
+ "nauc_ndcg_at_5_max": 0.29801,
51
+ "nauc_ndcg_at_5_std": -0.074471,
52
+ "nauc_ndcg_at_5_diff1": 0.452081,
53
+ "nauc_ndcg_at_10_max": 0.301654,
54
+ "nauc_ndcg_at_10_std": -0.064363,
55
+ "nauc_ndcg_at_10_diff1": 0.4426,
56
+ "nauc_ndcg_at_20_max": 0.305582,
57
+ "nauc_ndcg_at_20_std": -0.057005,
58
+ "nauc_ndcg_at_20_diff1": 0.434573,
59
+ "nauc_ndcg_at_100_max": 0.312647,
60
+ "nauc_ndcg_at_100_std": -0.045026,
61
+ "nauc_ndcg_at_100_diff1": 0.440831,
62
+ "nauc_ndcg_at_1000_max": 0.317541,
63
+ "nauc_ndcg_at_1000_std": -0.043625,
64
+ "nauc_ndcg_at_1000_diff1": 0.439761,
65
+ "nauc_map_at_1_max": 0.276497,
66
+ "nauc_map_at_1_std": -0.095328,
67
+ "nauc_map_at_1_diff1": 0.500581,
68
+ "nauc_map_at_3_max": 0.280819,
69
+ "nauc_map_at_3_std": -0.101048,
70
+ "nauc_map_at_3_diff1": 0.467535,
71
+ "nauc_map_at_5_max": 0.290364,
72
+ "nauc_map_at_5_std": -0.089303,
73
+ "nauc_map_at_5_diff1": 0.465347,
74
+ "nauc_map_at_10_max": 0.293302,
75
+ "nauc_map_at_10_std": -0.083786,
76
+ "nauc_map_at_10_diff1": 0.462328,
77
+ "nauc_map_at_20_max": 0.29579,
78
+ "nauc_map_at_20_std": -0.080863,
79
+ "nauc_map_at_20_diff1": 0.460825,
80
+ "nauc_map_at_100_max": 0.297578,
81
+ "nauc_map_at_100_std": -0.078615,
82
+ "nauc_map_at_100_diff1": 0.461628,
83
+ "nauc_map_at_1000_max": 0.297962,
84
+ "nauc_map_at_1000_std": -0.078341,
85
+ "nauc_map_at_1000_diff1": 0.461635,
86
+ "nauc_recall_at_1_max": 0.276497,
87
+ "nauc_recall_at_1_std": -0.095328,
88
+ "nauc_recall_at_1_diff1": 0.500581,
89
+ "nauc_recall_at_3_max": 0.253867,
90
+ "nauc_recall_at_3_std": -0.101424,
91
+ "nauc_recall_at_3_diff1": 0.410849,
92
+ "nauc_recall_at_5_max": 0.279274,
93
+ "nauc_recall_at_5_std": -0.05677,
94
+ "nauc_recall_at_5_diff1": 0.396746,
95
+ "nauc_recall_at_10_max": 0.28399,
96
+ "nauc_recall_at_10_std": -0.026635,
97
+ "nauc_recall_at_10_diff1": 0.364508,
98
+ "nauc_recall_at_20_max": 0.295206,
99
+ "nauc_recall_at_20_std": 0.005386,
100
+ "nauc_recall_at_20_diff1": 0.32925,
101
+ "nauc_recall_at_100_max": 0.32442,
102
+ "nauc_recall_at_100_std": 0.099746,
103
+ "nauc_recall_at_100_diff1": 0.338878,
104
+ "nauc_recall_at_1000_max": 0.404293,
105
+ "nauc_recall_at_1000_std": 0.250091,
106
+ "nauc_recall_at_1000_diff1": 0.241065,
107
+ "nauc_precision_at_1_max": 0.319435,
108
+ "nauc_precision_at_1_std": -0.078732,
109
+ "nauc_precision_at_1_diff1": 0.512685,
110
+ "nauc_precision_at_3_max": 0.29623,
111
+ "nauc_precision_at_3_std": -0.059651,
112
+ "nauc_precision_at_3_diff1": 0.375862,
113
+ "nauc_precision_at_5_max": 0.312694,
114
+ "nauc_precision_at_5_std": -0.005007,
115
+ "nauc_precision_at_5_diff1": 0.330812,
116
+ "nauc_precision_at_10_max": 0.315716,
117
+ "nauc_precision_at_10_std": 0.041868,
118
+ "nauc_precision_at_10_diff1": 0.276484,
119
+ "nauc_precision_at_20_max": 0.325902,
120
+ "nauc_precision_at_20_std": 0.088345,
121
+ "nauc_precision_at_20_diff1": 0.218158,
122
+ "nauc_precision_at_100_max": 0.325978,
123
+ "nauc_precision_at_100_std": 0.179416,
124
+ "nauc_precision_at_100_diff1": 0.131768,
125
+ "nauc_precision_at_1000_max": 0.297306,
126
+ "nauc_precision_at_1000_std": 0.207638,
127
+ "nauc_precision_at_1000_diff1": -0.001283,
128
+ "nauc_mrr_at_1_max": 0.319435,
129
+ "nauc_mrr_at_1_std": -0.078732,
130
+ "nauc_mrr_at_1_diff1": 0.512685,
131
+ "nauc_mrr_at_3_max": 0.308198,
132
+ "nauc_mrr_at_3_std": -0.078603,
133
+ "nauc_mrr_at_3_diff1": 0.469344,
134
+ "nauc_mrr_at_5_max": 0.312734,
135
+ "nauc_mrr_at_5_std": -0.068008,
136
+ "nauc_mrr_at_5_diff1": 0.464528,
137
+ "nauc_mrr_at_10_max": 0.314618,
138
+ "nauc_mrr_at_10_std": -0.063975,
139
+ "nauc_mrr_at_10_diff1": 0.460833,
140
+ "nauc_mrr_at_20_max": 0.314822,
141
+ "nauc_mrr_at_20_std": -0.063436,
142
+ "nauc_mrr_at_20_diff1": 0.45942,
143
+ "nauc_mrr_at_100_max": 0.315033,
144
+ "nauc_mrr_at_100_std": -0.062784,
145
+ "nauc_mrr_at_100_diff1": 0.460858,
146
+ "nauc_mrr_at_1000_max": 0.315132,
147
+ "nauc_mrr_at_1000_std": -0.062796,
148
+ "nauc_mrr_at_1000_diff1": 0.460884,
149
+ "hit_rate_at_1": 0.3047,
150
+ "hit_rate_at_3": 0.44075,
151
+ "hit_rate_at_5": 0.49342,
152
+ "hit_rate_at_10": 0.56614,
153
+ "hit_rate_at_20": 0.63386,
154
+ "hit_rate_at_100": 0.76865,
155
+ "hit_rate_at_1000": 0.90658,
156
+ "main_score": 0.40438,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 53.40445256233215,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackGisRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "5003b3064772da1887988e05400cf3806fe491f2",
3
+ "task_name": "CQADupstackGisRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.16836,
9
+ "ndcg_at_3": 0.20491,
10
+ "ndcg_at_5": 0.21863,
11
+ "ndcg_at_10": 0.23956,
12
+ "ndcg_at_20": 0.25598,
13
+ "ndcg_at_100": 0.28374,
14
+ "ndcg_at_1000": 0.31547,
15
+ "map_at_1": 0.15772,
16
+ "map_at_3": 0.19029,
17
+ "map_at_5": 0.19777,
18
+ "map_at_10": 0.20667,
19
+ "map_at_20": 0.21118,
20
+ "map_at_100": 0.21492,
21
+ "map_at_1000": 0.21599,
22
+ "recall_at_1": 0.15772,
23
+ "recall_at_3": 0.23359,
24
+ "recall_at_5": 0.26636,
25
+ "recall_at_10": 0.32845,
26
+ "recall_at_20": 0.39176,
27
+ "recall_at_100": 0.53868,
28
+ "recall_at_1000": 0.78711,
29
+ "accuracy": 0.15772,
30
+ "precision_at_1": 0.16836,
31
+ "precision_at_3": 0.08588,
32
+ "precision_at_5": 0.05989,
33
+ "precision_at_10": 0.03729,
34
+ "precision_at_20": 0.02243,
35
+ "precision_at_100": 0.00628,
36
+ "precision_at_1000": 0.00094,
37
+ "mrr_at_1": 0.168362,
38
+ "mrr_at_3": 0.203202,
39
+ "mrr_at_5": 0.211959,
40
+ "mrr_at_10": 0.22083,
41
+ "mrr_at_20": 0.225218,
42
+ "mrr_at_100": 0.228747,
43
+ "mrr_at_1000": 0.229627,
44
+ "nauc_ndcg_at_1_max": 0.237186,
45
+ "nauc_ndcg_at_1_std": -0.034501,
46
+ "nauc_ndcg_at_1_diff1": 0.526599,
47
+ "nauc_ndcg_at_3_max": 0.213458,
48
+ "nauc_ndcg_at_3_std": -0.023143,
49
+ "nauc_ndcg_at_3_diff1": 0.428695,
50
+ "nauc_ndcg_at_5_max": 0.205483,
51
+ "nauc_ndcg_at_5_std": -0.007325,
52
+ "nauc_ndcg_at_5_diff1": 0.413332,
53
+ "nauc_ndcg_at_10_max": 0.203522,
54
+ "nauc_ndcg_at_10_std": -0.002397,
55
+ "nauc_ndcg_at_10_diff1": 0.391896,
56
+ "nauc_ndcg_at_20_max": 0.202603,
57
+ "nauc_ndcg_at_20_std": 0.002221,
58
+ "nauc_ndcg_at_20_diff1": 0.382095,
59
+ "nauc_ndcg_at_100_max": 0.202961,
60
+ "nauc_ndcg_at_100_std": 0.008274,
61
+ "nauc_ndcg_at_100_diff1": 0.360958,
62
+ "nauc_ndcg_at_1000_max": 0.219827,
63
+ "nauc_ndcg_at_1000_std": 0.016354,
64
+ "nauc_ndcg_at_1000_diff1": 0.374841,
65
+ "nauc_map_at_1_max": 0.22275,
66
+ "nauc_map_at_1_std": -0.032642,
67
+ "nauc_map_at_1_diff1": 0.523053,
68
+ "nauc_map_at_3_max": 0.213835,
69
+ "nauc_map_at_3_std": -0.025158,
70
+ "nauc_map_at_3_diff1": 0.451424,
71
+ "nauc_map_at_5_max": 0.20864,
72
+ "nauc_map_at_5_std": -0.016588,
73
+ "nauc_map_at_5_diff1": 0.441741,
74
+ "nauc_map_at_10_max": 0.208712,
75
+ "nauc_map_at_10_std": -0.014602,
76
+ "nauc_map_at_10_diff1": 0.432534,
77
+ "nauc_map_at_20_max": 0.208808,
78
+ "nauc_map_at_20_std": -0.012968,
79
+ "nauc_map_at_20_diff1": 0.429572,
80
+ "nauc_map_at_100_max": 0.208895,
81
+ "nauc_map_at_100_std": -0.012008,
82
+ "nauc_map_at_100_diff1": 0.426128,
83
+ "nauc_map_at_1000_max": 0.209571,
84
+ "nauc_map_at_1000_std": -0.011885,
85
+ "nauc_map_at_1000_diff1": 0.426475,
86
+ "nauc_recall_at_1_max": 0.22275,
87
+ "nauc_recall_at_1_std": -0.032642,
88
+ "nauc_recall_at_1_diff1": 0.523053,
89
+ "nauc_recall_at_3_max": 0.19895,
90
+ "nauc_recall_at_3_std": -0.020923,
91
+ "nauc_recall_at_3_diff1": 0.366763,
92
+ "nauc_recall_at_5_max": 0.182263,
93
+ "nauc_recall_at_5_std": 0.009581,
94
+ "nauc_recall_at_5_diff1": 0.33788,
95
+ "nauc_recall_at_10_max": 0.172564,
96
+ "nauc_recall_at_10_std": 0.021195,
97
+ "nauc_recall_at_10_diff1": 0.282262,
98
+ "nauc_recall_at_20_max": 0.167015,
99
+ "nauc_recall_at_20_std": 0.036874,
100
+ "nauc_recall_at_20_diff1": 0.252602,
101
+ "nauc_recall_at_100_max": 0.159991,
102
+ "nauc_recall_at_100_std": 0.064774,
103
+ "nauc_recall_at_100_diff1": 0.144038,
104
+ "nauc_recall_at_1000_max": 0.311574,
105
+ "nauc_recall_at_1000_std": 0.210293,
106
+ "nauc_recall_at_1000_diff1": 0.17277,
107
+ "nauc_precision_at_1_max": 0.237186,
108
+ "nauc_precision_at_1_std": -0.034501,
109
+ "nauc_precision_at_1_diff1": 0.526599,
110
+ "nauc_precision_at_3_max": 0.216171,
111
+ "nauc_precision_at_3_std": -0.013772,
112
+ "nauc_precision_at_3_diff1": 0.370225,
113
+ "nauc_precision_at_5_max": 0.207439,
114
+ "nauc_precision_at_5_std": 0.033297,
115
+ "nauc_precision_at_5_diff1": 0.33532,
116
+ "nauc_precision_at_10_max": 0.202235,
117
+ "nauc_precision_at_10_std": 0.050711,
118
+ "nauc_precision_at_10_diff1": 0.281533,
119
+ "nauc_precision_at_20_max": 0.202108,
120
+ "nauc_precision_at_20_std": 0.057678,
121
+ "nauc_precision_at_20_diff1": 0.244305,
122
+ "nauc_precision_at_100_max": 0.188261,
123
+ "nauc_precision_at_100_std": 0.082876,
124
+ "nauc_precision_at_100_diff1": 0.124982,
125
+ "nauc_precision_at_1000_max": 0.238191,
126
+ "nauc_precision_at_1000_std": 0.133834,
127
+ "nauc_precision_at_1000_diff1": 0.070424,
128
+ "nauc_mrr_at_1_max": 0.237186,
129
+ "nauc_mrr_at_1_std": -0.034501,
130
+ "nauc_mrr_at_1_diff1": 0.526599,
131
+ "nauc_mrr_at_3_max": 0.222937,
132
+ "nauc_mrr_at_3_std": -0.020849,
133
+ "nauc_mrr_at_3_diff1": 0.447916,
134
+ "nauc_mrr_at_5_max": 0.221488,
135
+ "nauc_mrr_at_5_std": -0.011663,
136
+ "nauc_mrr_at_5_diff1": 0.439882,
137
+ "nauc_mrr_at_10_max": 0.220347,
138
+ "nauc_mrr_at_10_std": -0.010445,
139
+ "nauc_mrr_at_10_diff1": 0.429698,
140
+ "nauc_mrr_at_20_max": 0.219574,
141
+ "nauc_mrr_at_20_std": -0.009469,
142
+ "nauc_mrr_at_20_diff1": 0.426616,
143
+ "nauc_mrr_at_100_max": 0.219745,
144
+ "nauc_mrr_at_100_std": -0.00889,
145
+ "nauc_mrr_at_100_diff1": 0.42385,
146
+ "nauc_mrr_at_1000_max": 0.220191,
147
+ "nauc_mrr_at_1000_std": -0.008884,
148
+ "nauc_mrr_at_1000_diff1": 0.42433,
149
+ "hit_rate_at_1": 0.16836,
150
+ "hit_rate_at_3": 0.25198,
151
+ "hit_rate_at_5": 0.29153,
152
+ "hit_rate_at_10": 0.35706,
153
+ "hit_rate_at_20": 0.4226,
154
+ "hit_rate_at_100": 0.57401,
155
+ "hit_rate_at_1000": 0.81017,
156
+ "main_score": 0.23956,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 48.321799993515015,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackMathematicaRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "90fceea13679c63fe563ded68f3b6f06e50061de",
3
+ "task_name": "CQADupstackMathematicaRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.10572,
9
+ "ndcg_at_3": 0.1344,
10
+ "ndcg_at_5": 0.15294,
11
+ "ndcg_at_10": 0.16976,
12
+ "ndcg_at_20": 0.18796,
13
+ "ndcg_at_100": 0.22598,
14
+ "ndcg_at_1000": 0.25889,
15
+ "map_at_1": 0.08106,
16
+ "map_at_3": 0.11453,
17
+ "map_at_5": 0.12578,
18
+ "map_at_10": 0.13293,
19
+ "map_at_20": 0.13821,
20
+ "map_at_100": 0.14337,
21
+ "map_at_1000": 0.14464,
22
+ "recall_at_1": 0.08106,
23
+ "recall_at_3": 0.15677,
24
+ "recall_at_5": 0.20314,
25
+ "recall_at_10": 0.25361,
26
+ "recall_at_20": 0.31993,
27
+ "recall_at_100": 0.51174,
28
+ "recall_at_1000": 0.74859,
29
+ "accuracy": 0.08106,
30
+ "precision_at_1": 0.10572,
31
+ "precision_at_3": 0.06716,
32
+ "precision_at_5": 0.05274,
33
+ "precision_at_10": 0.03321,
34
+ "precision_at_20": 0.02139,
35
+ "precision_at_100": 0.00716,
36
+ "precision_at_1000": 0.00114,
37
+ "mrr_at_1": 0.105721,
38
+ "mrr_at_3": 0.145108,
39
+ "mrr_at_5": 0.157048,
40
+ "mrr_at_10": 0.165174,
41
+ "mrr_at_20": 0.170021,
42
+ "mrr_at_100": 0.175019,
43
+ "mrr_at_1000": 0.17589,
44
+ "nauc_ndcg_at_1_max": 0.035539,
45
+ "nauc_ndcg_at_1_std": -0.066001,
46
+ "nauc_ndcg_at_1_diff1": 0.346734,
47
+ "nauc_ndcg_at_3_max": 0.082649,
48
+ "nauc_ndcg_at_3_std": -0.012721,
49
+ "nauc_ndcg_at_3_diff1": 0.240017,
50
+ "nauc_ndcg_at_5_max": 0.066341,
51
+ "nauc_ndcg_at_5_std": -0.000653,
52
+ "nauc_ndcg_at_5_diff1": 0.212013,
53
+ "nauc_ndcg_at_10_max": 0.046896,
54
+ "nauc_ndcg_at_10_std": 0.013313,
55
+ "nauc_ndcg_at_10_diff1": 0.197846,
56
+ "nauc_ndcg_at_20_max": 0.041339,
57
+ "nauc_ndcg_at_20_std": 0.010412,
58
+ "nauc_ndcg_at_20_diff1": 0.209614,
59
+ "nauc_ndcg_at_100_max": 0.052779,
60
+ "nauc_ndcg_at_100_std": 0.035033,
61
+ "nauc_ndcg_at_100_diff1": 0.20516,
62
+ "nauc_ndcg_at_1000_max": 0.072392,
63
+ "nauc_ndcg_at_1000_std": 0.044832,
64
+ "nauc_ndcg_at_1000_diff1": 0.207827,
65
+ "nauc_map_at_1_max": 0.048607,
66
+ "nauc_map_at_1_std": -0.040466,
67
+ "nauc_map_at_1_diff1": 0.338968,
68
+ "nauc_map_at_3_max": 0.081374,
69
+ "nauc_map_at_3_std": -0.009001,
70
+ "nauc_map_at_3_diff1": 0.257624,
71
+ "nauc_map_at_5_max": 0.069913,
72
+ "nauc_map_at_5_std": -0.004754,
73
+ "nauc_map_at_5_diff1": 0.241293,
74
+ "nauc_map_at_10_max": 0.060959,
75
+ "nauc_map_at_10_std": 0.00085,
76
+ "nauc_map_at_10_diff1": 0.234341,
77
+ "nauc_map_at_20_max": 0.058979,
78
+ "nauc_map_at_20_std": 0.001188,
79
+ "nauc_map_at_20_diff1": 0.237876,
80
+ "nauc_map_at_100_max": 0.061316,
81
+ "nauc_map_at_100_std": 0.005836,
82
+ "nauc_map_at_100_diff1": 0.235731,
83
+ "nauc_map_at_1000_max": 0.061791,
84
+ "nauc_map_at_1000_std": 0.0063,
85
+ "nauc_map_at_1000_diff1": 0.236068,
86
+ "nauc_recall_at_1_max": 0.048607,
87
+ "nauc_recall_at_1_std": -0.040466,
88
+ "nauc_recall_at_1_diff1": 0.338968,
89
+ "nauc_recall_at_3_max": 0.099049,
90
+ "nauc_recall_at_3_std": 0.01302,
91
+ "nauc_recall_at_3_diff1": 0.18182,
92
+ "nauc_recall_at_5_max": 0.061892,
93
+ "nauc_recall_at_5_std": 0.023514,
94
+ "nauc_recall_at_5_diff1": 0.133834,
95
+ "nauc_recall_at_10_max": 0.01634,
96
+ "nauc_recall_at_10_std": 0.046696,
97
+ "nauc_recall_at_10_diff1": 0.105445,
98
+ "nauc_recall_at_20_max": 0.003255,
99
+ "nauc_recall_at_20_std": 0.03097,
100
+ "nauc_recall_at_20_diff1": 0.1421,
101
+ "nauc_recall_at_100_max": 0.041555,
102
+ "nauc_recall_at_100_std": 0.11143,
103
+ "nauc_recall_at_100_diff1": 0.130935,
104
+ "nauc_recall_at_1000_max": 0.201615,
105
+ "nauc_recall_at_1000_std": 0.236717,
106
+ "nauc_recall_at_1000_diff1": 0.123617,
107
+ "nauc_precision_at_1_max": 0.035539,
108
+ "nauc_precision_at_1_std": -0.066001,
109
+ "nauc_precision_at_1_diff1": 0.346734,
110
+ "nauc_precision_at_3_max": 0.070225,
111
+ "nauc_precision_at_3_std": -0.017361,
112
+ "nauc_precision_at_3_diff1": 0.189936,
113
+ "nauc_precision_at_5_max": 0.044309,
114
+ "nauc_precision_at_5_std": 0.00283,
115
+ "nauc_precision_at_5_diff1": 0.147291,
116
+ "nauc_precision_at_10_max": 0.019352,
117
+ "nauc_precision_at_10_std": 0.032872,
118
+ "nauc_precision_at_10_diff1": 0.13356,
119
+ "nauc_precision_at_20_max": 0.006863,
120
+ "nauc_precision_at_20_std": 0.031409,
121
+ "nauc_precision_at_20_diff1": 0.148412,
122
+ "nauc_precision_at_100_max": 0.03604,
123
+ "nauc_precision_at_100_std": 0.10396,
124
+ "nauc_precision_at_100_diff1": 0.109082,
125
+ "nauc_precision_at_1000_max": 0.06276,
126
+ "nauc_precision_at_1000_std": 0.07571,
127
+ "nauc_precision_at_1000_diff1": 0.027498,
128
+ "nauc_mrr_at_1_max": 0.035539,
129
+ "nauc_mrr_at_1_std": -0.066001,
130
+ "nauc_mrr_at_1_diff1": 0.346734,
131
+ "nauc_mrr_at_3_max": 0.06389,
132
+ "nauc_mrr_at_3_std": -0.033179,
133
+ "nauc_mrr_at_3_diff1": 0.267324,
134
+ "nauc_mrr_at_5_max": 0.058111,
135
+ "nauc_mrr_at_5_std": -0.026318,
136
+ "nauc_mrr_at_5_diff1": 0.251994,
137
+ "nauc_mrr_at_10_max": 0.049563,
138
+ "nauc_mrr_at_10_std": -0.021556,
139
+ "nauc_mrr_at_10_diff1": 0.243018,
140
+ "nauc_mrr_at_20_max": 0.048145,
141
+ "nauc_mrr_at_20_std": -0.022164,
142
+ "nauc_mrr_at_20_diff1": 0.246568,
143
+ "nauc_mrr_at_100_max": 0.048751,
144
+ "nauc_mrr_at_100_std": -0.019374,
145
+ "nauc_mrr_at_100_diff1": 0.245814,
146
+ "nauc_mrr_at_1000_max": 0.049219,
147
+ "nauc_mrr_at_1000_std": -0.019199,
148
+ "nauc_mrr_at_1000_diff1": 0.245871,
149
+ "hit_rate_at_1": 0.10572,
150
+ "hit_rate_at_3": 0.19652,
151
+ "hit_rate_at_5": 0.24876,
152
+ "hit_rate_at_10": 0.30846,
153
+ "hit_rate_at_20": 0.37811,
154
+ "hit_rate_at_100": 0.58955,
155
+ "hit_rate_at_1000": 0.80597,
156
+ "main_score": 0.16976,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 23.938961267471313,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackPhysicsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4",
3
+ "task_name": "CQADupstackPhysicsRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.22233,
9
+ "ndcg_at_3": 0.24965,
10
+ "ndcg_at_5": 0.268,
11
+ "ndcg_at_10": 0.29323,
12
+ "ndcg_at_20": 0.31578,
13
+ "ndcg_at_100": 0.35244,
14
+ "ndcg_at_1000": 0.38397,
15
+ "map_at_1": 0.18341,
16
+ "map_at_3": 0.22444,
17
+ "map_at_5": 0.23609,
18
+ "map_at_10": 0.2477,
19
+ "map_at_20": 0.25471,
20
+ "map_at_100": 0.26041,
21
+ "map_at_1000": 0.26184,
22
+ "recall_at_1": 0.18341,
23
+ "recall_at_3": 0.26973,
24
+ "recall_at_5": 0.3152,
25
+ "recall_at_10": 0.39051,
26
+ "recall_at_20": 0.47112,
27
+ "recall_at_100": 0.64664,
28
+ "recall_at_1000": 0.86364,
29
+ "accuracy": 0.18341,
30
+ "precision_at_1": 0.22233,
31
+ "precision_at_3": 0.11614,
32
+ "precision_at_5": 0.08412,
33
+ "precision_at_10": 0.05438,
34
+ "precision_at_20": 0.03402,
35
+ "precision_at_100": 0.01,
36
+ "precision_at_1000": 0.00147,
37
+ "mrr_at_1": 0.222329,
38
+ "mrr_at_3": 0.26548,
39
+ "mrr_at_5": 0.27804,
40
+ "mrr_at_10": 0.289703,
41
+ "mrr_at_20": 0.295476,
42
+ "mrr_at_100": 0.299849,
43
+ "mrr_at_1000": 0.300663,
44
+ "nauc_ndcg_at_1_max": 0.201276,
45
+ "nauc_ndcg_at_1_std": -0.029582,
46
+ "nauc_ndcg_at_1_diff1": 0.4991,
47
+ "nauc_ndcg_at_3_max": 0.179387,
48
+ "nauc_ndcg_at_3_std": -0.016129,
49
+ "nauc_ndcg_at_3_diff1": 0.472268,
50
+ "nauc_ndcg_at_5_max": 0.167982,
51
+ "nauc_ndcg_at_5_std": -0.026608,
52
+ "nauc_ndcg_at_5_diff1": 0.461672,
53
+ "nauc_ndcg_at_10_max": 0.165005,
54
+ "nauc_ndcg_at_10_std": -0.026401,
55
+ "nauc_ndcg_at_10_diff1": 0.452322,
56
+ "nauc_ndcg_at_20_max": 0.160991,
57
+ "nauc_ndcg_at_20_std": -0.018504,
58
+ "nauc_ndcg_at_20_diff1": 0.441476,
59
+ "nauc_ndcg_at_100_max": 0.165914,
60
+ "nauc_ndcg_at_100_std": -0.001247,
61
+ "nauc_ndcg_at_100_diff1": 0.434873,
62
+ "nauc_ndcg_at_1000_max": 0.179617,
63
+ "nauc_ndcg_at_1000_std": 0.010176,
64
+ "nauc_ndcg_at_1000_diff1": 0.441244,
65
+ "nauc_map_at_1_max": 0.166622,
66
+ "nauc_map_at_1_std": -0.0761,
67
+ "nauc_map_at_1_diff1": 0.537635,
68
+ "nauc_map_at_3_max": 0.16765,
69
+ "nauc_map_at_3_std": -0.039344,
70
+ "nauc_map_at_3_diff1": 0.495728,
71
+ "nauc_map_at_5_max": 0.163376,
72
+ "nauc_map_at_5_std": -0.04169,
73
+ "nauc_map_at_5_diff1": 0.487016,
74
+ "nauc_map_at_10_max": 0.165644,
75
+ "nauc_map_at_10_std": -0.038855,
76
+ "nauc_map_at_10_diff1": 0.47963,
77
+ "nauc_map_at_20_max": 0.165426,
78
+ "nauc_map_at_20_std": -0.036266,
79
+ "nauc_map_at_20_diff1": 0.475772,
80
+ "nauc_map_at_100_max": 0.166014,
81
+ "nauc_map_at_100_std": -0.032592,
82
+ "nauc_map_at_100_diff1": 0.474607,
83
+ "nauc_map_at_1000_max": 0.16687,
84
+ "nauc_map_at_1000_std": -0.031821,
85
+ "nauc_map_at_1000_diff1": 0.47487,
86
+ "nauc_recall_at_1_max": 0.166622,
87
+ "nauc_recall_at_1_std": -0.0761,
88
+ "nauc_recall_at_1_diff1": 0.537635,
89
+ "nauc_recall_at_3_max": 0.155142,
90
+ "nauc_recall_at_3_std": -0.014697,
91
+ "nauc_recall_at_3_diff1": 0.449945,
92
+ "nauc_recall_at_5_max": 0.140368,
93
+ "nauc_recall_at_5_std": -0.025487,
94
+ "nauc_recall_at_5_diff1": 0.416838,
95
+ "nauc_recall_at_10_max": 0.129362,
96
+ "nauc_recall_at_10_std": -0.01817,
97
+ "nauc_recall_at_10_diff1": 0.373792,
98
+ "nauc_recall_at_20_max": 0.109971,
99
+ "nauc_recall_at_20_std": 0.011553,
100
+ "nauc_recall_at_20_diff1": 0.331395,
101
+ "nauc_recall_at_100_max": 0.119725,
102
+ "nauc_recall_at_100_std": 0.091372,
103
+ "nauc_recall_at_100_diff1": 0.27125,
104
+ "nauc_recall_at_1000_max": 0.264439,
105
+ "nauc_recall_at_1000_std": 0.353581,
106
+ "nauc_recall_at_1000_diff1": 0.234865,
107
+ "nauc_precision_at_1_max": 0.201276,
108
+ "nauc_precision_at_1_std": -0.029582,
109
+ "nauc_precision_at_1_diff1": 0.4991,
110
+ "nauc_precision_at_3_max": 0.219021,
111
+ "nauc_precision_at_3_std": 0.062318,
112
+ "nauc_precision_at_3_diff1": 0.393597,
113
+ "nauc_precision_at_5_max": 0.190552,
114
+ "nauc_precision_at_5_std": 0.050071,
115
+ "nauc_precision_at_5_diff1": 0.333865,
116
+ "nauc_precision_at_10_max": 0.19514,
117
+ "nauc_precision_at_10_std": 0.063411,
118
+ "nauc_precision_at_10_diff1": 0.270428,
119
+ "nauc_precision_at_20_max": 0.165647,
120
+ "nauc_precision_at_20_std": 0.079055,
121
+ "nauc_precision_at_20_diff1": 0.19872,
122
+ "nauc_precision_at_100_max": 0.143269,
123
+ "nauc_precision_at_100_std": 0.135741,
124
+ "nauc_precision_at_100_diff1": 0.059613,
125
+ "nauc_precision_at_1000_max": 0.128309,
126
+ "nauc_precision_at_1000_std": 0.137437,
127
+ "nauc_precision_at_1000_diff1": -0.052306,
128
+ "nauc_mrr_at_1_max": 0.201276,
129
+ "nauc_mrr_at_1_std": -0.029582,
130
+ "nauc_mrr_at_1_diff1": 0.4991,
131
+ "nauc_mrr_at_3_max": 0.198438,
132
+ "nauc_mrr_at_3_std": -0.003653,
133
+ "nauc_mrr_at_3_diff1": 0.465709,
134
+ "nauc_mrr_at_5_max": 0.194702,
135
+ "nauc_mrr_at_5_std": -0.008157,
136
+ "nauc_mrr_at_5_diff1": 0.458291,
137
+ "nauc_mrr_at_10_max": 0.192266,
138
+ "nauc_mrr_at_10_std": -0.01016,
139
+ "nauc_mrr_at_10_diff1": 0.4562,
140
+ "nauc_mrr_at_20_max": 0.190214,
141
+ "nauc_mrr_at_20_std": -0.008684,
142
+ "nauc_mrr_at_20_diff1": 0.453594,
143
+ "nauc_mrr_at_100_max": 0.190642,
144
+ "nauc_mrr_at_100_std": -0.006897,
145
+ "nauc_mrr_at_100_diff1": 0.453532,
146
+ "nauc_mrr_at_1000_max": 0.190939,
147
+ "nauc_mrr_at_1000_std": -0.006472,
148
+ "nauc_mrr_at_1000_diff1": 0.453718,
149
+ "hit_rate_at_1": 0.22233,
150
+ "hit_rate_at_3": 0.32146,
151
+ "hit_rate_at_5": 0.37632,
152
+ "hit_rate_at_10": 0.46295,
153
+ "hit_rate_at_20": 0.5486,
154
+ "hit_rate_at_100": 0.71992,
155
+ "hit_rate_at_1000": 0.90375,
156
+ "main_score": 0.29323,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 46.436378955841064,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackProgrammersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6184bc1440d2dbc7612be22b50686b8826d22b32",
3
+ "task_name": "CQADupstackProgrammersRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.20091,
9
+ "ndcg_at_3": 0.23371,
10
+ "ndcg_at_5": 0.24978,
11
+ "ndcg_at_10": 0.2728,
12
+ "ndcg_at_20": 0.291,
13
+ "ndcg_at_100": 0.32582,
14
+ "ndcg_at_1000": 0.359,
15
+ "map_at_1": 0.16402,
16
+ "map_at_3": 0.20711,
17
+ "map_at_5": 0.21792,
18
+ "map_at_10": 0.22852,
19
+ "map_at_20": 0.234,
20
+ "map_at_100": 0.2393,
21
+ "map_at_1000": 0.24072,
22
+ "recall_at_1": 0.16402,
23
+ "recall_at_3": 0.26045,
24
+ "recall_at_5": 0.30106,
25
+ "recall_at_10": 0.36792,
26
+ "recall_at_20": 0.43212,
27
+ "recall_at_100": 0.60173,
28
+ "recall_at_1000": 0.83379,
29
+ "accuracy": 0.16402,
30
+ "precision_at_1": 0.20091,
31
+ "precision_at_3": 0.11035,
32
+ "precision_at_5": 0.079,
33
+ "precision_at_10": 0.05034,
34
+ "precision_at_20": 0.03065,
35
+ "precision_at_100": 0.0091,
36
+ "precision_at_1000": 0.00137,
37
+ "mrr_at_1": 0.200913,
38
+ "mrr_at_3": 0.248668,
39
+ "mrr_at_5": 0.260654,
40
+ "mrr_at_10": 0.269936,
41
+ "mrr_at_20": 0.274882,
42
+ "mrr_at_100": 0.279058,
43
+ "mrr_at_1000": 0.279974,
44
+ "nauc_ndcg_at_1_max": 0.249942,
45
+ "nauc_ndcg_at_1_std": 0.006952,
46
+ "nauc_ndcg_at_1_diff1": 0.453221,
47
+ "nauc_ndcg_at_3_max": 0.241441,
48
+ "nauc_ndcg_at_3_std": 0.026653,
49
+ "nauc_ndcg_at_3_diff1": 0.438629,
50
+ "nauc_ndcg_at_5_max": 0.256313,
51
+ "nauc_ndcg_at_5_std": 0.033406,
52
+ "nauc_ndcg_at_5_diff1": 0.431925,
53
+ "nauc_ndcg_at_10_max": 0.25869,
54
+ "nauc_ndcg_at_10_std": 0.048267,
55
+ "nauc_ndcg_at_10_diff1": 0.410788,
56
+ "nauc_ndcg_at_20_max": 0.259267,
57
+ "nauc_ndcg_at_20_std": 0.065314,
58
+ "nauc_ndcg_at_20_diff1": 0.399515,
59
+ "nauc_ndcg_at_100_max": 0.284826,
60
+ "nauc_ndcg_at_100_std": 0.091203,
61
+ "nauc_ndcg_at_100_diff1": 0.398443,
62
+ "nauc_ndcg_at_1000_max": 0.284396,
63
+ "nauc_ndcg_at_1000_std": 0.095076,
64
+ "nauc_ndcg_at_1000_diff1": 0.40235,
65
+ "nauc_map_at_1_max": 0.229025,
66
+ "nauc_map_at_1_std": -0.007523,
67
+ "nauc_map_at_1_diff1": 0.482887,
68
+ "nauc_map_at_3_max": 0.23603,
69
+ "nauc_map_at_3_std": 0.01759,
70
+ "nauc_map_at_3_diff1": 0.456063,
71
+ "nauc_map_at_5_max": 0.249484,
72
+ "nauc_map_at_5_std": 0.023747,
73
+ "nauc_map_at_5_diff1": 0.452145,
74
+ "nauc_map_at_10_max": 0.25236,
75
+ "nauc_map_at_10_std": 0.031687,
76
+ "nauc_map_at_10_diff1": 0.440764,
77
+ "nauc_map_at_20_max": 0.253089,
78
+ "nauc_map_at_20_std": 0.036746,
79
+ "nauc_map_at_20_diff1": 0.437358,
80
+ "nauc_map_at_100_max": 0.258283,
81
+ "nauc_map_at_100_std": 0.041129,
82
+ "nauc_map_at_100_diff1": 0.437083,
83
+ "nauc_map_at_1000_max": 0.258466,
84
+ "nauc_map_at_1000_std": 0.041458,
85
+ "nauc_map_at_1000_diff1": 0.437081,
86
+ "nauc_recall_at_1_max": 0.229025,
87
+ "nauc_recall_at_1_std": -0.007523,
88
+ "nauc_recall_at_1_diff1": 0.482887,
89
+ "nauc_recall_at_3_max": 0.225156,
90
+ "nauc_recall_at_3_std": 0.034623,
91
+ "nauc_recall_at_3_diff1": 0.408622,
92
+ "nauc_recall_at_5_max": 0.253339,
93
+ "nauc_recall_at_5_std": 0.049004,
94
+ "nauc_recall_at_5_diff1": 0.383184,
95
+ "nauc_recall_at_10_max": 0.247076,
96
+ "nauc_recall_at_10_std": 0.083886,
97
+ "nauc_recall_at_10_diff1": 0.323209,
98
+ "nauc_recall_at_20_max": 0.242976,
99
+ "nauc_recall_at_20_std": 0.136018,
100
+ "nauc_recall_at_20_diff1": 0.283903,
101
+ "nauc_recall_at_100_max": 0.3418,
102
+ "nauc_recall_at_100_std": 0.267522,
103
+ "nauc_recall_at_100_diff1": 0.25846,
104
+ "nauc_recall_at_1000_max": 0.416446,
105
+ "nauc_recall_at_1000_std": 0.493094,
106
+ "nauc_recall_at_1000_diff1": 0.225555,
107
+ "nauc_precision_at_1_max": 0.249942,
108
+ "nauc_precision_at_1_std": 0.006952,
109
+ "nauc_precision_at_1_diff1": 0.453221,
110
+ "nauc_precision_at_3_max": 0.263711,
111
+ "nauc_precision_at_3_std": 0.058322,
112
+ "nauc_precision_at_3_diff1": 0.393448,
113
+ "nauc_precision_at_5_max": 0.302879,
114
+ "nauc_precision_at_5_std": 0.073869,
115
+ "nauc_precision_at_5_diff1": 0.371663,
116
+ "nauc_precision_at_10_max": 0.308058,
117
+ "nauc_precision_at_10_std": 0.11579,
118
+ "nauc_precision_at_10_diff1": 0.28148,
119
+ "nauc_precision_at_20_max": 0.279141,
120
+ "nauc_precision_at_20_std": 0.153639,
121
+ "nauc_precision_at_20_diff1": 0.214156,
122
+ "nauc_precision_at_100_max": 0.29061,
123
+ "nauc_precision_at_100_std": 0.202333,
124
+ "nauc_precision_at_100_diff1": 0.117873,
125
+ "nauc_precision_at_1000_max": 0.105752,
126
+ "nauc_precision_at_1000_std": 0.112284,
127
+ "nauc_precision_at_1000_diff1": -0.014649,
128
+ "nauc_mrr_at_1_max": 0.249942,
129
+ "nauc_mrr_at_1_std": 0.006952,
130
+ "nauc_mrr_at_1_diff1": 0.453221,
131
+ "nauc_mrr_at_3_max": 0.248904,
132
+ "nauc_mrr_at_3_std": 0.02546,
133
+ "nauc_mrr_at_3_diff1": 0.431861,
134
+ "nauc_mrr_at_5_max": 0.255427,
135
+ "nauc_mrr_at_5_std": 0.029409,
136
+ "nauc_mrr_at_5_diff1": 0.423598,
137
+ "nauc_mrr_at_10_max": 0.253892,
138
+ "nauc_mrr_at_10_std": 0.03343,
139
+ "nauc_mrr_at_10_diff1": 0.417139,
140
+ "nauc_mrr_at_20_max": 0.253093,
141
+ "nauc_mrr_at_20_std": 0.037446,
142
+ "nauc_mrr_at_20_diff1": 0.413098,
143
+ "nauc_mrr_at_100_max": 0.25553,
144
+ "nauc_mrr_at_100_std": 0.039507,
145
+ "nauc_mrr_at_100_diff1": 0.413141,
146
+ "nauc_mrr_at_1000_max": 0.255436,
147
+ "nauc_mrr_at_1000_std": 0.039336,
148
+ "nauc_mrr_at_1000_diff1": 0.413364,
149
+ "hit_rate_at_1": 0.20091,
150
+ "hit_rate_at_3": 0.31507,
151
+ "hit_rate_at_5": 0.36644,
152
+ "hit_rate_at_10": 0.43721,
153
+ "hit_rate_at_20": 0.50571,
154
+ "hit_rate_at_100": 0.67237,
155
+ "hit_rate_at_1000": 0.88128,
156
+ "main_score": 0.2728,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 40.26266860961914,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackRetrieval.json ADDED
@@ -0,0 +1,20 @@
1
+ {
2
+ "dataset_revision": "1",
3
+ "task_name": "CQADupstackRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_10": 0.248232,
9
+ "main_score": 0.248232,
10
+ "hf_subset": "default",
11
+ "languages": [
12
+ "eng-Latn"
13
+ ]
14
+ }
15
+ ]
16
+ },
17
+ "evaluation_time": 578.8082549571991,
18
+ "kg_co2_emissions": NaN,
19
+ "date": 1775181024.480352
20
+ }
results/CQADupstackStatsRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "65ac3a16b8e91f9cee4c9828cc7c335575432a2a",
3
+ "task_name": "CQADupstackStatsRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.16871,
9
+ "ndcg_at_3": 0.19488,
10
+ "ndcg_at_5": 0.20896,
11
+ "ndcg_at_10": 0.2259,
12
+ "ndcg_at_20": 0.24396,
13
+ "ndcg_at_100": 0.26608,
14
+ "ndcg_at_1000": 0.29285,
15
+ "map_at_1": 0.15265,
16
+ "map_at_3": 0.18022,
17
+ "map_at_5": 0.18851,
18
+ "map_at_10": 0.19596,
19
+ "map_at_20": 0.20105,
20
+ "map_at_100": 0.20409,
21
+ "map_at_1000": 0.20493,
22
+ "recall_at_1": 0.15265,
23
+ "recall_at_3": 0.21437,
24
+ "recall_at_5": 0.24889,
25
+ "recall_at_10": 0.29985,
26
+ "recall_at_20": 0.36793,
27
+ "recall_at_100": 0.48433,
28
+ "recall_at_1000": 0.69076,
29
+ "accuracy": 0.15265,
30
+ "precision_at_1": 0.16871,
31
+ "precision_at_3": 0.08231,
32
+ "precision_at_5": 0.0589,
33
+ "precision_at_10": 0.0362,
34
+ "precision_at_20": 0.02239,
35
+ "precision_at_100": 0.00595,
36
+ "precision_at_1000": 0.00089,
37
+ "mrr_at_1": 0.168712,
38
+ "mrr_at_3": 0.199131,
39
+ "mrr_at_5": 0.208103,
40
+ "mrr_at_10": 0.215182,
41
+ "mrr_at_20": 0.220219,
42
+ "mrr_at_100": 0.22282,
43
+ "mrr_at_1000": 0.223636,
44
+ "nauc_ndcg_at_1_max": 0.193945,
45
+ "nauc_ndcg_at_1_std": 0.02935,
46
+ "nauc_ndcg_at_1_diff1": 0.541394,
47
+ "nauc_ndcg_at_3_max": 0.151446,
48
+ "nauc_ndcg_at_3_std": 0.044387,
49
+ "nauc_ndcg_at_3_diff1": 0.458819,
50
+ "nauc_ndcg_at_5_max": 0.139013,
51
+ "nauc_ndcg_at_5_std": 0.047854,
52
+ "nauc_ndcg_at_5_diff1": 0.438846,
53
+ "nauc_ndcg_at_10_max": 0.140994,
54
+ "nauc_ndcg_at_10_std": 0.053065,
55
+ "nauc_ndcg_at_10_diff1": 0.429048,
56
+ "nauc_ndcg_at_20_max": 0.153375,
57
+ "nauc_ndcg_at_20_std": 0.077596,
58
+ "nauc_ndcg_at_20_diff1": 0.412458,
59
+ "nauc_ndcg_at_100_max": 0.147931,
60
+ "nauc_ndcg_at_100_std": 0.097923,
61
+ "nauc_ndcg_at_100_diff1": 0.404013,
62
+ "nauc_ndcg_at_1000_max": 0.149112,
63
+ "nauc_ndcg_at_1000_std": 0.10332,
64
+ "nauc_ndcg_at_1000_diff1": 0.403942,
65
+ "nauc_map_at_1_max": 0.178521,
66
+ "nauc_map_at_1_std": 0.005095,
67
+ "nauc_map_at_1_diff1": 0.553877,
68
+ "nauc_map_at_3_max": 0.155793,
69
+ "nauc_map_at_3_std": 0.026135,
70
+ "nauc_map_at_3_diff1": 0.480951,
71
+ "nauc_map_at_5_max": 0.150277,
72
+ "nauc_map_at_5_std": 0.031667,
73
+ "nauc_map_at_5_diff1": 0.470359,
74
+ "nauc_map_at_10_max": 0.151083,
75
+ "nauc_map_at_10_std": 0.035986,
76
+ "nauc_map_at_10_diff1": 0.466097,
77
+ "nauc_map_at_20_max": 0.155653,
78
+ "nauc_map_at_20_std": 0.044281,
79
+ "nauc_map_at_20_diff1": 0.461418,
80
+ "nauc_map_at_100_max": 0.154659,
81
+ "nauc_map_at_100_std": 0.047142,
82
+ "nauc_map_at_100_diff1": 0.459857,
83
+ "nauc_map_at_1000_max": 0.154592,
84
+ "nauc_map_at_1000_std": 0.047534,
85
+ "nauc_map_at_1000_diff1": 0.459733,
86
+ "nauc_recall_at_1_max": 0.178521,
87
+ "nauc_recall_at_1_std": 0.005095,
88
+ "nauc_recall_at_1_diff1": 0.553877,
89
+ "nauc_recall_at_3_max": 0.122084,
90
+ "nauc_recall_at_3_std": 0.049333,
91
+ "nauc_recall_at_3_diff1": 0.399057,
92
+ "nauc_recall_at_5_max": 0.098131,
93
+ "nauc_recall_at_5_std": 0.060723,
94
+ "nauc_recall_at_5_diff1": 0.362893,
95
+ "nauc_recall_at_10_max": 0.100535,
96
+ "nauc_recall_at_10_std": 0.07292,
97
+ "nauc_recall_at_10_diff1": 0.334556,
98
+ "nauc_recall_at_20_max": 0.135992,
99
+ "nauc_recall_at_20_std": 0.140312,
100
+ "nauc_recall_at_20_diff1": 0.283028,
101
+ "nauc_recall_at_100_max": 0.107972,
102
+ "nauc_recall_at_100_std": 0.233417,
103
+ "nauc_recall_at_100_diff1": 0.24343,
104
+ "nauc_recall_at_1000_max": 0.097857,
105
+ "nauc_recall_at_1000_std": 0.315715,
106
+ "nauc_recall_at_1000_diff1": 0.187012,
107
+ "nauc_precision_at_1_max": 0.193945,
108
+ "nauc_precision_at_1_std": 0.02935,
109
+ "nauc_precision_at_1_diff1": 0.541394,
110
+ "nauc_precision_at_3_max": 0.149576,
111
+ "nauc_precision_at_3_std": 0.098451,
112
+ "nauc_precision_at_3_diff1": 0.383209,
113
+ "nauc_precision_at_5_max": 0.1297,
114
+ "nauc_precision_at_5_std": 0.126374,
115
+ "nauc_precision_at_5_diff1": 0.345276,
116
+ "nauc_precision_at_10_max": 0.144564,
117
+ "nauc_precision_at_10_std": 0.145789,
118
+ "nauc_precision_at_10_diff1": 0.336662,
119
+ "nauc_precision_at_20_max": 0.178016,
120
+ "nauc_precision_at_20_std": 0.226978,
121
+ "nauc_precision_at_20_diff1": 0.272527,
122
+ "nauc_precision_at_100_max": 0.157262,
123
+ "nauc_precision_at_100_std": 0.290269,
124
+ "nauc_precision_at_100_diff1": 0.213127,
125
+ "nauc_precision_at_1000_max": 0.157621,
126
+ "nauc_precision_at_1000_std": 0.284239,
127
+ "nauc_precision_at_1000_diff1": 0.129499,
128
+ "nauc_mrr_at_1_max": 0.193945,
129
+ "nauc_mrr_at_1_std": 0.02935,
130
+ "nauc_mrr_at_1_diff1": 0.541394,
131
+ "nauc_mrr_at_3_max": 0.167649,
132
+ "nauc_mrr_at_3_std": 0.0487,
133
+ "nauc_mrr_at_3_diff1": 0.475028,
134
+ "nauc_mrr_at_5_max": 0.16236,
135
+ "nauc_mrr_at_5_std": 0.052895,
136
+ "nauc_mrr_at_5_diff1": 0.458277,
137
+ "nauc_mrr_at_10_max": 0.162769,
138
+ "nauc_mrr_at_10_std": 0.054693,
139
+ "nauc_mrr_at_10_diff1": 0.454196,
140
+ "nauc_mrr_at_20_max": 0.165998,
141
+ "nauc_mrr_at_20_std": 0.062133,
142
+ "nauc_mrr_at_20_diff1": 0.448636,
143
+ "nauc_mrr_at_100_max": 0.165365,
144
+ "nauc_mrr_at_100_std": 0.064113,
145
+ "nauc_mrr_at_100_diff1": 0.447898,
146
+ "nauc_mrr_at_1000_max": 0.165347,
147
+ "nauc_mrr_at_1000_std": 0.064182,
148
+ "nauc_mrr_at_1000_diff1": 0.448013,
149
+ "hit_rate_at_1": 0.16871,
150
+ "hit_rate_at_3": 0.23926,
151
+ "hit_rate_at_5": 0.27914,
152
+ "hit_rate_at_10": 0.33282,
153
+ "hit_rate_at_20": 0.40491,
154
+ "hit_rate_at_100": 0.5184,
155
+ "hit_rate_at_1000": 0.7362,
156
+ "main_score": 0.2259,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 52.10785746574402,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackTexRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "46989137a86843e03a6195de44b09deda022eec7",
3
+ "task_name": "CQADupstackTexRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.12629,
9
+ "ndcg_at_3": 0.14518,
10
+ "ndcg_at_5": 0.15614,
11
+ "ndcg_at_10": 0.17047,
12
+ "ndcg_at_20": 0.18392,
13
+ "ndcg_at_100": 0.21056,
14
+ "ndcg_at_1000": 0.24332,
15
+ "map_at_1": 0.10346,
16
+ "map_at_3": 0.12901,
17
+ "map_at_5": 0.13609,
18
+ "map_at_10": 0.14233,
19
+ "map_at_20": 0.14617,
20
+ "map_at_100": 0.14984,
21
+ "map_at_1000": 0.15097,
22
+ "recall_at_1": 0.10346,
23
+ "recall_at_3": 0.1592,
24
+ "recall_at_5": 0.18755,
25
+ "recall_at_10": 0.23018,
26
+ "recall_at_20": 0.28056,
27
+ "recall_at_100": 0.41668,
28
+ "recall_at_1000": 0.65987,
29
+ "accuracy": 0.10346,
30
+ "precision_at_1": 0.12629,
31
+ "precision_at_3": 0.06733,
32
+ "precision_at_5": 0.04859,
33
+ "precision_at_10": 0.03066,
34
+ "precision_at_20": 0.01901,
35
+ "precision_at_100": 0.00592,
36
+ "precision_at_1000": 0.00103,
37
+ "mrr_at_1": 0.12629,
38
+ "mrr_at_3": 0.154451,
39
+ "mrr_at_5": 0.161866,
40
+ "mrr_at_10": 0.168962,
41
+ "mrr_at_20": 0.173027,
42
+ "mrr_at_100": 0.176492,
43
+ "mrr_at_1000": 0.177425,
44
+ "nauc_ndcg_at_1_max": 0.230319,
45
+ "nauc_ndcg_at_1_std": -0.021221,
46
+ "nauc_ndcg_at_1_diff1": 0.418441,
47
+ "nauc_ndcg_at_3_max": 0.201801,
48
+ "nauc_ndcg_at_3_std": 0.001944,
49
+ "nauc_ndcg_at_3_diff1": 0.342979,
50
+ "nauc_ndcg_at_5_max": 0.195268,
51
+ "nauc_ndcg_at_5_std": 0.001553,
52
+ "nauc_ndcg_at_5_diff1": 0.334014,
53
+ "nauc_ndcg_at_10_max": 0.190439,
54
+ "nauc_ndcg_at_10_std": 0.007219,
55
+ "nauc_ndcg_at_10_diff1": 0.322701,
56
+ "nauc_ndcg_at_20_max": 0.194566,
57
+ "nauc_ndcg_at_20_std": 0.018002,
58
+ "nauc_ndcg_at_20_diff1": 0.318386,
59
+ "nauc_ndcg_at_100_max": 0.203695,
60
+ "nauc_ndcg_at_100_std": 0.038701,
61
+ "nauc_ndcg_at_100_diff1": 0.303952,
62
+ "nauc_ndcg_at_1000_max": 0.213091,
63
+ "nauc_ndcg_at_1000_std": 0.052423,
64
+ "nauc_ndcg_at_1000_diff1": 0.300136,
65
+ "nauc_map_at_1_max": 0.212611,
66
+ "nauc_map_at_1_std": -0.014866,
67
+ "nauc_map_at_1_diff1": 0.418644,
68
+ "nauc_map_at_3_max": 0.202392,
69
+ "nauc_map_at_3_std": -0.000545,
70
+ "nauc_map_at_3_diff1": 0.361782,
71
+ "nauc_map_at_5_max": 0.197294,
72
+ "nauc_map_at_5_std": -0.001829,
73
+ "nauc_map_at_5_diff1": 0.354015,
74
+ "nauc_map_at_10_max": 0.195174,
75
+ "nauc_map_at_10_std": 0.00067,
76
+ "nauc_map_at_10_diff1": 0.348065,
77
+ "nauc_map_at_20_max": 0.196803,
78
+ "nauc_map_at_20_std": 0.00405,
79
+ "nauc_map_at_20_diff1": 0.346794,
80
+ "nauc_map_at_100_max": 0.198651,
81
+ "nauc_map_at_100_std": 0.007573,
82
+ "nauc_map_at_100_diff1": 0.344353,
83
+ "nauc_map_at_1000_max": 0.199103,
84
+ "nauc_map_at_1000_std": 0.008273,
85
+ "nauc_map_at_1000_diff1": 0.344289,
86
+ "nauc_recall_at_1_max": 0.212611,
87
+ "nauc_recall_at_1_std": -0.014866,
88
+ "nauc_recall_at_1_diff1": 0.418644,
89
+ "nauc_recall_at_3_max": 0.183492,
90
+ "nauc_recall_at_3_std": 0.011861,
91
+ "nauc_recall_at_3_diff1": 0.306821,
92
+ "nauc_recall_at_5_max": 0.168708,
93
+ "nauc_recall_at_5_std": 0.010617,
94
+ "nauc_recall_at_5_diff1": 0.28616,
95
+ "nauc_recall_at_10_max": 0.156201,
96
+ "nauc_recall_at_10_std": 0.022768,
97
+ "nauc_recall_at_10_diff1": 0.25403,
98
+ "nauc_recall_at_20_max": 0.169179,
99
+ "nauc_recall_at_20_std": 0.053138,
100
+ "nauc_recall_at_20_diff1": 0.243587,
101
+ "nauc_recall_at_100_max": 0.199026,
102
+ "nauc_recall_at_100_std": 0.126662,
103
+ "nauc_recall_at_100_diff1": 0.188811,
104
+ "nauc_recall_at_1000_max": 0.248388,
105
+ "nauc_recall_at_1000_std": 0.237109,
106
+ "nauc_recall_at_1000_diff1": 0.131819,
107
+ "nauc_precision_at_1_max": 0.230319,
108
+ "nauc_precision_at_1_std": -0.021221,
109
+ "nauc_precision_at_1_diff1": 0.418441,
110
+ "nauc_precision_at_3_max": 0.199272,
111
+ "nauc_precision_at_3_std": 0.004546,
112
+ "nauc_precision_at_3_diff1": 0.294366,
113
+ "nauc_precision_at_5_max": 0.19767,
114
+ "nauc_precision_at_5_std": 0.001837,
115
+ "nauc_precision_at_5_diff1": 0.274078,
116
+ "nauc_precision_at_10_max": 0.192538,
117
+ "nauc_precision_at_10_std": 0.019743,
118
+ "nauc_precision_at_10_diff1": 0.243372,
119
+ "nauc_precision_at_20_max": 0.218758,
120
+ "nauc_precision_at_20_std": 0.051382,
121
+ "nauc_precision_at_20_diff1": 0.232076,
122
+ "nauc_precision_at_100_max": 0.24738,
123
+ "nauc_precision_at_100_std": 0.119182,
124
+ "nauc_precision_at_100_diff1": 0.156865,
125
+ "nauc_precision_at_1000_max": 0.269522,
126
+ "nauc_precision_at_1000_std": 0.152014,
127
+ "nauc_precision_at_1000_diff1": 0.075664,
128
+ "nauc_mrr_at_1_max": 0.230319,
129
+ "nauc_mrr_at_1_std": -0.021221,
130
+ "nauc_mrr_at_1_diff1": 0.418441,
131
+ "nauc_mrr_at_3_max": 0.209084,
132
+ "nauc_mrr_at_3_std": -0.006969,
133
+ "nauc_mrr_at_3_diff1": 0.359461,
134
+ "nauc_mrr_at_5_max": 0.207291,
135
+ "nauc_mrr_at_5_std": -0.006225,
136
+ "nauc_mrr_at_5_diff1": 0.353709,
137
+ "nauc_mrr_at_10_max": 0.206325,
138
+ "nauc_mrr_at_10_std": -0.003229,
139
+ "nauc_mrr_at_10_diff1": 0.348398,
140
+ "nauc_mrr_at_20_max": 0.207924,
141
+ "nauc_mrr_at_20_std": 0.000245,
142
+ "nauc_mrr_at_20_diff1": 0.34649,
143
+ "nauc_mrr_at_100_max": 0.208891,
144
+ "nauc_mrr_at_100_std": 0.002827,
145
+ "nauc_mrr_at_100_diff1": 0.344718,
146
+ "nauc_mrr_at_1000_max": 0.208898,
147
+ "nauc_mrr_at_1000_std": 0.003074,
148
+ "nauc_mrr_at_1000_diff1": 0.344713,
149
+ "hit_rate_at_1": 0.12629,
150
+ "hit_rate_at_3": 0.19098,
151
+ "hit_rate_at_5": 0.22368,
152
+ "hit_rate_at_10": 0.27564,
153
+ "hit_rate_at_20": 0.33379,
154
+ "hit_rate_at_100": 0.48107,
155
+ "hit_rate_at_1000": 0.72299,
156
+ "main_score": 0.17047,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 94.38194489479065,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackUnixRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "6c6430d3a6d36f8d2a829195bc5dc94d7e063e53",
3
+ "task_name": "CQADupstackUnixRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.1875,
9
+ "ndcg_at_3": 0.20202,
10
+ "ndcg_at_5": 0.21352,
11
+ "ndcg_at_10": 0.23141,
12
+ "ndcg_at_20": 0.24568,
13
+ "ndcg_at_100": 0.27544,
14
+ "ndcg_at_1000": 0.30995,
15
+ "map_at_1": 0.16082,
16
+ "map_at_3": 0.18667,
17
+ "map_at_5": 0.19371,
18
+ "map_at_10": 0.2012,
19
+ "map_at_20": 0.20492,
20
+ "map_at_100": 0.20918,
21
+ "map_at_1000": 0.21044,
22
+ "recall_at_1": 0.16082,
23
+ "recall_at_3": 0.21627,
24
+ "recall_at_5": 0.24583,
25
+ "recall_at_10": 0.29911,
26
+ "recall_at_20": 0.35199,
27
+ "recall_at_100": 0.50141,
28
+ "recall_at_1000": 0.7554,
29
+ "accuracy": 0.16082,
30
+ "precision_at_1": 0.1875,
31
+ "precision_at_3": 0.08613,
32
+ "precision_at_5": 0.06045,
33
+ "precision_at_10": 0.03731,
34
+ "precision_at_20": 0.02234,
35
+ "precision_at_100": 0.00674,
36
+ "precision_at_1000": 0.00108,
37
+ "mrr_at_1": 0.1875,
38
+ "mrr_at_3": 0.212687,
39
+ "mrr_at_5": 0.220849,
40
+ "mrr_at_10": 0.229029,
41
+ "mrr_at_20": 0.23357,
42
+ "mrr_at_100": 0.237229,
43
+ "mrr_at_1000": 0.238267,
44
+ "nauc_ndcg_at_1_max": 0.235508,
45
+ "nauc_ndcg_at_1_std": -0.00789,
46
+ "nauc_ndcg_at_1_diff1": 0.453197,
47
+ "nauc_ndcg_at_3_max": 0.233247,
48
+ "nauc_ndcg_at_3_std": -0.015264,
49
+ "nauc_ndcg_at_3_diff1": 0.424772,
50
+ "nauc_ndcg_at_5_max": 0.224668,
51
+ "nauc_ndcg_at_5_std": -0.011427,
52
+ "nauc_ndcg_at_5_diff1": 0.416819,
53
+ "nauc_ndcg_at_10_max": 0.220202,
54
+ "nauc_ndcg_at_10_std": -0.010872,
55
+ "nauc_ndcg_at_10_diff1": 0.397325,
56
+ "nauc_ndcg_at_20_max": 0.213576,
57
+ "nauc_ndcg_at_20_std": -0.002254,
58
+ "nauc_ndcg_at_20_diff1": 0.387078,
59
+ "nauc_ndcg_at_100_max": 0.221617,
60
+ "nauc_ndcg_at_100_std": 0.022028,
61
+ "nauc_ndcg_at_100_diff1": 0.373225,
62
+ "nauc_ndcg_at_1000_max": 0.232937,
63
+ "nauc_ndcg_at_1000_std": 0.03737,
64
+ "nauc_ndcg_at_1000_diff1": 0.378613,
65
+ "nauc_map_at_1_max": 0.235139,
66
+ "nauc_map_at_1_std": -0.004266,
67
+ "nauc_map_at_1_diff1": 0.475218,
68
+ "nauc_map_at_3_max": 0.230905,
69
+ "nauc_map_at_3_std": -0.013706,
70
+ "nauc_map_at_3_diff1": 0.442346,
71
+ "nauc_map_at_5_max": 0.227251,
72
+ "nauc_map_at_5_std": -0.010523,
73
+ "nauc_map_at_5_diff1": 0.437152,
74
+ "nauc_map_at_10_max": 0.225256,
75
+ "nauc_map_at_10_std": -0.010791,
76
+ "nauc_map_at_10_diff1": 0.428374,
77
+ "nauc_map_at_20_max": 0.22307,
78
+ "nauc_map_at_20_std": -0.008209,
79
+ "nauc_map_at_20_diff1": 0.425342,
80
+ "nauc_map_at_100_max": 0.224613,
81
+ "nauc_map_at_100_std": -0.004437,
82
+ "nauc_map_at_100_diff1": 0.422874,
83
+ "nauc_map_at_1000_max": 0.225039,
84
+ "nauc_map_at_1000_std": -0.003566,
85
+ "nauc_map_at_1000_diff1": 0.422999,
86
+ "nauc_recall_at_1_max": 0.235139,
87
+ "nauc_recall_at_1_std": -0.004266,
88
+ "nauc_recall_at_1_diff1": 0.475218,
89
+ "nauc_recall_at_3_max": 0.221991,
90
+ "nauc_recall_at_3_std": -0.024864,
91
+ "nauc_recall_at_3_diff1": 0.392695,
92
+ "nauc_recall_at_5_max": 0.20901,
93
+ "nauc_recall_at_5_std": -0.012765,
94
+ "nauc_recall_at_5_diff1": 0.375281,
95
+ "nauc_recall_at_10_max": 0.194324,
96
+ "nauc_recall_at_10_std": -0.013007,
97
+ "nauc_recall_at_10_diff1": 0.322802,
98
+ "nauc_recall_at_20_max": 0.170339,
99
+ "nauc_recall_at_20_std": 0.013403,
100
+ "nauc_recall_at_20_diff1": 0.290312,
101
+ "nauc_recall_at_100_max": 0.194856,
102
+ "nauc_recall_at_100_std": 0.119342,
103
+ "nauc_recall_at_100_diff1": 0.226639,
104
+ "nauc_recall_at_1000_max": 0.279435,
105
+ "nauc_recall_at_1000_std": 0.312857,
106
+ "nauc_recall_at_1000_diff1": 0.226713,
107
+ "nauc_precision_at_1_max": 0.235508,
108
+ "nauc_precision_at_1_std": -0.00789,
109
+ "nauc_precision_at_1_diff1": 0.453197,
110
+ "nauc_precision_at_3_max": 0.232357,
111
+ "nauc_precision_at_3_std": -0.025605,
112
+ "nauc_precision_at_3_diff1": 0.377004,
113
+ "nauc_precision_at_5_max": 0.212915,
114
+ "nauc_precision_at_5_std": -0.011097,
115
+ "nauc_precision_at_5_diff1": 0.346767,
116
+ "nauc_precision_at_10_max": 0.19545,
117
+ "nauc_precision_at_10_std": 0.002039,
118
+ "nauc_precision_at_10_diff1": 0.285917,
119
+ "nauc_precision_at_20_max": 0.178619,
120
+ "nauc_precision_at_20_std": 0.035792,
121
+ "nauc_precision_at_20_diff1": 0.239847,
122
+ "nauc_precision_at_100_max": 0.179317,
123
+ "nauc_precision_at_100_std": 0.142527,
124
+ "nauc_precision_at_100_diff1": 0.127719,
125
+ "nauc_precision_at_1000_max": 0.168149,
126
+ "nauc_precision_at_1000_std": 0.17302,
127
+ "nauc_precision_at_1000_diff1": -0.014407,
128
+ "nauc_mrr_at_1_max": 0.235508,
129
+ "nauc_mrr_at_1_std": -0.00789,
130
+ "nauc_mrr_at_1_diff1": 0.453197,
131
+ "nauc_mrr_at_3_max": 0.233272,
132
+ "nauc_mrr_at_3_std": -0.013997,
133
+ "nauc_mrr_at_3_diff1": 0.417242,
134
+ "nauc_mrr_at_5_max": 0.228978,
135
+ "nauc_mrr_at_5_std": -0.010665,
136
+ "nauc_mrr_at_5_diff1": 0.412509,
137
+ "nauc_mrr_at_10_max": 0.229613,
138
+ "nauc_mrr_at_10_std": -0.009033,
139
+ "nauc_mrr_at_10_diff1": 0.403702,
140
+ "nauc_mrr_at_20_max": 0.227094,
141
+ "nauc_mrr_at_20_std": -0.006685,
142
+ "nauc_mrr_at_20_diff1": 0.399966,
143
+ "nauc_mrr_at_100_max": 0.228246,
144
+ "nauc_mrr_at_100_std": -0.004885,
145
+ "nauc_mrr_at_100_diff1": 0.398215,
146
+ "nauc_mrr_at_1000_max": 0.22854,
147
+ "nauc_mrr_at_1000_std": -0.004488,
148
+ "nauc_mrr_at_1000_diff1": 0.398571,
149
+ "hit_rate_at_1": 0.1875,
150
+ "hit_rate_at_3": 0.24627,
151
+ "hit_rate_at_5": 0.28265,
152
+ "hit_rate_at_10": 0.34515,
153
+ "hit_rate_at_20": 0.41231,
154
+ "hit_rate_at_100": 0.56437,
155
+ "hit_rate_at_1000": 0.81343,
156
+ "main_score": 0.23141,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 61.548882246017456,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWebmastersRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "160c094312a0e1facb97e55eeddb698c0abe3571",
3
+ "task_name": "CQADupstackWebmastersRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.19763,
9
+ "ndcg_at_3": 0.22609,
10
+ "ndcg_at_5": 0.23619,
11
+ "ndcg_at_10": 0.2497,
12
+ "ndcg_at_20": 0.267,
13
+ "ndcg_at_100": 0.29984,
14
+ "ndcg_at_1000": 0.3349,
15
+ "map_at_1": 0.15581,
16
+ "map_at_3": 0.19501,
17
+ "map_at_5": 0.20445,
18
+ "map_at_10": 0.21202,
19
+ "map_at_20": 0.21817,
20
+ "map_at_100": 0.22438,
21
+ "map_at_1000": 0.22631,
22
+ "recall_at_1": 0.15581,
23
+ "recall_at_3": 0.23618,
24
+ "recall_at_5": 0.26621,
25
+ "recall_at_10": 0.30985,
26
+ "recall_at_20": 0.37846,
27
+ "recall_at_100": 0.54288,
28
+ "recall_at_1000": 0.78266,
29
+ "accuracy": 0.15581,
30
+ "precision_at_1": 0.19763,
31
+ "precision_at_3": 0.10935,
32
+ "precision_at_5": 0.07826,
33
+ "precision_at_10": 0.04901,
34
+ "precision_at_20": 0.03182,
35
+ "precision_at_100": 0.01117,
36
+ "precision_at_1000": 0.002,
37
+ "mrr_at_1": 0.197628,
38
+ "mrr_at_3": 0.236495,
39
+ "mrr_at_5": 0.242918,
40
+ "mrr_at_10": 0.249497,
41
+ "mrr_at_20": 0.255018,
42
+ "mrr_at_100": 0.258935,
43
+ "mrr_at_1000": 0.259807,
44
+ "nauc_ndcg_at_1_max": 0.201705,
45
+ "nauc_ndcg_at_1_std": 0.107448,
46
+ "nauc_ndcg_at_1_diff1": 0.493954,
47
+ "nauc_ndcg_at_3_max": 0.184019,
48
+ "nauc_ndcg_at_3_std": 0.101061,
49
+ "nauc_ndcg_at_3_diff1": 0.461407,
50
+ "nauc_ndcg_at_5_max": 0.17386,
51
+ "nauc_ndcg_at_5_std": 0.108166,
52
+ "nauc_ndcg_at_5_diff1": 0.44803,
53
+ "nauc_ndcg_at_10_max": 0.175864,
54
+ "nauc_ndcg_at_10_std": 0.105507,
55
+ "nauc_ndcg_at_10_diff1": 0.4328,
56
+ "nauc_ndcg_at_20_max": 0.167714,
57
+ "nauc_ndcg_at_20_std": 0.121772,
58
+ "nauc_ndcg_at_20_diff1": 0.418709,
59
+ "nauc_ndcg_at_100_max": 0.179637,
60
+ "nauc_ndcg_at_100_std": 0.149747,
61
+ "nauc_ndcg_at_100_diff1": 0.415966,
62
+ "nauc_ndcg_at_1000_max": 0.184758,
63
+ "nauc_ndcg_at_1000_std": 0.153074,
64
+ "nauc_ndcg_at_1000_diff1": 0.41573,
65
+ "nauc_map_at_1_max": 0.227561,
66
+ "nauc_map_at_1_std": 0.089596,
67
+ "nauc_map_at_1_diff1": 0.553489,
68
+ "nauc_map_at_3_max": 0.208453,
69
+ "nauc_map_at_3_std": 0.087576,
70
+ "nauc_map_at_3_diff1": 0.500902,
71
+ "nauc_map_at_5_max": 0.200498,
72
+ "nauc_map_at_5_std": 0.092375,
73
+ "nauc_map_at_5_diff1": 0.48774,
74
+ "nauc_map_at_10_max": 0.201454,
75
+ "nauc_map_at_10_std": 0.095125,
76
+ "nauc_map_at_10_diff1": 0.478641,
77
+ "nauc_map_at_20_max": 0.197037,
78
+ "nauc_map_at_20_std": 0.102911,
79
+ "nauc_map_at_20_diff1": 0.4713,
80
+ "nauc_map_at_100_max": 0.194399,
81
+ "nauc_map_at_100_std": 0.111902,
82
+ "nauc_map_at_100_diff1": 0.468311,
83
+ "nauc_map_at_1000_max": 0.193745,
84
+ "nauc_map_at_1000_std": 0.113011,
85
+ "nauc_map_at_1000_diff1": 0.466866,
86
+ "nauc_recall_at_1_max": 0.227561,
87
+ "nauc_recall_at_1_std": 0.089596,
88
+ "nauc_recall_at_1_diff1": 0.553489,
89
+ "nauc_recall_at_3_max": 0.171597,
90
+ "nauc_recall_at_3_std": 0.076768,
91
+ "nauc_recall_at_3_diff1": 0.436383,
92
+ "nauc_recall_at_5_max": 0.154084,
93
+ "nauc_recall_at_5_std": 0.092255,
94
+ "nauc_recall_at_5_diff1": 0.40358,
95
+ "nauc_recall_at_10_max": 0.153191,
96
+ "nauc_recall_at_10_std": 0.092634,
97
+ "nauc_recall_at_10_diff1": 0.351641,
98
+ "nauc_recall_at_20_max": 0.116001,
99
+ "nauc_recall_at_20_std": 0.158175,
100
+ "nauc_recall_at_20_diff1": 0.307346,
101
+ "nauc_recall_at_100_max": 0.146337,
102
+ "nauc_recall_at_100_std": 0.289483,
103
+ "nauc_recall_at_100_diff1": 0.290313,
104
+ "nauc_recall_at_1000_max": 0.17724,
105
+ "nauc_recall_at_1000_std": 0.392765,
106
+ "nauc_recall_at_1000_diff1": 0.270171,
107
+ "nauc_precision_at_1_max": 0.201705,
108
+ "nauc_precision_at_1_std": 0.107448,
109
+ "nauc_precision_at_1_diff1": 0.493954,
110
+ "nauc_precision_at_3_max": 0.141528,
111
+ "nauc_precision_at_3_std": 0.105073,
112
+ "nauc_precision_at_3_diff1": 0.345284,
113
+ "nauc_precision_at_5_max": 0.109174,
114
+ "nauc_precision_at_5_std": 0.120323,
115
+ "nauc_precision_at_5_diff1": 0.276892,
116
+ "nauc_precision_at_10_max": 0.072076,
117
+ "nauc_precision_at_10_std": 0.147043,
118
+ "nauc_precision_at_10_diff1": 0.208958,
119
+ "nauc_precision_at_20_max": 0.003301,
120
+ "nauc_precision_at_20_std": 0.21514,
121
+ "nauc_precision_at_20_diff1": 0.127829,
122
+ "nauc_precision_at_100_max": -0.049338,
123
+ "nauc_precision_at_100_std": 0.278531,
124
+ "nauc_precision_at_100_diff1": 0.017098,
125
+ "nauc_precision_at_1000_max": -0.113012,
126
+ "nauc_precision_at_1000_std": 0.117685,
127
+ "nauc_precision_at_1000_diff1": -0.14554,
128
+ "nauc_mrr_at_1_max": 0.201705,
129
+ "nauc_mrr_at_1_std": 0.107448,
130
+ "nauc_mrr_at_1_diff1": 0.493954,
131
+ "nauc_mrr_at_3_max": 0.173501,
132
+ "nauc_mrr_at_3_std": 0.097718,
133
+ "nauc_mrr_at_3_diff1": 0.44435,
134
+ "nauc_mrr_at_5_max": 0.169973,
135
+ "nauc_mrr_at_5_std": 0.102439,
136
+ "nauc_mrr_at_5_diff1": 0.435404,
137
+ "nauc_mrr_at_10_max": 0.172777,
138
+ "nauc_mrr_at_10_std": 0.100613,
139
+ "nauc_mrr_at_10_diff1": 0.429638,
140
+ "nauc_mrr_at_20_max": 0.170446,
141
+ "nauc_mrr_at_20_std": 0.104798,
142
+ "nauc_mrr_at_20_diff1": 0.426531,
143
+ "nauc_mrr_at_100_max": 0.172948,
144
+ "nauc_mrr_at_100_std": 0.108193,
145
+ "nauc_mrr_at_100_diff1": 0.427083,
146
+ "nauc_mrr_at_1000_max": 0.17294,
147
+ "nauc_mrr_at_1000_std": 0.108148,
148
+ "nauc_mrr_at_1000_diff1": 0.427136,
149
+ "hit_rate_at_1": 0.19763,
150
+ "hit_rate_at_3": 0.28656,
151
+ "hit_rate_at_5": 0.31423,
152
+ "hit_rate_at_10": 0.36166,
153
+ "hit_rate_at_20": 0.44071,
154
+ "hit_rate_at_100": 0.60474,
155
+ "hit_rate_at_1000": 0.82213,
156
+ "main_score": 0.2497,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 21.669158935546875,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/CQADupstackWordpressRetrieval.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "4ffe81d471b1924886b33c7567bfb200e9eec5c4",
3
+ "task_name": "CQADupstackWordpressRetrieval",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.122,
9
+ "ndcg_at_3": 0.16141,
10
+ "ndcg_at_5": 0.1759,
11
+ "ndcg_at_10": 0.19303,
12
+ "ndcg_at_20": 0.20939,
13
+ "ndcg_at_100": 0.23793,
14
+ "ndcg_at_1000": 0.27084,
15
+ "map_at_1": 0.10974,
16
+ "map_at_3": 0.14601,
17
+ "map_at_5": 0.15435,
18
+ "map_at_10": 0.16167,
19
+ "map_at_20": 0.1664,
20
+ "map_at_100": 0.17026,
21
+ "map_at_1000": 0.17136,
22
+ "recall_at_1": 0.10974,
23
+ "recall_at_3": 0.18965,
24
+ "recall_at_5": 0.22539,
25
+ "recall_at_10": 0.2764,
26
+ "recall_at_20": 0.33844,
27
+ "recall_at_100": 0.48838,
28
+ "recall_at_1000": 0.74369,
29
+ "accuracy": 0.10974,
30
+ "precision_at_1": 0.122,
31
+ "precision_at_3": 0.07332,
32
+ "precision_at_5": 0.0525,
33
+ "precision_at_10": 0.03253,
34
+ "precision_at_20": 0.02006,
35
+ "precision_at_100": 0.00597,
36
+ "precision_at_1000": 0.00096,
37
+ "mrr_at_1": 0.121996,
38
+ "mrr_at_3": 0.159889,
39
+ "mrr_at_5": 0.168669,
40
+ "mrr_at_10": 0.175591,
41
+ "mrr_at_20": 0.180107,
42
+ "mrr_at_100": 0.183785,
43
+ "mrr_at_1000": 0.184795,
44
+ "nauc_ndcg_at_1_max": 0.195498,
45
+ "nauc_ndcg_at_1_std": -0.059471,
46
+ "nauc_ndcg_at_1_diff1": 0.436445,
47
+ "nauc_ndcg_at_3_max": 0.128434,
48
+ "nauc_ndcg_at_3_std": -0.007535,
49
+ "nauc_ndcg_at_3_diff1": 0.304886,
50
+ "nauc_ndcg_at_5_max": 0.12511,
51
+ "nauc_ndcg_at_5_std": -0.006802,
52
+ "nauc_ndcg_at_5_diff1": 0.288306,
53
+ "nauc_ndcg_at_10_max": 0.131754,
54
+ "nauc_ndcg_at_10_std": 0.006221,
55
+ "nauc_ndcg_at_10_diff1": 0.273192,
56
+ "nauc_ndcg_at_20_max": 0.138004,
57
+ "nauc_ndcg_at_20_std": 0.002491,
58
+ "nauc_ndcg_at_20_diff1": 0.260138,
59
+ "nauc_ndcg_at_100_max": 0.14504,
60
+ "nauc_ndcg_at_100_std": 0.013659,
61
+ "nauc_ndcg_at_100_diff1": 0.262462,
62
+ "nauc_ndcg_at_1000_max": 0.144261,
63
+ "nauc_ndcg_at_1000_std": 0.03415,
64
+ "nauc_ndcg_at_1000_diff1": 0.271006,
65
+ "nauc_map_at_1_max": 0.180579,
66
+ "nauc_map_at_1_std": -0.070196,
67
+ "nauc_map_at_1_diff1": 0.441501,
68
+ "nauc_map_at_3_max": 0.133264,
69
+ "nauc_map_at_3_std": -0.026593,
70
+ "nauc_map_at_3_diff1": 0.331688,
71
+ "nauc_map_at_5_max": 0.13353,
72
+ "nauc_map_at_5_std": -0.024144,
73
+ "nauc_map_at_5_diff1": 0.321083,
74
+ "nauc_map_at_10_max": 0.137716,
75
+ "nauc_map_at_10_std": -0.016867,
76
+ "nauc_map_at_10_diff1": 0.314727,
77
+ "nauc_map_at_20_max": 0.140282,
78
+ "nauc_map_at_20_std": -0.017288,
79
+ "nauc_map_at_20_diff1": 0.310382,
80
+ "nauc_map_at_100_max": 0.142128,
81
+ "nauc_map_at_100_std": -0.014783,
82
+ "nauc_map_at_100_diff1": 0.311585,
83
+ "nauc_map_at_1000_max": 0.141751,
84
+ "nauc_map_at_1000_std": -0.013976,
85
+ "nauc_map_at_1000_diff1": 0.311687,
86
+ "nauc_recall_at_1_max": 0.180579,
87
+ "nauc_recall_at_1_std": -0.070196,
88
+ "nauc_recall_at_1_diff1": 0.441501,
89
+ "nauc_recall_at_3_max": 0.085733,
90
+ "nauc_recall_at_3_std": 0.011728,
91
+ "nauc_recall_at_3_diff1": 0.246733,
92
+ "nauc_recall_at_5_max": 0.088181,
93
+ "nauc_recall_at_5_std": 0.016066,
94
+ "nauc_recall_at_5_diff1": 0.211521,
95
+ "nauc_recall_at_10_max": 0.105782,
96
+ "nauc_recall_at_10_std": 0.04734,
97
+ "nauc_recall_at_10_diff1": 0.181323,
98
+ "nauc_recall_at_20_max": 0.123856,
99
+ "nauc_recall_at_20_std": 0.032875,
100
+ "nauc_recall_at_20_diff1": 0.1429,
101
+ "nauc_recall_at_100_max": 0.146237,
102
+ "nauc_recall_at_100_std": 0.068235,
103
+ "nauc_recall_at_100_diff1": 0.145626,
104
+ "nauc_recall_at_1000_max": 0.174113,
105
+ "nauc_recall_at_1000_std": 0.274934,
106
+ "nauc_recall_at_1000_diff1": 0.176475,
107
+ "nauc_precision_at_1_max": 0.195498,
108
+ "nauc_precision_at_1_std": -0.059471,
109
+ "nauc_precision_at_1_diff1": 0.436445,
110
+ "nauc_precision_at_3_max": 0.101494,
111
+ "nauc_precision_at_3_std": 0.039666,
112
+ "nauc_precision_at_3_diff1": 0.224942,
113
+ "nauc_precision_at_5_max": 0.101147,
114
+ "nauc_precision_at_5_std": 0.042223,
115
+ "nauc_precision_at_5_diff1": 0.195269,
116
+ "nauc_precision_at_10_max": 0.118604,
117
+ "nauc_precision_at_10_std": 0.070029,
118
+ "nauc_precision_at_10_diff1": 0.158614,
119
+ "nauc_precision_at_20_max": 0.129791,
120
+ "nauc_precision_at_20_std": 0.054065,
121
+ "nauc_precision_at_20_diff1": 0.124559,
122
+ "nauc_precision_at_100_max": 0.128013,
123
+ "nauc_precision_at_100_std": 0.083705,
124
+ "nauc_precision_at_100_diff1": 0.106863,
125
+ "nauc_precision_at_1000_max": 0.002782,
126
+ "nauc_precision_at_1000_std": 0.104374,
127
+ "nauc_precision_at_1000_diff1": 0.037189,
128
+ "nauc_mrr_at_1_max": 0.195498,
129
+ "nauc_mrr_at_1_std": -0.059471,
130
+ "nauc_mrr_at_1_diff1": 0.436445,
131
+ "nauc_mrr_at_3_max": 0.157307,
132
+ "nauc_mrr_at_3_std": -0.006856,
133
+ "nauc_mrr_at_3_diff1": 0.327136,
134
+ "nauc_mrr_at_5_max": 0.156103,
135
+ "nauc_mrr_at_5_std": -0.00517,
136
+ "nauc_mrr_at_5_diff1": 0.317749,
137
+ "nauc_mrr_at_10_max": 0.157007,
138
+ "nauc_mrr_at_10_std": -0.002098,
139
+ "nauc_mrr_at_10_diff1": 0.311015,
140
+ "nauc_mrr_at_20_max": 0.15746,
141
+ "nauc_mrr_at_20_std": -0.003828,
142
+ "nauc_mrr_at_20_diff1": 0.30755,
143
+ "nauc_mrr_at_100_max": 0.158759,
144
+ "nauc_mrr_at_100_std": -0.001935,
145
+ "nauc_mrr_at_100_diff1": 0.308359,
146
+ "nauc_mrr_at_1000_max": 0.158403,
147
+ "nauc_mrr_at_1000_std": -0.001162,
148
+ "nauc_mrr_at_1000_diff1": 0.308619,
149
+ "hit_rate_at_1": 0.122,
150
+ "hit_rate_at_3": 0.20887,
151
+ "hit_rate_at_5": 0.24769,
152
+ "hit_rate_at_10": 0.29945,
153
+ "hit_rate_at_20": 0.36229,
154
+ "hit_rate_at_100": 0.51941,
155
+ "hit_rate_at_1000": 0.77819,
156
+ "main_score": 0.19303,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 61.142884731292725,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/ClimateFEVER.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "47f2ac6acb640fc46020b02a5b59fdda04d39380",
3
+ "task_name": "ClimateFEVER",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.23779,
9
+ "ndcg_at_3": 0.21028,
10
+ "ndcg_at_5": 0.22071,
11
+ "ndcg_at_10": 0.24606,
12
+ "ndcg_at_20": 0.26744,
13
+ "ndcg_at_100": 0.30885,
14
+ "ndcg_at_1000": 0.34227,
15
+ "map_at_1": 0.10858,
16
+ "map_at_3": 0.15219,
17
+ "map_at_5": 0.16408,
18
+ "map_at_10": 0.17484,
19
+ "map_at_20": 0.18172,
20
+ "map_at_100": 0.18935,
21
+ "map_at_1000": 0.19107,
22
+ "recall_at_1": 0.10858,
23
+ "recall_at_3": 0.19344,
24
+ "recall_at_5": 0.23017,
25
+ "recall_at_10": 0.28747,
26
+ "recall_at_20": 0.34907,
27
+ "recall_at_100": 0.51,
28
+ "recall_at_1000": 0.69928,
29
+ "accuracy": 0.10858,
30
+ "precision_at_1": 0.23779,
31
+ "precision_at_3": 0.1544,
32
+ "precision_at_5": 0.11401,
33
+ "precision_at_10": 0.07381,
34
+ "precision_at_20": 0.04599,
35
+ "precision_at_100": 0.01413,
36
+ "precision_at_1000": 0.00203,
37
+ "mrr_at_1": 0.237785,
38
+ "mrr_at_3": 0.317372,
39
+ "mrr_at_5": 0.331412,
40
+ "mrr_at_10": 0.343342,
41
+ "mrr_at_20": 0.34862,
42
+ "mrr_at_100": 0.352126,
43
+ "mrr_at_1000": 0.35264,
44
+ "nauc_ndcg_at_1_max": 0.343121,
45
+ "nauc_ndcg_at_1_std": 0.191218,
46
+ "nauc_ndcg_at_1_diff1": 0.254735,
47
+ "nauc_ndcg_at_3_max": 0.357007,
48
+ "nauc_ndcg_at_3_std": 0.210569,
49
+ "nauc_ndcg_at_3_diff1": 0.209733,
50
+ "nauc_ndcg_at_5_max": 0.352518,
51
+ "nauc_ndcg_at_5_std": 0.213777,
52
+ "nauc_ndcg_at_5_diff1": 0.202488,
53
+ "nauc_ndcg_at_10_max": 0.370267,
54
+ "nauc_ndcg_at_10_std": 0.239493,
55
+ "nauc_ndcg_at_10_diff1": 0.198521,
56
+ "nauc_ndcg_at_20_max": 0.385889,
57
+ "nauc_ndcg_at_20_std": 0.257995,
58
+ "nauc_ndcg_at_20_diff1": 0.192411,
59
+ "nauc_ndcg_at_100_max": 0.406111,
60
+ "nauc_ndcg_at_100_std": 0.294372,
61
+ "nauc_ndcg_at_100_diff1": 0.18882,
62
+ "nauc_ndcg_at_1000_max": 0.41161,
63
+ "nauc_ndcg_at_1000_std": 0.303865,
64
+ "nauc_ndcg_at_1000_diff1": 0.19431,
65
+ "nauc_map_at_1_max": 0.322729,
66
+ "nauc_map_at_1_std": 0.140863,
67
+ "nauc_map_at_1_diff1": 0.281661,
68
+ "nauc_map_at_3_max": 0.336169,
69
+ "nauc_map_at_3_std": 0.175982,
70
+ "nauc_map_at_3_diff1": 0.228765,
71
+ "nauc_map_at_5_max": 0.335445,
72
+ "nauc_map_at_5_std": 0.182184,
73
+ "nauc_map_at_5_diff1": 0.223767,
74
+ "nauc_map_at_10_max": 0.345831,
75
+ "nauc_map_at_10_std": 0.197714,
76
+ "nauc_map_at_10_diff1": 0.221667,
77
+ "nauc_map_at_20_max": 0.353365,
78
+ "nauc_map_at_20_std": 0.206319,
79
+ "nauc_map_at_20_diff1": 0.219253,
80
+ "nauc_map_at_100_max": 0.360195,
81
+ "nauc_map_at_100_std": 0.217521,
82
+ "nauc_map_at_100_diff1": 0.217837,
83
+ "nauc_map_at_1000_max": 0.360775,
84
+ "nauc_map_at_1000_std": 0.218456,
85
+ "nauc_map_at_1000_diff1": 0.218179,
86
+ "nauc_recall_at_1_max": 0.322729,
87
+ "nauc_recall_at_1_std": 0.140863,
88
+ "nauc_recall_at_1_diff1": 0.281661,
89
+ "nauc_recall_at_3_max": 0.337002,
90
+ "nauc_recall_at_3_std": 0.206095,
91
+ "nauc_recall_at_3_diff1": 0.173432,
92
+ "nauc_recall_at_5_max": 0.319101,
93
+ "nauc_recall_at_5_std": 0.203534,
94
+ "nauc_recall_at_5_diff1": 0.157499,
95
+ "nauc_recall_at_10_max": 0.335447,
96
+ "nauc_recall_at_10_std": 0.241945,
97
+ "nauc_recall_at_10_diff1": 0.142356,
98
+ "nauc_recall_at_20_max": 0.360903,
99
+ "nauc_recall_at_20_std": 0.276128,
100
+ "nauc_recall_at_20_diff1": 0.122751,
101
+ "nauc_recall_at_100_max": 0.388122,
102
+ "nauc_recall_at_100_std": 0.361282,
103
+ "nauc_recall_at_100_diff1": 0.099536,
104
+ "nauc_recall_at_1000_max": 0.407208,
105
+ "nauc_recall_at_1000_std": 0.427122,
106
+ "nauc_recall_at_1000_diff1": 0.100633,
107
+ "nauc_precision_at_1_max": 0.343121,
108
+ "nauc_precision_at_1_std": 0.191218,
109
+ "nauc_precision_at_1_diff1": 0.254735,
110
+ "nauc_precision_at_3_max": 0.378706,
111
+ "nauc_precision_at_3_std": 0.265687,
112
+ "nauc_precision_at_3_diff1": 0.148487,
113
+ "nauc_precision_at_5_max": 0.349551,
114
+ "nauc_precision_at_5_std": 0.25789,
115
+ "nauc_precision_at_5_diff1": 0.134404,
116
+ "nauc_precision_at_10_max": 0.368934,
117
+ "nauc_precision_at_10_std": 0.305969,
118
+ "nauc_precision_at_10_diff1": 0.109505,
119
+ "nauc_precision_at_20_max": 0.376714,
120
+ "nauc_precision_at_20_std": 0.32974,
121
+ "nauc_precision_at_20_diff1": 0.081363,
122
+ "nauc_precision_at_100_max": 0.340024,
123
+ "nauc_precision_at_100_std": 0.366876,
124
+ "nauc_precision_at_100_diff1": 0.034085,
125
+ "nauc_precision_at_1000_max": 0.266471,
126
+ "nauc_precision_at_1000_std": 0.320882,
127
+ "nauc_precision_at_1000_diff1": 0.028323,
128
+ "nauc_mrr_at_1_max": 0.343121,
129
+ "nauc_mrr_at_1_std": 0.191218,
130
+ "nauc_mrr_at_1_diff1": 0.254735,
131
+ "nauc_mrr_at_3_max": 0.376881,
132
+ "nauc_mrr_at_3_std": 0.243461,
133
+ "nauc_mrr_at_3_diff1": 0.208239,
134
+ "nauc_mrr_at_5_max": 0.374758,
135
+ "nauc_mrr_at_5_std": 0.242674,
136
+ "nauc_mrr_at_5_diff1": 0.205474,
137
+ "nauc_mrr_at_10_max": 0.380805,
138
+ "nauc_mrr_at_10_std": 0.249146,
139
+ "nauc_mrr_at_10_diff1": 0.20531,
140
+ "nauc_mrr_at_20_max": 0.382975,
141
+ "nauc_mrr_at_20_std": 0.252044,
142
+ "nauc_mrr_at_20_diff1": 0.204181,
143
+ "nauc_mrr_at_100_max": 0.383097,
144
+ "nauc_mrr_at_100_std": 0.252133,
145
+ "nauc_mrr_at_100_diff1": 0.20584,
146
+ "nauc_mrr_at_1000_max": 0.382904,
147
+ "nauc_mrr_at_1000_std": 0.251905,
148
+ "nauc_mrr_at_1000_diff1": 0.205967,
149
+ "hit_rate_at_1": 0.23779,
150
+ "hit_rate_at_3": 0.42085,
151
+ "hit_rate_at_5": 0.48274,
152
+ "hit_rate_at_10": 0.57264,
153
+ "hit_rate_at_20": 0.64951,
154
+ "hit_rate_at_100": 0.78567,
155
+ "hit_rate_at_1000": 0.90489,
156
+ "main_score": 0.24606,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 5897.712790250778,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/DBPedia.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "c0f706b76e590d620bd6618b3ca8efdd34e2d659",
3
+ "task_name": "DBPedia",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.4175,
9
+ "ndcg_at_3": 0.3451,
10
+ "ndcg_at_5": 0.31299,
11
+ "ndcg_at_10": 0.29583,
12
+ "ndcg_at_20": 0.28704,
13
+ "ndcg_at_100": 0.31611,
14
+ "ndcg_at_1000": 0.37593,
15
+ "map_at_1": 0.06698,
16
+ "map_at_3": 0.0981,
17
+ "map_at_5": 0.11035,
18
+ "map_at_10": 0.1303,
19
+ "map_at_20": 0.14787,
20
+ "map_at_100": 0.17235,
21
+ "map_at_1000": 0.18128,
22
+ "recall_at_1": 0.06698,
23
+ "recall_at_3": 0.10994,
24
+ "recall_at_5": 0.13216,
25
+ "recall_at_10": 0.17759,
26
+ "recall_at_20": 0.22854,
27
+ "recall_at_100": 0.352,
28
+ "recall_at_1000": 0.55837,
29
+ "accuracy": 0.06698,
30
+ "precision_at_1": 0.535,
31
+ "precision_at_3": 0.37833,
32
+ "precision_at_5": 0.3035,
33
+ "precision_at_10": 0.23475,
34
+ "precision_at_20": 0.1725,
35
+ "precision_at_100": 0.06918,
36
+ "precision_at_1000": 0.01469,
37
+ "mrr_at_1": 0.535,
38
+ "mrr_at_3": 0.600833,
39
+ "mrr_at_5": 0.611708,
40
+ "mrr_at_10": 0.621402,
41
+ "mrr_at_20": 0.624491,
42
+ "mrr_at_100": 0.62649,
43
+ "mrr_at_1000": 0.626779,
44
+ "nauc_ndcg_at_1_max": 0.455655,
45
+ "nauc_ndcg_at_1_std": 0.200973,
46
+ "nauc_ndcg_at_1_diff1": 0.369676,
47
+ "nauc_ndcg_at_3_max": 0.477214,
48
+ "nauc_ndcg_at_3_std": 0.245199,
49
+ "nauc_ndcg_at_3_diff1": 0.273717,
50
+ "nauc_ndcg_at_5_max": 0.493604,
51
+ "nauc_ndcg_at_5_std": 0.289825,
52
+ "nauc_ndcg_at_5_diff1": 0.283215,
53
+ "nauc_ndcg_at_10_max": 0.490094,
54
+ "nauc_ndcg_at_10_std": 0.324223,
55
+ "nauc_ndcg_at_10_diff1": 0.271712,
56
+ "nauc_ndcg_at_20_max": 0.472684,
57
+ "nauc_ndcg_at_20_std": 0.339017,
58
+ "nauc_ndcg_at_20_diff1": 0.284964,
59
+ "nauc_ndcg_at_100_max": 0.481062,
60
+ "nauc_ndcg_at_100_std": 0.37988,
61
+ "nauc_ndcg_at_100_diff1": 0.290831,
62
+ "nauc_ndcg_at_1000_max": 0.521607,
63
+ "nauc_ndcg_at_1000_std": 0.427666,
64
+ "nauc_ndcg_at_1000_diff1": 0.275182,
65
+ "nauc_map_at_1_max": 0.184884,
66
+ "nauc_map_at_1_std": -0.036467,
67
+ "nauc_map_at_1_diff1": 0.534701,
68
+ "nauc_map_at_3_max": 0.22491,
69
+ "nauc_map_at_3_std": 0.027844,
70
+ "nauc_map_at_3_diff1": 0.422314,
71
+ "nauc_map_at_5_max": 0.257815,
72
+ "nauc_map_at_5_std": 0.082563,
73
+ "nauc_map_at_5_diff1": 0.399695,
74
+ "nauc_map_at_10_max": 0.307682,
75
+ "nauc_map_at_10_std": 0.163341,
76
+ "nauc_map_at_10_diff1": 0.346129,
77
+ "nauc_map_at_20_max": 0.364625,
78
+ "nauc_map_at_20_std": 0.240144,
79
+ "nauc_map_at_20_diff1": 0.322409,
80
+ "nauc_map_at_100_max": 0.427206,
81
+ "nauc_map_at_100_std": 0.337604,
82
+ "nauc_map_at_100_diff1": 0.291311,
83
+ "nauc_map_at_1000_max": 0.438383,
84
+ "nauc_map_at_1000_std": 0.354183,
85
+ "nauc_map_at_1000_diff1": 0.282971,
86
+ "nauc_recall_at_1_max": 0.184884,
87
+ "nauc_recall_at_1_std": -0.036467,
88
+ "nauc_recall_at_1_diff1": 0.534701,
89
+ "nauc_recall_at_3_max": 0.168868,
90
+ "nauc_recall_at_3_std": 0.007221,
91
+ "nauc_recall_at_3_diff1": 0.359284,
92
+ "nauc_recall_at_5_max": 0.188961,
93
+ "nauc_recall_at_5_std": 0.064226,
94
+ "nauc_recall_at_5_diff1": 0.333752,
95
+ "nauc_recall_at_10_max": 0.22585,
96
+ "nauc_recall_at_10_std": 0.157655,
97
+ "nauc_recall_at_10_diff1": 0.239389,
98
+ "nauc_recall_at_20_max": 0.286777,
99
+ "nauc_recall_at_20_std": 0.231869,
100
+ "nauc_recall_at_20_diff1": 0.228465,
101
+ "nauc_recall_at_100_max": 0.367935,
102
+ "nauc_recall_at_100_std": 0.387773,
103
+ "nauc_recall_at_100_diff1": 0.204079,
104
+ "nauc_recall_at_1000_max": 0.368952,
105
+ "nauc_recall_at_1000_std": 0.451122,
106
+ "nauc_recall_at_1000_diff1": 0.121484,
107
+ "nauc_precision_at_1_max": 0.542824,
108
+ "nauc_precision_at_1_std": 0.234487,
109
+ "nauc_precision_at_1_diff1": 0.438662,
110
+ "nauc_precision_at_3_max": 0.502852,
111
+ "nauc_precision_at_3_std": 0.296039,
112
+ "nauc_precision_at_3_diff1": 0.104878,
113
+ "nauc_precision_at_5_max": 0.510481,
114
+ "nauc_precision_at_5_std": 0.392688,
115
+ "nauc_precision_at_5_diff1": 0.061735,
116
+ "nauc_precision_at_10_max": 0.5102,
117
+ "nauc_precision_at_10_std": 0.460824,
118
+ "nauc_precision_at_10_diff1": -0.009438,
119
+ "nauc_precision_at_20_max": 0.504609,
120
+ "nauc_precision_at_20_std": 0.501965,
121
+ "nauc_precision_at_20_diff1": -0.022055,
122
+ "nauc_precision_at_100_max": 0.393932,
123
+ "nauc_precision_at_100_std": 0.443163,
124
+ "nauc_precision_at_100_diff1": -0.071902,
125
+ "nauc_precision_at_1000_max": 0.131564,
126
+ "nauc_precision_at_1000_std": 0.183138,
127
+ "nauc_precision_at_1000_diff1": -0.169885,
128
+ "nauc_mrr_at_1_max": 0.542824,
129
+ "nauc_mrr_at_1_std": 0.234487,
130
+ "nauc_mrr_at_1_diff1": 0.438662,
131
+ "nauc_mrr_at_3_max": 0.558024,
132
+ "nauc_mrr_at_3_std": 0.244049,
133
+ "nauc_mrr_at_3_diff1": 0.42585,
134
+ "nauc_mrr_at_5_max": 0.555722,
135
+ "nauc_mrr_at_5_std": 0.248445,
136
+ "nauc_mrr_at_5_diff1": 0.426286,
137
+ "nauc_mrr_at_10_max": 0.559851,
138
+ "nauc_mrr_at_10_std": 0.257522,
139
+ "nauc_mrr_at_10_diff1": 0.423262,
140
+ "nauc_mrr_at_20_max": 0.560179,
141
+ "nauc_mrr_at_20_std": 0.254827,
142
+ "nauc_mrr_at_20_diff1": 0.426558,
143
+ "nauc_mrr_at_100_max": 0.559653,
144
+ "nauc_mrr_at_100_std": 0.254295,
145
+ "nauc_mrr_at_100_diff1": 0.426886,
146
+ "nauc_mrr_at_1000_max": 0.559666,
147
+ "nauc_mrr_at_1000_std": 0.254317,
148
+ "nauc_mrr_at_1000_diff1": 0.426871,
149
+ "hit_rate_at_1": 0.535,
150
+ "hit_rate_at_3": 0.6775,
151
+ "hit_rate_at_5": 0.7275,
152
+ "hit_rate_at_10": 0.7975,
153
+ "hit_rate_at_20": 0.84,
154
+ "hit_rate_at_100": 0.915,
155
+ "hit_rate_at_1000": 0.9775,
156
+ "main_score": 0.29583,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 4750.37805724144,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/EmotionClassification.json ADDED
@@ -0,0 +1,140 @@
1
+ {
2
+ "dataset_revision": "4f58c6b202a23cf9a4da393831edf4f9183cad37",
3
+ "task_name": "EmotionClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.426,
11
+ "f1": 0.377881,
12
+ "f1_weighted": 0.448437,
13
+ "precision": 0.389149,
14
+ "precision_weighted": 0.522499,
15
+ "recall": 0.428431,
16
+ "recall_weighted": 0.426,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.375,
22
+ "f1": 0.340334,
23
+ "f1_weighted": 0.390046,
24
+ "precision": 0.354023,
25
+ "precision_weighted": 0.475535,
26
+ "recall": 0.40011,
27
+ "recall_weighted": 0.375,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.3865,
33
+ "f1": 0.345362,
34
+ "f1_weighted": 0.412465,
35
+ "precision": 0.357106,
36
+ "precision_weighted": 0.476589,
37
+ "recall": 0.387159,
38
+ "recall_weighted": 0.3865,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.3875,
44
+ "f1": 0.336269,
45
+ "f1_weighted": 0.415075,
46
+ "precision": 0.354845,
47
+ "precision_weighted": 0.491482,
48
+ "recall": 0.394926,
49
+ "recall_weighted": 0.3875,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.394,
55
+ "f1": 0.358739,
56
+ "f1_weighted": 0.421667,
57
+ "precision": 0.376911,
58
+ "precision_weighted": 0.504932,
59
+ "recall": 0.407643,
60
+ "recall_weighted": 0.394,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.4125,
66
+ "f1": 0.362078,
67
+ "f1_weighted": 0.435616,
68
+ "precision": 0.371521,
69
+ "precision_weighted": 0.505686,
70
+ "recall": 0.412291,
71
+ "recall_weighted": 0.4125,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.3915,
77
+ "f1": 0.349771,
78
+ "f1_weighted": 0.413247,
79
+ "precision": 0.366184,
80
+ "precision_weighted": 0.494832,
81
+ "recall": 0.410382,
82
+ "recall_weighted": 0.3915,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.3435,
88
+ "f1": 0.318846,
89
+ "f1_weighted": 0.358834,
90
+ "precision": 0.332929,
91
+ "precision_weighted": 0.441883,
92
+ "recall": 0.385154,
93
+ "recall_weighted": 0.3435,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.398,
99
+ "f1": 0.349592,
100
+ "f1_weighted": 0.420303,
101
+ "precision": 0.357782,
102
+ "precision_weighted": 0.482608,
103
+ "recall": 0.390937,
104
+ "recall_weighted": 0.398,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.3925,
110
+ "f1": 0.345382,
111
+ "f1_weighted": 0.42106,
112
+ "precision": 0.363256,
113
+ "precision_weighted": 0.506063,
114
+ "recall": 0.393749,
115
+ "recall_weighted": 0.3925,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.3907,
121
+ "f1": 0.348425,
122
+ "f1_weighted": 0.413675,
123
+ "precision": 0.362371,
124
+ "precision_weighted": 0.490211,
125
+ "recall": 0.401078,
126
+ "recall_weighted": 0.3907,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.3907,
130
+ "hf_subset": "default",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ]
136
+ },
137
+ "evaluation_time": 17.370490550994873,
138
+ "kg_co2_emissions": null,
139
+ "date": null
140
+ }
results/FEVER.json ADDED
@@ -0,0 +1,167 @@
+ {
+   "dataset_revision": "bea83ef9e8fb933d90a2f1d5515737465d613e12",
+   "task_name": "FEVER",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "ndcg_at_1": 0.59511,
+         "ndcg_at_3": 0.66322,
+         "ndcg_at_5": 0.68352,
+         "ndcg_at_10": 0.69831,
+         "ndcg_at_20": 0.70672,
+         "ndcg_at_100": 0.71674,
+         "ndcg_at_1000": 0.72211,
+         "map_at_1": 0.5529,
+         "map_at_3": 0.62937,
+         "map_at_5": 0.6415,
+         "map_at_10": 0.64796,
+         "map_at_20": 0.65039,
+         "map_at_100": 0.65194,
+         "map_at_1000": 0.65216,
+         "recall_at_1": 0.5529,
+         "recall_at_3": 0.71647,
+         "recall_at_5": 0.76666,
+         "recall_at_10": 0.81175,
+         "recall_at_20": 0.84374,
+         "recall_at_100": 0.89442,
+         "recall_at_1000": 0.93437,
+         "accuracy": 0.5529,
+         "precision_at_1": 0.59511,
+         "precision_at_3": 0.25903,
+         "precision_at_5": 0.16682,
+         "precision_at_10": 0.08843,
+         "precision_at_20": 0.04609,
+         "precision_at_100": 0.00985,
+         "precision_at_1000": 0.00104,
+         "mrr_at_1": 0.59511,
+         "mrr_at_3": 0.674017,
+         "mrr_at_5": 0.685674,
+         "mrr_at_10": 0.691982,
+         "mrr_at_20": 0.694214,
+         "mrr_at_100": 0.695509,
+         "mrr_at_1000": 0.695649,
+         "nauc_ndcg_at_1_max": 0.322724,
+         "nauc_ndcg_at_1_std": -0.019391,
+         "nauc_ndcg_at_1_diff1": 0.616277,
+         "nauc_ndcg_at_3_max": 0.355358,
+         "nauc_ndcg_at_3_std": 0.032876,
+         "nauc_ndcg_at_3_diff1": 0.521015,
+         "nauc_ndcg_at_5_max": 0.36043,
+         "nauc_ndcg_at_5_std": 0.041642,
+         "nauc_ndcg_at_5_diff1": 0.514555,
+         "nauc_ndcg_at_10_max": 0.364069,
+         "nauc_ndcg_at_10_std": 0.053958,
+         "nauc_ndcg_at_10_diff1": 0.514048,
+         "nauc_ndcg_at_20_max": 0.363784,
+         "nauc_ndcg_at_20_std": 0.055851,
+         "nauc_ndcg_at_20_diff1": 0.514086,
+         "nauc_ndcg_at_100_max": 0.35874,
+         "nauc_ndcg_at_100_std": 0.054663,
+         "nauc_ndcg_at_100_diff1": 0.515184,
+         "nauc_ndcg_at_1000_max": 0.357326,
+         "nauc_ndcg_at_1000_std": 0.05338,
+         "nauc_ndcg_at_1000_diff1": 0.517587,
+         "nauc_map_at_1_max": 0.290232,
+         "nauc_map_at_1_std": -0.01011,
+         "nauc_map_at_1_diff1": 0.558808,
+         "nauc_map_at_3_max": 0.328664,
+         "nauc_map_at_3_std": 0.021271,
+         "nauc_map_at_3_diff1": 0.519885,
+         "nauc_map_at_5_max": 0.331893,
+         "nauc_map_at_5_std": 0.025416,
+         "nauc_map_at_5_diff1": 0.517455,
+         "nauc_map_at_10_max": 0.333295,
+         "nauc_map_at_10_std": 0.029761,
+         "nauc_map_at_10_diff1": 0.517879,
+         "nauc_map_at_20_max": 0.333245,
+         "nauc_map_at_20_std": 0.030183,
+         "nauc_map_at_20_diff1": 0.51796,
+         "nauc_map_at_100_max": 0.332674,
+         "nauc_map_at_100_std": 0.03012,
+         "nauc_map_at_100_diff1": 0.51811,
+         "nauc_map_at_1000_max": 0.332662,
+         "nauc_map_at_1000_std": 0.030097,
+         "nauc_map_at_1000_diff1": 0.518196,
+         "nauc_recall_at_1_max": 0.290232,
+         "nauc_recall_at_1_std": -0.01011,
+         "nauc_recall_at_1_diff1": 0.558808,
+         "nauc_recall_at_3_max": 0.375385,
+         "nauc_recall_at_3_std": 0.076253,
+         "nauc_recall_at_3_diff1": 0.439883,
+         "nauc_recall_at_5_max": 0.39338,
+         "nauc_recall_at_5_std": 0.108954,
+         "nauc_recall_at_5_diff1": 0.405277,
+         "nauc_recall_at_10_max": 0.41412,
+         "nauc_recall_at_10_std": 0.176006,
+         "nauc_recall_at_10_diff1": 0.376479,
+         "nauc_recall_at_20_max": 0.415416,
+         "nauc_recall_at_20_std": 0.206508,
+         "nauc_recall_at_20_diff1": 0.347593,
+         "nauc_recall_at_100_max": 0.371227,
+         "nauc_recall_at_100_std": 0.252345,
+         "nauc_recall_at_100_diff1": 0.280118,
+         "nauc_recall_at_1000_max": 0.329695,
+         "nauc_recall_at_1000_std": 0.322644,
+         "nauc_recall_at_1000_diff1": 0.210154,
+         "nauc_precision_at_1_max": 0.322724,
+         "nauc_precision_at_1_std": -0.019391,
+         "nauc_precision_at_1_diff1": 0.616277,
+         "nauc_precision_at_3_max": 0.446023,
+         "nauc_precision_at_3_std": 0.079605,
+         "nauc_precision_at_3_diff1": 0.487141,
+         "nauc_precision_at_5_max": 0.469546,
+         "nauc_precision_at_5_std": 0.113736,
+         "nauc_precision_at_5_diff1": 0.43583,
+         "nauc_precision_at_10_max": 0.494814,
+         "nauc_precision_at_10_std": 0.183242,
+         "nauc_precision_at_10_diff1": 0.393088,
+         "nauc_precision_at_20_max": 0.482286,
+         "nauc_precision_at_20_std": 0.204278,
+         "nauc_precision_at_20_diff1": 0.335491,
+         "nauc_precision_at_100_max": 0.371319,
+         "nauc_precision_at_100_std": 0.210493,
+         "nauc_precision_at_100_diff1": 0.181727,
+         "nauc_precision_at_1000_max": 0.242586,
+         "nauc_precision_at_1000_std": 0.172868,
+         "nauc_precision_at_1000_diff1": 0.049637,
+         "nauc_mrr_at_1_max": 0.322724,
+         "nauc_mrr_at_1_std": -0.019391,
+         "nauc_mrr_at_1_diff1": 0.616277,
+         "nauc_mrr_at_3_max": 0.373716,
+         "nauc_mrr_at_3_std": 0.013273,
+         "nauc_mrr_at_3_diff1": 0.58233,
+         "nauc_mrr_at_5_max": 0.377088,
+         "nauc_mrr_at_5_std": 0.017368,
+         "nauc_mrr_at_5_diff1": 0.581593,
+         "nauc_mrr_at_10_max": 0.378475,
+         "nauc_mrr_at_10_std": 0.020617,
+         "nauc_mrr_at_10_diff1": 0.582915,
+         "nauc_mrr_at_20_max": 0.37809,
+         "nauc_mrr_at_20_std": 0.020818,
+         "nauc_mrr_at_20_diff1": 0.583211,
+         "nauc_mrr_at_100_max": 0.377126,
+         "nauc_mrr_at_100_std": 0.020419,
+         "nauc_mrr_at_100_diff1": 0.583518,
+         "nauc_mrr_at_1000_max": 0.376977,
+         "nauc_mrr_at_1000_std": 0.02032,
+         "nauc_mrr_at_1000_diff1": 0.583548,
+         "hit_rate_at_1": 0.59511,
+         "hit_rate_at_3": 0.76793,
+         "hit_rate_at_5": 0.81878,
+         "hit_rate_at_10": 0.86454,
+         "hit_rate_at_20": 0.89619,
+         "hit_rate_at_100": 0.94374,
+         "hit_rate_at_1000": 0.97615,
+         "main_score": 0.69831,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 6016.38513469696,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/FiQA2018.json ADDED
@@ -0,0 +1,167 @@
+ {
+   "dataset_revision": "27a168819829fe9bcd655c2df245fb19452e8e06",
+   "task_name": "FiQA2018",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "ndcg_at_1": 0.19444,
+         "ndcg_at_3": 0.17421,
+         "ndcg_at_5": 0.18489,
+         "ndcg_at_10": 0.20716,
+         "ndcg_at_20": 0.22924,
+         "ndcg_at_100": 0.27278,
+         "ndcg_at_1000": 0.31664,
+         "map_at_1": 0.09419,
+         "map_at_3": 0.12794,
+         "map_at_5": 0.14025,
+         "map_at_10": 0.15239,
+         "map_at_20": 0.15959,
+         "map_at_100": 0.16638,
+         "map_at_1000": 0.16845,
+         "recall_at_1": 0.09419,
+         "recall_at_3": 0.15688,
+         "recall_at_5": 0.19738,
+         "recall_at_10": 0.26208,
+         "recall_at_20": 0.33317,
+         "recall_at_100": 0.51961,
+         "recall_at_1000": 0.79336,
+         "accuracy": 0.09419,
+         "precision_at_1": 0.19444,
+         "precision_at_3": 0.1142,
+         "precision_at_5": 0.08796,
+         "precision_at_10": 0.05988,
+         "precision_at_20": 0.0385,
+         "precision_at_100": 0.01272,
+         "precision_at_1000": 0.00203,
+         "mrr_at_1": 0.194444,
+         "mrr_at_3": 0.240484,
+         "mrr_at_5": 0.252752,
+         "mrr_at_10": 0.263528,
+         "mrr_at_20": 0.268686,
+         "mrr_at_100": 0.273424,
+         "mrr_at_1000": 0.274264,
+         "nauc_ndcg_at_1_max": 0.085396,
+         "nauc_ndcg_at_1_std": -0.098157,
+         "nauc_ndcg_at_1_diff1": 0.280954,
+         "nauc_ndcg_at_3_max": 0.098421,
+         "nauc_ndcg_at_3_std": -0.088118,
+         "nauc_ndcg_at_3_diff1": 0.217215,
+         "nauc_ndcg_at_5_max": 0.079253,
+         "nauc_ndcg_at_5_std": -0.085253,
+         "nauc_ndcg_at_5_diff1": 0.203115,
+         "nauc_ndcg_at_10_max": 0.094562,
+         "nauc_ndcg_at_10_std": -0.049793,
+         "nauc_ndcg_at_10_diff1": 0.192416,
+         "nauc_ndcg_at_20_max": 0.108877,
+         "nauc_ndcg_at_20_std": -0.029945,
+         "nauc_ndcg_at_20_diff1": 0.185746,
+         "nauc_ndcg_at_100_max": 0.119487,
+         "nauc_ndcg_at_100_std": -0.01582,
+         "nauc_ndcg_at_100_diff1": 0.186088,
+         "nauc_ndcg_at_1000_max": 0.14455,
+         "nauc_ndcg_at_1000_std": 0.005886,
+         "nauc_ndcg_at_1000_diff1": 0.182528,
+         "nauc_map_at_1_max": 0.061388,
+         "nauc_map_at_1_std": -0.123295,
+         "nauc_map_at_1_diff1": 0.303957,
+         "nauc_map_at_3_max": 0.079236,
+         "nauc_map_at_3_std": -0.099114,
+         "nauc_map_at_3_diff1": 0.237395,
+         "nauc_map_at_5_max": 0.074919,
+         "nauc_map_at_5_std": -0.09161,
+         "nauc_map_at_5_diff1": 0.224816,
+         "nauc_map_at_10_max": 0.088382,
+         "nauc_map_at_10_std": -0.068697,
+         "nauc_map_at_10_diff1": 0.217232,
+         "nauc_map_at_20_max": 0.095154,
+         "nauc_map_at_20_std": -0.059207,
+         "nauc_map_at_20_diff1": 0.213481,
+         "nauc_map_at_100_max": 0.098858,
+         "nauc_map_at_100_std": -0.055959,
+         "nauc_map_at_100_diff1": 0.213501,
+         "nauc_map_at_1000_max": 0.100972,
+         "nauc_map_at_1000_std": -0.054037,
+         "nauc_map_at_1000_diff1": 0.212963,
+         "nauc_recall_at_1_max": 0.061388,
+         "nauc_recall_at_1_std": -0.123295,
+         "nauc_recall_at_1_diff1": 0.303957,
+         "nauc_recall_at_3_max": 0.066185,
+         "nauc_recall_at_3_std": -0.090075,
+         "nauc_recall_at_3_diff1": 0.177468,
+         "nauc_recall_at_5_max": 0.048001,
+         "nauc_recall_at_5_std": -0.080127,
+         "nauc_recall_at_5_diff1": 0.144526,
+         "nauc_recall_at_10_max": 0.089675,
+         "nauc_recall_at_10_std": -0.003944,
+         "nauc_recall_at_10_diff1": 0.125863,
+         "nauc_recall_at_20_max": 0.117459,
+         "nauc_recall_at_20_std": 0.041872,
+         "nauc_recall_at_20_diff1": 0.109021,
+         "nauc_recall_at_100_max": 0.133245,
+         "nauc_recall_at_100_std": 0.100071,
+         "nauc_recall_at_100_diff1": 0.110776,
+         "nauc_recall_at_1000_max": 0.307733,
+         "nauc_recall_at_1000_std": 0.332186,
+         "nauc_recall_at_1000_diff1": 0.054725,
+         "nauc_precision_at_1_max": 0.085396,
+         "nauc_precision_at_1_std": -0.098157,
+         "nauc_precision_at_1_diff1": 0.280954,
+         "nauc_precision_at_3_max": 0.123342,
+         "nauc_precision_at_3_std": -0.050796,
+         "nauc_precision_at_3_diff1": 0.15625,
+         "nauc_precision_at_5_max": 0.114479,
+         "nauc_precision_at_5_std": -0.016031,
+         "nauc_precision_at_5_diff1": 0.127815,
+         "nauc_precision_at_10_max": 0.155919,
+         "nauc_precision_at_10_std": 0.052964,
+         "nauc_precision_at_10_diff1": 0.099074,
+         "nauc_precision_at_20_max": 0.182719,
+         "nauc_precision_at_20_std": 0.087839,
+         "nauc_precision_at_20_diff1": 0.074914,
+         "nauc_precision_at_100_max": 0.187287,
+         "nauc_precision_at_100_std": 0.097828,
+         "nauc_precision_at_100_diff1": 0.020769,
+         "nauc_precision_at_1000_max": 0.216903,
+         "nauc_precision_at_1000_std": 0.132659,
+         "nauc_precision_at_1000_diff1": -0.02185,
+         "nauc_mrr_at_1_max": 0.085396,
+         "nauc_mrr_at_1_std": -0.098157,
+         "nauc_mrr_at_1_diff1": 0.280954,
+         "nauc_mrr_at_3_max": 0.088598,
+         "nauc_mrr_at_3_std": -0.086612,
+         "nauc_mrr_at_3_diff1": 0.224869,
+         "nauc_mrr_at_5_max": 0.086083,
+         "nauc_mrr_at_5_std": -0.086926,
+         "nauc_mrr_at_5_diff1": 0.21999,
+         "nauc_mrr_at_10_max": 0.089551,
+         "nauc_mrr_at_10_std": -0.079567,
+         "nauc_mrr_at_10_diff1": 0.215158,
+         "nauc_mrr_at_20_max": 0.091542,
+         "nauc_mrr_at_20_std": -0.076476,
+         "nauc_mrr_at_20_diff1": 0.214022,
+         "nauc_mrr_at_100_max": 0.092425,
+         "nauc_mrr_at_100_std": -0.076498,
+         "nauc_mrr_at_100_diff1": 0.215476,
+         "nauc_mrr_at_1000_max": 0.092728,
+         "nauc_mrr_at_1000_std": -0.07618,
+         "nauc_mrr_at_1000_diff1": 0.215609,
+         "hit_rate_at_1": 0.19444,
+         "hit_rate_at_3": 0.2963,
+         "hit_rate_at_5": 0.35031,
+         "hit_rate_at_10": 0.43056,
+         "hit_rate_at_20": 0.50617,
+         "hit_rate_at_100": 0.71451,
+         "hit_rate_at_1000": 0.91358,
+         "main_score": 0.20716,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 66.45291948318481,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/HotpotQA.json ADDED
@@ -0,0 +1,167 @@
+ {
+   "dataset_revision": "ab518f4d6fcca38d87c25209f94beba119d02014",
+   "task_name": "HotpotQA",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "ndcg_at_1": 0.53167,
+         "ndcg_at_3": 0.40026,
+         "ndcg_at_5": 0.41794,
+         "ndcg_at_10": 0.43572,
+         "ndcg_at_20": 0.44818,
+         "ndcg_at_100": 0.46985,
+         "ndcg_at_1000": 0.48897,
+         "map_at_1": 0.26583,
+         "map_at_3": 0.32984,
+         "map_at_5": 0.34128,
+         "map_at_10": 0.35012,
+         "map_at_20": 0.35428,
+         "map_at_100": 0.35803,
+         "map_at_1000": 0.35887,
+         "recall_at_1": 0.26583,
+         "recall_at_3": 0.36914,
+         "recall_at_5": 0.40419,
+         "recall_at_10": 0.44895,
+         "recall_at_20": 0.48906,
+         "recall_at_100": 0.58542,
+         "recall_at_1000": 0.71323,
+         "accuracy": 0.26583,
+         "precision_at_1": 0.53167,
+         "precision_at_3": 0.24609,
+         "precision_at_5": 0.16167,
+         "precision_at_10": 0.08979,
+         "precision_at_20": 0.04891,
+         "precision_at_100": 0.01171,
+         "precision_at_1000": 0.00143,
+         "mrr_at_1": 0.531668,
+         "mrr_at_3": 0.586766,
+         "mrr_at_5": 0.595854,
+         "mrr_at_10": 0.602144,
+         "mrr_at_20": 0.605048,
+         "mrr_at_100": 0.607029,
+         "mrr_at_1000": 0.607346,
+         "nauc_ndcg_at_1_max": 0.433229,
+         "nauc_ndcg_at_1_std": 0.05272,
+         "nauc_ndcg_at_1_diff1": 0.694146,
+         "nauc_ndcg_at_3_max": 0.385363,
+         "nauc_ndcg_at_3_std": 0.071543,
+         "nauc_ndcg_at_3_diff1": 0.531948,
+         "nauc_ndcg_at_5_max": 0.377125,
+         "nauc_ndcg_at_5_std": 0.077552,
+         "nauc_ndcg_at_5_diff1": 0.508338,
+         "nauc_ndcg_at_10_max": 0.369576,
+         "nauc_ndcg_at_10_std": 0.082616,
+         "nauc_ndcg_at_10_diff1": 0.493585,
+         "nauc_ndcg_at_20_max": 0.364503,
+         "nauc_ndcg_at_20_std": 0.086776,
+         "nauc_ndcg_at_20_diff1": 0.483721,
+         "nauc_ndcg_at_100_max": 0.357907,
+         "nauc_ndcg_at_100_std": 0.09626,
+         "nauc_ndcg_at_100_diff1": 0.471591,
+         "nauc_ndcg_at_1000_max": 0.355651,
+         "nauc_ndcg_at_1000_std": 0.100939,
+         "nauc_ndcg_at_1000_diff1": 0.465123,
+         "nauc_map_at_1_max": 0.433229,
+         "nauc_map_at_1_std": 0.05272,
+         "nauc_map_at_1_diff1": 0.694146,
+         "nauc_map_at_3_max": 0.363243,
+         "nauc_map_at_3_std": 0.067699,
+         "nauc_map_at_3_diff1": 0.500367,
+         "nauc_map_at_5_max": 0.35639,
+         "nauc_map_at_5_std": 0.072348,
+         "nauc_map_at_5_diff1": 0.482395,
+         "nauc_map_at_10_max": 0.351947,
+         "nauc_map_at_10_std": 0.07463,
+         "nauc_map_at_10_diff1": 0.474548,
+         "nauc_map_at_20_max": 0.350042,
+         "nauc_map_at_20_std": 0.075925,
+         "nauc_map_at_20_diff1": 0.470969,
+         "nauc_map_at_100_max": 0.348872,
+         "nauc_map_at_100_std": 0.077765,
+         "nauc_map_at_100_diff1": 0.468543,
+         "nauc_map_at_1000_max": 0.34882,
+         "nauc_map_at_1000_std": 0.078076,
+         "nauc_map_at_1000_diff1": 0.468235,
+         "nauc_recall_at_1_max": 0.433229,
+         "nauc_recall_at_1_std": 0.05272,
+         "nauc_recall_at_1_diff1": 0.694146,
+         "nauc_recall_at_3_max": 0.34645,
+         "nauc_recall_at_3_std": 0.080612,
+         "nauc_recall_at_3_diff1": 0.429863,
+         "nauc_recall_at_5_max": 0.32222,
+         "nauc_recall_at_5_std": 0.091314,
+         "nauc_recall_at_5_diff1": 0.373002,
+         "nauc_recall_at_10_max": 0.286532,
+         "nauc_recall_at_10_std": 0.10102,
+         "nauc_recall_at_10_diff1": 0.314338,
+         "nauc_recall_at_20_max": 0.254315,
+         "nauc_recall_at_20_std": 0.109253,
+         "nauc_recall_at_20_diff1": 0.264912,
+         "nauc_recall_at_100_max": 0.201326,
+         "nauc_recall_at_100_std": 0.145151,
+         "nauc_recall_at_100_diff1": 0.179968,
+         "nauc_recall_at_1000_max": 0.135616,
+         "nauc_recall_at_1000_std": 0.178026,
+         "nauc_recall_at_1000_diff1": 0.055121,
+         "nauc_precision_at_1_max": 0.433229,
+         "nauc_precision_at_1_std": 0.05272,
+         "nauc_precision_at_1_diff1": 0.694146,
+         "nauc_precision_at_3_max": 0.34645,
+         "nauc_precision_at_3_std": 0.080612,
+         "nauc_precision_at_3_diff1": 0.429863,
+         "nauc_precision_at_5_max": 0.32222,
+         "nauc_precision_at_5_std": 0.091314,
+         "nauc_precision_at_5_diff1": 0.373002,
+         "nauc_precision_at_10_max": 0.286532,
+         "nauc_precision_at_10_std": 0.10102,
+         "nauc_precision_at_10_diff1": 0.314338,
+         "nauc_precision_at_20_max": 0.254315,
+         "nauc_precision_at_20_std": 0.109253,
+         "nauc_precision_at_20_diff1": 0.264912,
+         "nauc_precision_at_100_max": 0.201326,
+         "nauc_precision_at_100_std": 0.145151,
+         "nauc_precision_at_100_diff1": 0.179968,
+         "nauc_precision_at_1000_max": 0.135616,
+         "nauc_precision_at_1000_std": 0.178026,
+         "nauc_precision_at_1000_diff1": 0.055121,
+         "nauc_mrr_at_1_max": 0.433229,
+         "nauc_mrr_at_1_std": 0.05272,
+         "nauc_mrr_at_1_diff1": 0.694146,
+         "nauc_mrr_at_3_max": 0.443313,
+         "nauc_mrr_at_3_std": 0.066702,
+         "nauc_mrr_at_3_diff1": 0.65952,
+         "nauc_mrr_at_5_max": 0.444078,
+         "nauc_mrr_at_5_std": 0.068542,
+         "nauc_mrr_at_5_diff1": 0.656499,
+         "nauc_mrr_at_10_max": 0.443617,
+         "nauc_mrr_at_10_std": 0.069915,
+         "nauc_mrr_at_10_diff1": 0.655073,
+         "nauc_mrr_at_20_max": 0.443761,
+         "nauc_mrr_at_20_std": 0.070968,
+         "nauc_mrr_at_20_diff1": 0.654569,
+         "nauc_mrr_at_100_max": 0.443658,
+         "nauc_mrr_at_100_std": 0.071312,
+         "nauc_mrr_at_100_diff1": 0.654804,
+         "nauc_mrr_at_1000_max": 0.4436,
+         "nauc_mrr_at_1000_std": 0.071272,
+         "nauc_mrr_at_1000_diff1": 0.654842,
+         "hit_rate_at_1": 0.53167,
+         "hit_rate_at_3": 0.6551,
+         "hit_rate_at_5": 0.69507,
+         "hit_rate_at_10": 0.7422,
+         "hit_rate_at_20": 0.78285,
+         "hit_rate_at_100": 0.86226,
+         "hit_rate_at_1000": 0.93788,
+         "main_score": 0.43572,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 5434.81702709198,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/ImdbClassification.json ADDED
@@ -0,0 +1,140 @@
+ {
+   "dataset_revision": "3d86128a09e091d6018b6d26cad27f2739fc2db7",
+   "task_name": "ImdbClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.7298,
+             "f1": 0.728365,
+             "f1_weighted": 0.728365,
+             "precision": 0.73476,
+             "precision_weighted": 0.73476,
+             "recall": 0.7298,
+             "recall_weighted": 0.7298,
+             "ap": 0.67669,
+             "ap_weighted": 0.67669
+           },
+           {
+             "accuracy": 0.71244,
+             "f1": 0.709224,
+             "f1_weighted": 0.709224,
+             "precision": 0.722272,
+             "precision_weighted": 0.722272,
+             "recall": 0.71244,
+             "recall_weighted": 0.71244,
+             "ap": 0.663371,
+             "ap_weighted": 0.663371
+           },
+           {
+             "accuracy": 0.61212,
+             "f1": 0.608611,
+             "f1_weighted": 0.608611,
+             "precision": 0.61629,
+             "precision_weighted": 0.61629,
+             "recall": 0.61212,
+             "recall_weighted": 0.61212,
+             "ap": 0.566629,
+             "ap_weighted": 0.566629
+           },
+           {
+             "accuracy": 0.72956,
+             "f1": 0.729558,
+             "f1_weighted": 0.729558,
+             "precision": 0.729567,
+             "precision_weighted": 0.729567,
+             "recall": 0.72956,
+             "recall_weighted": 0.72956,
+             "ap": 0.667779,
+             "ap_weighted": 0.667779
+           },
+           {
+             "accuracy": 0.67284,
+             "f1": 0.67284,
+             "f1_weighted": 0.67284,
+             "precision": 0.67284,
+             "precision_weighted": 0.67284,
+             "recall": 0.67284,
+             "recall_weighted": 0.67284,
+             "ap": 0.616282,
+             "ap_weighted": 0.616282
+           },
+           {
+             "accuracy": 0.59808,
+             "f1": 0.590909,
+             "f1_weighted": 0.590909,
+             "precision": 0.605476,
+             "precision_weighted": 0.605476,
+             "recall": 0.59808,
+             "recall_weighted": 0.59808,
+             "ap": 0.556646,
+             "ap_weighted": 0.556646
+           },
+           {
+             "accuracy": 0.62644,
+             "f1": 0.625089,
+             "f1_weighted": 0.625089,
+             "precision": 0.62829,
+             "precision_weighted": 0.62829,
+             "recall": 0.62644,
+             "recall_weighted": 0.62644,
+             "ap": 0.577493,
+             "ap_weighted": 0.577493
+           },
+           {
+             "accuracy": 0.67036,
+             "f1": 0.667723,
+             "f1_weighted": 0.667723,
+             "precision": 0.675945,
+             "precision_weighted": 0.675945,
+             "recall": 0.67036,
+             "recall_weighted": 0.67036,
+             "ap": 0.609814,
+             "ap_weighted": 0.609814
+           },
+           {
+             "accuracy": 0.69948,
+             "f1": 0.699284,
+             "f1_weighted": 0.699284,
+             "precision": 0.700003,
+             "precision_weighted": 0.700003,
+             "recall": 0.69948,
+             "recall_weighted": 0.69948,
+             "ap": 0.637597,
+             "ap_weighted": 0.637597
+           },
+           {
+             "accuracy": 0.674,
+             "f1": 0.671423,
+             "f1_weighted": 0.671423,
+             "precision": 0.679635,
+             "precision_weighted": 0.679635,
+             "recall": 0.674,
+             "recall_weighted": 0.674,
+             "ap": 0.61272,
+             "ap_weighted": 0.61272
+           }
+         ],
+         "accuracy": 0.672512,
+         "f1": 0.670303,
+         "f1_weighted": 0.670303,
+         "precision": 0.676508,
+         "precision_weighted": 0.676508,
+         "recall": 0.672512,
+         "recall_weighted": 0.672512,
+         "ap": 0.618502,
+         "ap_weighted": 0.618502,
+         "main_score": 0.672512,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 206.38175559043884,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/MSMARCO.json ADDED
@@ -0,0 +1,167 @@
+ {
+   "dataset_revision": "c5a29a104738b98a9e76336939199e264163d4a0",
+   "task_name": "MSMARCO",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "dev": [
+       {
+         "ndcg_at_1": 0.13625,
+         "ndcg_at_3": 0.2019,
+         "ndcg_at_5": 0.22864,
+         "ndcg_at_10": 0.25726,
+         "ndcg_at_20": 0.27919,
+         "ndcg_at_100": 0.31389,
+         "ndcg_at_1000": 0.33934,
+         "map_at_1": 0.13321,
+         "map_at_3": 0.18416,
+         "map_at_5": 0.19895,
+         "map_at_10": 0.2109,
+         "map_at_20": 0.21701,
+         "map_at_100": 0.22177,
+         "map_at_1000": 0.22267,
+         "recall_at_1": 0.13321,
+         "recall_at_3": 0.25008,
+         "recall_at_5": 0.31439,
+         "recall_at_10": 0.40142,
+         "recall_at_20": 0.48681,
+         "recall_at_100": 0.67248,
+         "recall_at_1000": 0.87366,
+         "accuracy": 0.13321,
+         "precision_at_1": 0.13625,
+         "precision_at_3": 0.08567,
+         "precision_at_5": 0.0649,
+         "precision_at_10": 0.0417,
+         "precision_at_20": 0.02534,
+         "precision_at_100": 0.00706,
+         "precision_at_1000": 0.00093,
+         "mrr_at_1": 0.136246,
+         "mrr_at_3": 0.187512,
+         "mrr_at_5": 0.202627,
+         "mrr_at_10": 0.214528,
+         "mrr_at_20": 0.220597,
+         "mrr_at_100": 0.225219,
+         "mrr_at_1000": 0.226072,
+         "nauc_ndcg_at_1_max": 0.093399,
+         "nauc_ndcg_at_1_std": -0.08675,
+         "nauc_ndcg_at_1_diff1": 0.383353,
+         "nauc_ndcg_at_3_max": 0.082612,
+         "nauc_ndcg_at_3_std": -0.094617,
+         "nauc_ndcg_at_3_diff1": 0.318672,
+         "nauc_ndcg_at_5_max": 0.087284,
+         "nauc_ndcg_at_5_std": -0.087053,
+         "nauc_ndcg_at_5_diff1": 0.307274,
+         "nauc_ndcg_at_10_max": 0.098624,
+         "nauc_ndcg_at_10_std": -0.071552,
+         "nauc_ndcg_at_10_diff1": 0.295866,
+         "nauc_ndcg_at_20_max": 0.104107,
+         "nauc_ndcg_at_20_std": -0.056058,
+         "nauc_ndcg_at_20_diff1": 0.294653,
+         "nauc_ndcg_at_100_max": 0.118627,
+         "nauc_ndcg_at_100_std": -0.022275,
+         "nauc_ndcg_at_100_diff1": 0.29013,
+         "nauc_ndcg_at_1000_max": 0.122129,
+         "nauc_ndcg_at_1000_std": -0.020528,
+         "nauc_ndcg_at_1000_diff1": 0.294577,
+         "nauc_map_at_1_max": 0.092532,
+         "nauc_map_at_1_std": -0.086836,
+         "nauc_map_at_1_diff1": 0.383733,
+         "nauc_map_at_3_max": 0.084415,
+         "nauc_map_at_3_std": -0.093965,
+         "nauc_map_at_3_diff1": 0.33129,
+         "nauc_map_at_5_max": 0.086949,
+         "nauc_map_at_5_std": -0.0898,
+         "nauc_map_at_5_diff1": 0.324339,
+         "nauc_map_at_10_max": 0.09202,
+         "nauc_map_at_10_std": -0.082923,
+         "nauc_map_at_10_diff1": 0.319518,
+         "nauc_map_at_20_max": 0.093919,
+         "nauc_map_at_20_std": -0.078417,
+         "nauc_map_at_20_diff1": 0.319325,
+         "nauc_map_at_100_max": 0.09599,
+         "nauc_map_at_100_std": -0.073679,
+         "nauc_map_at_100_diff1": 0.318714,
+         "nauc_map_at_1000_max": 0.096211,
+         "nauc_map_at_1000_std": -0.073447,
+         "nauc_map_at_1000_diff1": 0.318925,
+         "nauc_recall_at_1_max": 0.092532,
+         "nauc_recall_at_1_std": -0.086836,
+         "nauc_recall_at_1_diff1": 0.383733,
+         "nauc_recall_at_3_max": 0.078584,
+         "nauc_recall_at_3_std": -0.096077,
+         "nauc_recall_at_3_diff1": 0.288074,
+         "nauc_recall_at_5_max": 0.088485,
+         "nauc_recall_at_5_std": -0.079656,
+         "nauc_recall_at_5_diff1": 0.266764,
+         "nauc_recall_at_10_max": 0.115682,
+         "nauc_recall_at_10_std": -0.042055,
+         "nauc_recall_at_10_diff1": 0.23882,
+         "nauc_recall_at_20_max": 0.131926,
+         "nauc_recall_at_20_std": 0.008761,
+         "nauc_recall_at_20_diff1": 0.232895,
+         "nauc_recall_at_100_max": 0.215249,
+         "nauc_recall_at_100_std": 0.207541,
+         "nauc_recall_at_100_diff1": 0.191877,
+         "nauc_recall_at_1000_max": 0.371403,
+         "nauc_recall_at_1000_std": 0.48817,
+         "nauc_recall_at_1000_diff1": 0.147312,
+         "nauc_precision_at_1_max": 0.093399,
+         "nauc_precision_at_1_std": -0.08675,
+         "nauc_precision_at_1_diff1": 0.383353,
+         "nauc_precision_at_3_max": 0.081004,
+         "nauc_precision_at_3_std": -0.095384,
+         "nauc_precision_at_3_diff1": 0.288779,
+         "nauc_precision_at_5_max": 0.090831,
+         "nauc_precision_at_5_std": -0.078907,
+         "nauc_precision_at_5_diff1": 0.266908,
+         "nauc_precision_at_10_max": 0.121537,
+         "nauc_precision_at_10_std": -0.035889,
+         "nauc_precision_at_10_diff1": 0.238216,
+         "nauc_precision_at_20_max": 0.139064,
+         "nauc_precision_at_20_std": 0.016142,
+         "nauc_precision_at_20_diff1": 0.227652,
+         "nauc_precision_at_100_max": 0.220797,
+         "nauc_precision_at_100_std": 0.208055,
+         "nauc_precision_at_100_diff1": 0.174679,
+         "nauc_precision_at_1000_max": 0.299085,
+         "nauc_precision_at_1000_std": 0.347796,
+         "nauc_precision_at_1000_diff1": 0.082151,
+         "nauc_mrr_at_1_max": 0.093399,
+         "nauc_mrr_at_1_std": -0.08675,
+         "nauc_mrr_at_1_diff1": 0.383353,
+         "nauc_mrr_at_3_max": 0.083923,
+         "nauc_mrr_at_3_std": -0.093443,
+         "nauc_mrr_at_3_diff1": 0.331226,
+         "nauc_mrr_at_5_max": 0.086591,
+         "nauc_mrr_at_5_std": -0.08909,
+         "nauc_mrr_at_5_diff1": 0.323781,
+         "nauc_mrr_at_10_max": 0.091846,
+         "nauc_mrr_at_10_std": -0.081979,
+         "nauc_mrr_at_10_diff1": 0.318344,
+         "nauc_mrr_at_20_max": 0.093517,
+         "nauc_mrr_at_20_std": -0.0774,
+         "nauc_mrr_at_20_diff1": 0.318073,
+         "nauc_mrr_at_100_max": 0.095405,
+         "nauc_mrr_at_100_std": -0.073061,
+         "nauc_mrr_at_100_diff1": 0.31747,
+         "nauc_mrr_at_1000_max": 0.095563,
+         "nauc_mrr_at_1000_std": -0.072916,
+         "nauc_mrr_at_1000_diff1": 0.317692,
+         "hit_rate_at_1": 0.13625,
+         "hit_rate_at_3": 0.25501,
+         "hit_rate_at_5": 0.3212,
+         "hit_rate_at_10": 0.40989,
+         "hit_rate_at_20": 0.49628,
+         "hit_rate_at_100": 0.68209,
+         "hit_rate_at_1000": 0.88052,
+         "main_score": 0.25726,
+         "hf_subset": "default",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 9067.453301429749,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/MTOPDomainClassification.json ADDED
@@ -0,0 +1,270 @@
+ {
+   "dataset_revision": "a76d16fae880597b9c73047b50159220a441cb54",
+   "task_name": "MTOPDomainClassification",
+   "mteb_version": "2.10.7",
+   "scores": {
+     "validation": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.849664,
+             "f1": 0.842408,
+             "f1_weighted": 0.847871,
+             "precision": 0.841648,
+             "precision_weighted": 0.853421,
+             "recall": 0.850454,
+             "recall_weighted": 0.849664,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.860403,
+             "f1": 0.86009,
+             "f1_weighted": 0.860031,
+             "precision": 0.858299,
+             "precision_weighted": 0.864566,
+             "recall": 0.86699,
+             "recall_weighted": 0.860403,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.849217,
+             "f1": 0.846257,
+             "f1_weighted": 0.848173,
+             "precision": 0.841765,
+             "precision_weighted": 0.850066,
+             "recall": 0.853366,
+             "recall_weighted": 0.849217,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.867114,
+             "f1": 0.866079,
+             "f1_weighted": 0.868196,
+             "precision": 0.865395,
+             "precision_weighted": 0.879134,
+             "recall": 0.875804,
+             "recall_weighted": 0.867114,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.856823,
+             "f1": 0.853063,
+             "f1_weighted": 0.856802,
+             "precision": 0.849901,
+             "precision_weighted": 0.861268,
+             "recall": 0.861489,
+             "recall_weighted": 0.856823,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.846085,
+             "f1": 0.843752,
+             "f1_weighted": 0.843295,
+             "precision": 0.838869,
+             "precision_weighted": 0.846904,
+             "recall": 0.854694,
+             "recall_weighted": 0.846085,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.853691,
+             "f1": 0.849989,
+             "f1_weighted": 0.852459,
+             "precision": 0.844715,
+             "precision_weighted": 0.85399,
+             "recall": 0.857996,
+             "recall_weighted": 0.853691,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.859955,
+             "f1": 0.853935,
+             "f1_weighted": 0.861136,
+             "precision": 0.849671,
+             "precision_weighted": 0.866908,
+             "recall": 0.863457,
+             "recall_weighted": 0.859955,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.858613,
+             "f1": 0.857104,
+             "f1_weighted": 0.858136,
+             "precision": 0.853352,
+             "precision_weighted": 0.861144,
+             "recall": 0.864174,
+             "recall_weighted": 0.858613,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.861745,
+             "f1": 0.857971,
+             "f1_weighted": 0.860317,
+             "precision": 0.858812,
+             "precision_weighted": 0.862096,
+             "recall": 0.860289,
+             "recall_weighted": 0.861745,
+             "ap": null,
+             "ap_weighted": null
+           }
+         ],
+         "accuracy": 0.856331,
+         "f1": 0.853065,
+         "f1_weighted": 0.855642,
+         "precision": 0.850243,
+         "precision_weighted": 0.85995,
+         "recall": 0.860871,
+         "recall_weighted": 0.856331,
+         "ap": NaN,
+         "ap_weighted": NaN,
+         "main_score": 0.856331,
+         "hf_subset": "en",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ],
+     "test": [
+       {
+         "scores_per_experiment": [
+           {
+             "accuracy": 0.840857,
+             "f1": 0.83241,
+             "f1_weighted": 0.83953,
+             "precision": 0.83038,
+             "precision_weighted": 0.846361,
+             "recall": 0.842637,
+             "recall_weighted": 0.840857,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.846101,
+             "f1": 0.840324,
+             "f1_weighted": 0.84584,
+             "precision": 0.836554,
+             "precision_weighted": 0.852249,
+             "recall": 0.851363,
+             "recall_weighted": 0.846101,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.849977,
+             "f1": 0.847455,
+             "f1_weighted": 0.849387,
+             "precision": 0.840706,
+             "precision_weighted": 0.853674,
+             "recall": 0.859574,
+             "recall_weighted": 0.849977,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.856589,
+             "f1": 0.851326,
+             "f1_weighted": 0.857063,
+             "precision": 0.848602,
+             "precision_weighted": 0.869493,
+             "recall": 0.864798,
+             "recall_weighted": 0.856589,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.852485,
+             "f1": 0.846573,
+             "f1_weighted": 0.85384,
+             "precision": 0.842506,
+             "precision_weighted": 0.86308,
+             "recall": 0.860203,
+             "recall_weighted": 0.852485,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.842909,
+             "f1": 0.838147,
+             "f1_weighted": 0.841086,
+             "precision": 0.829988,
+             "precision_weighted": 0.846932,
+             "recall": 0.85314,
+             "recall_weighted": 0.842909,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.848609,
+             "f1": 0.842065,
+             "f1_weighted": 0.847148,
+             "precision": 0.835135,
+             "precision_weighted": 0.850091,
+             "recall": 0.853408,
+             "recall_weighted": 0.848609,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.859553,
+             "f1": 0.84798,
+             "f1_weighted": 0.861298,
+             "precision": 0.845394,
+             "precision_weighted": 0.868938,
+             "recall": 0.858955,
+             "recall_weighted": 0.859553,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.851345,
+             "f1": 0.844526,
+             "f1_weighted": 0.8515,
+             "precision": 0.838136,
+             "precision_weighted": 0.857778,
+             "recall": 0.856986,
+             "recall_weighted": 0.851345,
+             "ap": null,
+             "ap_weighted": null
+           },
+           {
+             "accuracy": 0.853397,
+             "f1": 0.845712,
+             "f1_weighted": 0.853477,
+             "precision": 0.842757,
+             "precision_weighted": 0.858192,
+             "recall": 0.853964,
+             "recall_weighted": 0.853397,
+             "ap": null,
+             "ap_weighted": null
+           }
+         ],
+         "accuracy": 0.850182,
+         "f1": 0.843652,
+         "f1_weighted": 0.850017,
+         "precision": 0.839016,
+         "precision_weighted": 0.856679,
+         "recall": 0.855503,
+         "recall_weighted": 0.850182,
+         "ap": NaN,
+         "ap_weighted": NaN,
+         "main_score": 0.850182,
+         "hf_subset": "en",
+         "languages": [
+           "eng-Latn"
+         ]
+       }
+     ]
+   },
+   "evaluation_time": 51.17427062988281,
+   "kg_co2_emissions": null,
+   "date": null
+ }
results/MTOPIntentClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "2992d820f31312593c49a4890430aadadb0f0039",
3
+ "task_name": "MTOPIntentClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.516331,
11
+ "f1": 0.366147,
12
+ "f1_weighted": 0.53159,
13
+ "precision": 0.364296,
14
+ "precision_weighted": 0.763314,
15
+ "recall": 0.572068,
16
+ "recall_weighted": 0.516331,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.558389,
22
+ "f1": 0.370471,
23
+ "f1_weighted": 0.578883,
24
+ "precision": 0.364228,
25
+ "precision_weighted": 0.753886,
26
+ "recall": 0.555104,
27
+ "recall_weighted": 0.558389,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.555257,
33
+ "f1": 0.370618,
34
+ "f1_weighted": 0.581559,
35
+ "precision": 0.357602,
36
+ "precision_weighted": 0.758434,
37
+ "recall": 0.547853,
38
+ "recall_weighted": 0.555257,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.537808,
44
+ "f1": 0.376547,
45
+ "f1_weighted": 0.559899,
46
+ "precision": 0.376016,
47
+ "precision_weighted": 0.769687,
48
+ "recall": 0.569106,
49
+ "recall_weighted": 0.537808,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.560179,
55
+ "f1": 0.396464,
56
+ "f1_weighted": 0.578002,
57
+ "precision": 0.3904,
58
+ "precision_weighted": 0.749415,
59
+ "recall": 0.572465,
60
+ "recall_weighted": 0.560179,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.548993,
66
+ "f1": 0.389363,
67
+ "f1_weighted": 0.566242,
68
+ "precision": 0.382483,
69
+ "precision_weighted": 0.741325,
70
+ "recall": 0.559683,
71
+ "recall_weighted": 0.548993,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.543177,
77
+ "f1": 0.377522,
78
+ "f1_weighted": 0.571187,
79
+ "precision": 0.376821,
80
+ "precision_weighted": 0.785483,
81
+ "recall": 0.552711,
82
+ "recall_weighted": 0.543177,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.567785,
88
+ "f1": 0.382146,
89
+ "f1_weighted": 0.59347,
90
+ "precision": 0.380292,
91
+ "precision_weighted": 0.7728,
92
+ "recall": 0.566462,
93
+ "recall_weighted": 0.567785,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.547204,
99
+ "f1": 0.377268,
100
+ "f1_weighted": 0.567397,
101
+ "precision": 0.373497,
102
+ "precision_weighted": 0.752202,
103
+ "recall": 0.564719,
104
+ "recall_weighted": 0.547204,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.590157,
110
+ "f1": 0.405506,
111
+ "f1_weighted": 0.61183,
112
+ "precision": 0.392285,
113
+ "precision_weighted": 0.771855,
114
+ "recall": 0.577897,
115
+ "recall_weighted": 0.590157,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.552528,
121
+ "f1": 0.381205,
122
+ "f1_weighted": 0.574006,
123
+ "precision": 0.375792,
124
+ "precision_weighted": 0.76184,
125
+ "recall": 0.563807,
126
+ "recall_weighted": 0.552528,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.552528,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.502052,
141
+ "f1": 0.34166,
142
+ "f1_weighted": 0.515926,
143
+ "precision": 0.333103,
144
+ "precision_weighted": 0.746089,
145
+ "recall": 0.570769,
146
+ "recall_weighted": 0.502052,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.545372,
152
+ "f1": 0.371444,
153
+ "f1_weighted": 0.56869,
154
+ "precision": 0.365886,
155
+ "precision_weighted": 0.776895,
156
+ "recall": 0.591148,
157
+ "recall_weighted": 0.545372,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.563839,
163
+ "f1": 0.386771,
164
+ "f1_weighted": 0.590108,
165
+ "precision": 0.371579,
166
+ "precision_weighted": 0.762373,
167
+ "recall": 0.585059,
168
+ "recall_weighted": 0.563839,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.53648,
174
+ "f1": 0.368875,
175
+ "f1_weighted": 0.555746,
176
+ "precision": 0.358243,
177
+ "precision_weighted": 0.758161,
178
+ "recall": 0.582167,
179
+ "recall_weighted": 0.53648,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.540812,
185
+ "f1": 0.37694,
186
+ "f1_weighted": 0.559379,
187
+ "precision": 0.36322,
188
+ "precision_weighted": 0.75342,
189
+ "recall": 0.585729,
190
+ "recall_weighted": 0.540812,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.540128,
196
+ "f1": 0.38034,
197
+ "f1_weighted": 0.55082,
198
+ "precision": 0.35729,
199
+ "precision_weighted": 0.727782,
200
+ "recall": 0.582046,
201
+ "recall_weighted": 0.540128,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.531236,
207
+ "f1": 0.378585,
208
+ "f1_weighted": 0.55746,
209
+ "precision": 0.372597,
210
+ "precision_weighted": 0.76349,
211
+ "recall": 0.583202,
212
+ "recall_weighted": 0.531236,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.55358,
218
+ "f1": 0.38234,
219
+ "f1_weighted": 0.578064,
220
+ "precision": 0.371971,
221
+ "precision_weighted": 0.78122,
222
+ "recall": 0.592112,
223
+ "recall_weighted": 0.55358,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.54788,
229
+ "f1": 0.389745,
230
+ "f1_weighted": 0.569258,
231
+ "precision": 0.375039,
232
+ "precision_weighted": 0.767314,
233
+ "recall": 0.617152,
234
+ "recall_weighted": 0.54788,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.567715,
240
+ "f1": 0.381227,
241
+ "f1_weighted": 0.585695,
242
+ "precision": 0.363215,
243
+ "precision_weighted": 0.75433,
244
+ "recall": 0.587271,
245
+ "recall_weighted": 0.567715,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.542909,
251
+ "f1": 0.375793,
252
+ "f1_weighted": 0.563115,
253
+ "precision": 0.363214,
254
+ "precision_weighted": 0.759107,
255
+ "recall": 0.587666,
256
+ "recall_weighted": 0.542909,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.542909,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 118.27064609527588,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MassiveIntentClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "4672e20407010da34463acc759c162ca9734bca6",
3
+ "task_name": "MassiveIntentClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.604525,
11
+ "f1": 0.564468,
12
+ "f1_weighted": 0.591358,
13
+ "precision": 0.562117,
14
+ "precision_weighted": 0.650289,
15
+ "recall": 0.651761,
16
+ "recall_weighted": 0.604525,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.649287,
22
+ "f1": 0.611558,
23
+ "f1_weighted": 0.642462,
24
+ "precision": 0.596179,
25
+ "precision_weighted": 0.686891,
26
+ "recall": 0.692154,
27
+ "recall_weighted": 0.649287,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.620266,
33
+ "f1": 0.574451,
34
+ "f1_weighted": 0.602432,
35
+ "precision": 0.563445,
36
+ "precision_weighted": 0.637167,
37
+ "recall": 0.669354,
38
+ "recall_weighted": 0.620266,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.625676,
44
+ "f1": 0.575931,
45
+ "f1_weighted": 0.60735,
46
+ "precision": 0.575796,
47
+ "precision_weighted": 0.69068,
48
+ "recall": 0.669621,
49
+ "recall_weighted": 0.625676,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.618298,
55
+ "f1": 0.579576,
56
+ "f1_weighted": 0.602698,
57
+ "precision": 0.566206,
58
+ "precision_weighted": 0.649519,
59
+ "recall": 0.683943,
60
+ "recall_weighted": 0.618298,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.578455,
66
+ "f1": 0.554946,
67
+ "f1_weighted": 0.558671,
68
+ "precision": 0.562892,
69
+ "precision_weighted": 0.65798,
70
+ "recall": 0.656489,
71
+ "recall_weighted": 0.578455,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.605509,
77
+ "f1": 0.576624,
78
+ "f1_weighted": 0.591015,
79
+ "precision": 0.581767,
80
+ "precision_weighted": 0.646599,
81
+ "recall": 0.660639,
82
+ "recall_weighted": 0.605509,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.595671,
88
+ "f1": 0.573045,
89
+ "f1_weighted": 0.574776,
90
+ "precision": 0.575893,
91
+ "precision_weighted": 0.65593,
92
+ "recall": 0.668549,
93
+ "recall_weighted": 0.595671,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.595671,
99
+ "f1": 0.562687,
100
+ "f1_weighted": 0.566757,
101
+ "precision": 0.589109,
102
+ "precision_weighted": 0.69285,
103
+ "recall": 0.662312,
104
+ "recall_weighted": 0.595671,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.631579,
110
+ "f1": 0.590278,
111
+ "f1_weighted": 0.622264,
112
+ "precision": 0.589255,
113
+ "precision_weighted": 0.683462,
114
+ "recall": 0.674086,
115
+ "recall_weighted": 0.631579,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.612494,
121
+ "f1": 0.576356,
122
+ "f1_weighted": 0.595978,
123
+ "precision": 0.576266,
124
+ "precision_weighted": 0.665137,
125
+ "recall": 0.668891,
126
+ "recall_weighted": 0.612494,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.612494,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.611634,
141
+ "f1": 0.585785,
142
+ "f1_weighted": 0.599458,
143
+ "precision": 0.581867,
144
+ "precision_weighted": 0.668286,
145
+ "recall": 0.689783,
146
+ "recall_weighted": 0.611634,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.648285,
152
+ "f1": 0.625164,
153
+ "f1_weighted": 0.640539,
154
+ "precision": 0.60389,
155
+ "precision_weighted": 0.685165,
156
+ "recall": 0.704445,
157
+ "recall_weighted": 0.648285,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.605918,
163
+ "f1": 0.577655,
164
+ "f1_weighted": 0.592666,
165
+ "precision": 0.569684,
166
+ "precision_weighted": 0.65938,
167
+ "recall": 0.667719,
168
+ "recall_weighted": 0.605918,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.630464,
174
+ "f1": 0.597327,
175
+ "f1_weighted": 0.613853,
176
+ "precision": 0.584902,
177
+ "precision_weighted": 0.671286,
178
+ "recall": 0.68058,
179
+ "recall_weighted": 0.630464,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.603564,
185
+ "f1": 0.582215,
186
+ "f1_weighted": 0.590151,
187
+ "precision": 0.580583,
188
+ "precision_weighted": 0.65609,
189
+ "recall": 0.687555,
190
+ "recall_weighted": 0.603564,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.566241,
196
+ "f1": 0.560964,
197
+ "f1_weighted": 0.55125,
198
+ "precision": 0.567787,
199
+ "precision_weighted": 0.668481,
200
+ "recall": 0.654699,
201
+ "recall_weighted": 0.566241,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.6039,
207
+ "f1": 0.583179,
208
+ "f1_weighted": 0.586948,
209
+ "precision": 0.572859,
210
+ "precision_weighted": 0.650546,
211
+ "recall": 0.694353,
212
+ "recall_weighted": 0.6039,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.597512,
218
+ "f1": 0.590263,
219
+ "f1_weighted": 0.579391,
220
+ "precision": 0.580378,
221
+ "precision_weighted": 0.652295,
222
+ "recall": 0.680761,
223
+ "recall_weighted": 0.597512,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.591459,
229
+ "f1": 0.581459,
230
+ "f1_weighted": 0.562673,
231
+ "precision": 0.591208,
232
+ "precision_weighted": 0.651962,
233
+ "recall": 0.683426,
234
+ "recall_weighted": 0.591459,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.624412,
240
+ "f1": 0.607599,
241
+ "f1_weighted": 0.616047,
242
+ "precision": 0.597789,
243
+ "precision_weighted": 0.679811,
244
+ "recall": 0.707667,
245
+ "recall_weighted": 0.624412,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.608339,
251
+ "f1": 0.589161,
252
+ "f1_weighted": 0.593298,
253
+ "precision": 0.583095,
254
+ "precision_weighted": 0.66433,
255
+ "recall": 0.685099,
256
+ "recall_weighted": 0.608339,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.608339,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 93.8818793296814,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MassiveScenarioClassification.json ADDED
@@ -0,0 +1,270 @@
1
+ {
2
+ "dataset_revision": "fad2c6e8459f9e1c45d9315f4953d921437d70f8",
3
+ "task_name": "MassiveScenarioClassification",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "validation": [
7
+ {
8
+ "scores_per_experiment": [
9
+ {
10
+ "accuracy": 0.71274,
11
+ "f1": 0.697544,
12
+ "f1_weighted": 0.711116,
13
+ "precision": 0.674548,
14
+ "precision_weighted": 0.741975,
15
+ "recall": 0.759949,
16
+ "recall_weighted": 0.71274,
17
+ "ap": null,
18
+ "ap_weighted": null
19
+ },
20
+ {
21
+ "accuracy": 0.709788,
22
+ "f1": 0.693885,
23
+ "f1_weighted": 0.710241,
24
+ "precision": 0.67634,
25
+ "precision_weighted": 0.752904,
26
+ "recall": 0.758393,
27
+ "recall_weighted": 0.709788,
28
+ "ap": null,
29
+ "ap_weighted": null
30
+ },
31
+ {
32
+ "accuracy": 0.705853,
33
+ "f1": 0.686789,
34
+ "f1_weighted": 0.707665,
35
+ "precision": 0.670715,
36
+ "precision_weighted": 0.744714,
37
+ "recall": 0.746243,
38
+ "recall_weighted": 0.705853,
39
+ "ap": null,
40
+ "ap_weighted": null
41
+ },
42
+ {
43
+ "accuracy": 0.706345,
44
+ "f1": 0.695964,
45
+ "f1_weighted": 0.706199,
46
+ "precision": 0.683217,
47
+ "precision_weighted": 0.74656,
48
+ "recall": 0.758882,
49
+ "recall_weighted": 0.706345,
50
+ "ap": null,
51
+ "ap_weighted": null
52
+ },
53
+ {
54
+ "accuracy": 0.689129,
55
+ "f1": 0.663394,
56
+ "f1_weighted": 0.683663,
57
+ "precision": 0.654199,
58
+ "precision_weighted": 0.731903,
59
+ "recall": 0.730987,
60
+ "recall_weighted": 0.689129,
61
+ "ap": null,
62
+ "ap_weighted": null
63
+ },
64
+ {
65
+ "accuracy": 0.652238,
66
+ "f1": 0.64493,
67
+ "f1_weighted": 0.649579,
68
+ "precision": 0.645171,
69
+ "precision_weighted": 0.73168,
70
+ "recall": 0.720297,
71
+ "recall_weighted": 0.652238,
72
+ "ap": null,
73
+ "ap_weighted": null
74
+ },
75
+ {
76
+ "accuracy": 0.679292,
77
+ "f1": 0.667111,
78
+ "f1_weighted": 0.678738,
79
+ "precision": 0.655266,
80
+ "precision_weighted": 0.724571,
81
+ "recall": 0.729045,
82
+ "recall_weighted": 0.679292,
83
+ "ap": null,
84
+ "ap_weighted": null
85
+ },
86
+ {
87
+ "accuracy": 0.700443,
88
+ "f1": 0.684198,
89
+ "f1_weighted": 0.701108,
90
+ "precision": 0.667236,
91
+ "precision_weighted": 0.73053,
92
+ "recall": 0.736963,
93
+ "recall_weighted": 0.700443,
94
+ "ap": null,
95
+ "ap_weighted": null
96
+ },
97
+ {
98
+ "accuracy": 0.713232,
99
+ "f1": 0.701666,
100
+ "f1_weighted": 0.711765,
101
+ "precision": 0.694783,
102
+ "precision_weighted": 0.752493,
103
+ "recall": 0.75776,
104
+ "recall_weighted": 0.713232,
105
+ "ap": null,
106
+ "ap_weighted": null
107
+ },
108
+ {
109
+ "accuracy": 0.684702,
110
+ "f1": 0.672365,
111
+ "f1_weighted": 0.681204,
112
+ "precision": 0.660414,
113
+ "precision_weighted": 0.726153,
114
+ "recall": 0.74038,
115
+ "recall_weighted": 0.684702,
116
+ "ap": null,
117
+ "ap_weighted": null
118
+ }
119
+ ],
120
+ "accuracy": 0.695376,
121
+ "f1": 0.680785,
122
+ "f1_weighted": 0.694128,
123
+ "precision": 0.668189,
124
+ "precision_weighted": 0.738348,
125
+ "recall": 0.74389,
126
+ "recall_weighted": 0.695376,
127
+ "ap": NaN,
128
+ "ap_weighted": NaN,
129
+ "main_score": 0.695376,
130
+ "hf_subset": "en",
131
+ "languages": [
132
+ "eng-Latn"
133
+ ]
134
+ }
135
+ ],
136
+ "test": [
137
+ {
138
+ "scores_per_experiment": [
139
+ {
140
+ "accuracy": 0.712172,
141
+ "f1": 0.703247,
142
+ "f1_weighted": 0.710795,
143
+ "precision": 0.683117,
144
+ "precision_weighted": 0.738134,
145
+ "recall": 0.758053,
146
+ "recall_weighted": 0.712172,
147
+ "ap": null,
148
+ "ap_weighted": null
149
+ },
150
+ {
151
+ "accuracy": 0.709482,
152
+ "f1": 0.694076,
153
+ "f1_weighted": 0.706871,
154
+ "precision": 0.673087,
155
+ "precision_weighted": 0.738632,
156
+ "recall": 0.752876,
157
+ "recall_weighted": 0.709482,
158
+ "ap": null,
159
+ "ap_weighted": null
160
+ },
161
+ {
162
+ "accuracy": 0.70881,
163
+ "f1": 0.692114,
164
+ "f1_weighted": 0.707291,
165
+ "precision": 0.67241,
166
+ "precision_weighted": 0.736,
167
+ "recall": 0.749129,
168
+ "recall_weighted": 0.70881,
169
+ "ap": null,
170
+ "ap_weighted": null
171
+ },
172
+ {
173
+ "accuracy": 0.70074,
174
+ "f1": 0.688515,
175
+ "f1_weighted": 0.701795,
176
+ "precision": 0.674293,
177
+ "precision_weighted": 0.737378,
178
+ "recall": 0.7451,
179
+ "recall_weighted": 0.70074,
180
+ "ap": null,
181
+ "ap_weighted": null
182
+ },
183
+ {
184
+ "accuracy": 0.687626,
185
+ "f1": 0.670019,
186
+ "f1_weighted": 0.682471,
187
+ "precision": 0.658161,
188
+ "precision_weighted": 0.724946,
189
+ "recall": 0.731061,
190
+ "recall_weighted": 0.687626,
191
+ "ap": null,
192
+ "ap_weighted": null
193
+ },
194
+ {
195
+ "accuracy": 0.665098,
196
+ "f1": 0.655494,
197
+ "f1_weighted": 0.663884,
198
+ "precision": 0.652821,
199
+ "precision_weighted": 0.73089,
200
+ "recall": 0.722243,
201
+ "recall_weighted": 0.665098,
202
+ "ap": null,
203
+ "ap_weighted": null
204
+ },
205
+ {
206
+ "accuracy": 0.693006,
207
+ "f1": 0.6792,
208
+ "f1_weighted": 0.694351,
209
+ "precision": 0.668179,
210
+ "precision_weighted": 0.740254,
211
+ "recall": 0.739231,
212
+ "recall_weighted": 0.693006,
213
+ "ap": null,
214
+ "ap_weighted": null
215
+ },
216
+ {
217
+ "accuracy": 0.696369,
218
+ "f1": 0.686446,
219
+ "f1_weighted": 0.698406,
220
+ "precision": 0.672031,
221
+ "precision_weighted": 0.732475,
222
+ "recall": 0.738152,
223
+ "recall_weighted": 0.696369,
224
+ "ap": null,
225
+ "ap_weighted": null
226
+ },
227
+ {
228
+ "accuracy": 0.69805,
229
+ "f1": 0.6874,
230
+ "f1_weighted": 0.695055,
231
+ "precision": 0.678027,
232
+ "precision_weighted": 0.735196,
233
+ "recall": 0.743339,
234
+ "recall_weighted": 0.69805,
235
+ "ap": null,
236
+ "ap_weighted": null
237
+ },
238
+ {
239
+ "accuracy": 0.671486,
240
+ "f1": 0.659812,
241
+ "f1_weighted": 0.668703,
242
+ "precision": 0.648421,
243
+ "precision_weighted": 0.71395,
244
+ "recall": 0.726887,
245
+ "recall_weighted": 0.671486,
246
+ "ap": null,
247
+ "ap_weighted": null
248
+ }
249
+ ],
250
+ "accuracy": 0.694284,
251
+ "f1": 0.681632,
252
+ "f1_weighted": 0.692962,
253
+ "precision": 0.668055,
254
+ "precision_weighted": 0.732785,
255
+ "recall": 0.740607,
256
+ "recall_weighted": 0.694284,
257
+ "ap": NaN,
258
+ "ap_weighted": NaN,
259
+ "main_score": 0.694284,
260
+ "hf_subset": "en",
261
+ "languages": [
262
+ "eng-Latn"
263
+ ]
264
+ }
265
+ ]
266
+ },
267
+ "evaluation_time": 37.04043388366699,
268
+ "kg_co2_emissions": null,
269
+ "date": null
270
+ }
results/MedrxivClusteringP2P.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "e7a26af6f3ae46b30dde8737f02c07b1505bcc73",
+ "task_name": "MedrxivClusteringP2P",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.308807,
+ "v_measure_std": 0.013837,
+ "v_measures": [
+ 0.29951,
+ 0.30597,
+ 0.28839,
+ 0.289771,
+ 0.297263,
+ 0.321266,
+ 0.329566,
+ 0.316215,
+ 0.322662,
+ 0.317455
+ ],
+ "main_score": 0.308807,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 49.264570474624634,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/MedrxivClusteringS2S.json ADDED
@@ -0,0 +1,33 @@
+ {
+ "dataset_revision": "35191c8c0dca72d8ff3efcd72aa802307d469663",
+ "task_name": "MedrxivClusteringS2S",
+ "mteb_version": "2.10.7",
+ "scores": {
+ "test": [
+ {
+ "v_measure": 0.253509,
+ "v_measure_std": 0.012472,
+ "v_measures": [
+ 0.249246,
+ 0.235379,
+ 0.247575,
+ 0.244391,
+ 0.239675,
+ 0.270499,
+ 0.249992,
+ 0.259221,
+ 0.274573,
+ 0.264539
+ ],
+ "main_score": 0.253509,
+ "hf_subset": "default",
+ "languages": [
+ "eng-Latn"
+ ]
+ }
+ ]
+ },
+ "evaluation_time": 39.75427198410034,
+ "kg_co2_emissions": null,
+ "date": null
+ }
results/MindSmallReranking.json ADDED
@@ -0,0 +1,252 @@
1
+ {
2
+ "dataset_revision": "227478e3235572039f4f7661840e059f31ef6eb1",
3
+ "task_name": "MindSmallReranking",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.12598,
9
+ "ndcg_at_3": 0.20216,
10
+ "ndcg_at_5": 0.24835,
11
+ "ndcg_at_10": 0.31204,
12
+ "ndcg_at_20": 0.36539,
13
+ "ndcg_at_100": 0.42798,
14
+ "ndcg_at_1000": 0.43128,
15
+ "map_at_1": 0.0956,
16
+ "map_at_3": 0.16528,
17
+ "map_at_5": 0.19258,
18
+ "map_at_10": 0.22112,
19
+ "map_at_20": 0.23859,
20
+ "map_at_100": 0.25209,
21
+ "map_at_1000": 0.25252,
22
+ "recall_at_1": 0.0956,
23
+ "recall_at_3": 0.25232,
24
+ "recall_at_5": 0.36225,
25
+ "recall_at_10": 0.54217,
26
+ "recall_at_20": 0.72363,
27
+ "recall_at_100": 0.98401,
28
+ "recall_at_1000": 1.0,
29
+ "accuracy": 0.0956,
30
+ "precision_at_1": 0.12598,
31
+ "precision_at_3": 0.11221,
32
+ "precision_at_5": 0.09928,
33
+ "precision_at_10": 0.07839,
34
+ "precision_at_20": 0.05605,
35
+ "precision_at_100": 0.01768,
36
+ "precision_at_1000": 0.00183,
37
+ "mrr_at_1": 0.125981,
38
+ "mrr_at_3": 0.209588,
39
+ "mrr_at_5": 0.239351,
40
+ "mrr_at_10": 0.26572,
41
+ "mrr_at_20": 0.277943,
42
+ "mrr_at_100": 0.283183,
43
+ "mrr_at_1000": 0.283225,
44
+ "nauc_ndcg_at_1_max": -0.104201,
45
+ "nauc_ndcg_at_1_std": 0.001288,
46
+ "nauc_ndcg_at_1_diff1": 0.095348,
47
+ "nauc_ndcg_at_3_max": -0.211544,
48
+ "nauc_ndcg_at_3_std": -0.029752,
49
+ "nauc_ndcg_at_3_diff1": 0.115933,
50
+ "nauc_ndcg_at_5_max": -0.244515,
51
+ "nauc_ndcg_at_5_std": -0.031516,
52
+ "nauc_ndcg_at_5_diff1": 0.114488,
53
+ "nauc_ndcg_at_10_max": -0.279164,
54
+ "nauc_ndcg_at_10_std": -0.029658,
55
+ "nauc_ndcg_at_10_diff1": 0.110381,
56
+ "nauc_ndcg_at_20_max": -0.287885,
57
+ "nauc_ndcg_at_20_std": -0.022378,
58
+ "nauc_ndcg_at_20_diff1": 0.108318,
59
+ "nauc_ndcg_at_100_max": -0.225422,
60
+ "nauc_ndcg_at_100_std": -0.017316,
61
+ "nauc_ndcg_at_100_diff1": 0.105675,
62
+ "nauc_ndcg_at_1000_max": -0.214328,
63
+ "nauc_ndcg_at_1000_std": -0.018515,
64
+ "nauc_ndcg_at_1000_diff1": 0.105085,
65
+ "nauc_map_at_1_max": -0.173303,
66
+ "nauc_map_at_1_std": -0.025778,
67
+ "nauc_map_at_1_diff1": 0.113225,
68
+ "nauc_map_at_3_max": -0.225168,
69
+ "nauc_map_at_3_std": -0.036324,
70
+ "nauc_map_at_3_diff1": 0.121841,
71
+ "nauc_map_at_5_max": -0.241531,
72
+ "nauc_map_at_5_std": -0.035501,
73
+ "nauc_map_at_5_diff1": 0.119487,
74
+ "nauc_map_at_10_max": -0.25586,
75
+ "nauc_map_at_10_std": -0.033091,
76
+ "nauc_map_at_10_diff1": 0.116565,
77
+ "nauc_map_at_20_max": -0.256827,
78
+ "nauc_map_at_20_std": -0.029715,
79
+ "nauc_map_at_20_diff1": 0.115319,
80
+ "nauc_map_at_100_max": -0.243816,
81
+ "nauc_map_at_100_std": -0.027839,
82
+ "nauc_map_at_100_diff1": 0.114594,
83
+ "nauc_map_at_1000_max": -0.242651,
84
+ "nauc_map_at_1000_std": -0.027962,
85
+ "nauc_map_at_1000_diff1": 0.114543,
86
+ "nauc_recall_at_1_max": -0.173303,
87
+ "nauc_recall_at_1_std": -0.025778,
88
+ "nauc_recall_at_1_diff1": 0.113225,
89
+ "nauc_recall_at_3_max": -0.252014,
90
+ "nauc_recall_at_3_std": -0.042615,
91
+ "nauc_recall_at_3_diff1": 0.118464,
92
+ "nauc_recall_at_5_max": -0.294924,
93
+ "nauc_recall_at_5_std": -0.042843,
94
+ "nauc_recall_at_5_diff1": 0.111123,
95
+ "nauc_recall_at_10_max": -0.377514,
96
+ "nauc_recall_at_10_std": -0.041452,
97
+ "nauc_recall_at_10_diff1": 0.100685,
98
+ "nauc_recall_at_20_max": -0.468779,
99
+ "nauc_recall_at_20_std": -0.025678,
100
+ "nauc_recall_at_20_diff1": 0.095301,
101
+ "nauc_recall_at_100_max": -0.788366,
102
+ "nauc_recall_at_100_std": 0.06123,
103
+ "nauc_recall_at_100_diff1": 0.107095,
104
+ "nauc_recall_at_1000_max": -0.523156,
105
+ "nauc_recall_at_1000_std": -0.097666,
106
+ "nauc_recall_at_1000_diff1": 0.462278,
107
+ "nauc_precision_at_1_max": -0.104201,
108
+ "nauc_precision_at_1_std": 0.001288,
109
+ "nauc_precision_at_1_diff1": 0.095348,
110
+ "nauc_precision_at_3_max": -0.170536,
111
+ "nauc_precision_at_3_std": -0.007155,
112
+ "nauc_precision_at_3_diff1": 0.095646,
113
+ "nauc_precision_at_5_max": -0.186608,
114
+ "nauc_precision_at_5_std": 0.001505,
115
+ "nauc_precision_at_5_diff1": 0.081227,
116
+ "nauc_precision_at_10_max": -0.162808,
117
+ "nauc_precision_at_10_std": 0.02156,
118
+ "nauc_precision_at_10_diff1": 0.045084,
119
+ "nauc_precision_at_20_max": -0.051641,
120
+ "nauc_precision_at_20_std": 0.049477,
121
+ "nauc_precision_at_20_diff1": 0.002871,
122
+ "nauc_precision_at_100_max": 0.248386,
123
+ "nauc_precision_at_100_std": 0.049607,
124
+ "nauc_precision_at_100_diff1": -0.0518,
125
+ "nauc_precision_at_1000_max": 0.276931,
126
+ "nauc_precision_at_1000_std": 0.043409,
127
+ "nauc_precision_at_1000_diff1": -0.054216,
128
+ "nauc_mrr_at_1_max": -0.104201,
129
+ "nauc_mrr_at_1_std": 0.001288,
130
+ "nauc_mrr_at_1_diff1": 0.095348,
131
+ "nauc_mrr_at_3_max": -0.152842,
132
+ "nauc_mrr_at_3_std": -0.010081,
133
+ "nauc_mrr_at_3_diff1": 0.10128,
134
+ "nauc_mrr_at_5_max": -0.16673,
135
+ "nauc_mrr_at_5_std": -0.010453,
136
+ "nauc_mrr_at_5_diff1": 0.100154,
137
+ "nauc_mrr_at_10_max": -0.176941,
138
+ "nauc_mrr_at_10_std": -0.010261,
139
+ "nauc_mrr_at_10_diff1": 0.099085,
140
+ "nauc_mrr_at_20_max": -0.177121,
141
+ "nauc_mrr_at_20_std": -0.009398,
142
+ "nauc_mrr_at_20_diff1": 0.099188,
143
+ "nauc_mrr_at_100_max": -0.173364,
144
+ "nauc_mrr_at_100_std": -0.009322,
145
+ "nauc_mrr_at_100_diff1": 0.099437,
146
+ "nauc_mrr_at_1000_max": -0.173261,
147
+ "nauc_mrr_at_1000_std": -0.009337,
148
+ "nauc_mrr_at_1000_diff1": 0.099435,
149
+ "hit_rate_at_1": 0.12598,
150
+ "hit_rate_at_3": 0.32039,
151
+ "hit_rate_at_5": 0.4515,
152
+ "hit_rate_at_10": 0.6489,
153
+ "hit_rate_at_20": 0.82237,
154
+ "hit_rate_at_100": 0.99467,
155
+ "hit_rate_at_1000": 1.0,
156
+ "max_over_subqueries_ndcg_at_1": 0.15657,
157
+ "max_over_subqueries_ndcg_at_3": 0.25225,
158
+ "max_over_subqueries_ndcg_at_5": 0.30198,
159
+ "max_over_subqueries_ndcg_at_10": 0.36553,
160
+ "max_over_subqueries_ndcg_at_20": 0.41292,
161
+ "max_over_subqueries_ndcg_at_100": 0.45995,
162
+ "max_over_subqueries_ndcg_at_1000": 0.46184,
163
+ "max_over_subqueries_map_at_1": 0.12983,
164
+ "max_over_subqueries_map_at_3": 0.21415,
165
+ "max_over_subqueries_map_at_5": 0.24316,
166
+ "max_over_subqueries_map_at_10": 0.27155,
167
+ "max_over_subqueries_map_at_20": 0.28685,
168
+ "max_over_subqueries_map_at_100": 0.29655,
169
+ "max_over_subqueries_map_at_1000": 0.29675,
170
+ "max_over_subqueries_recall_at_1": 0.12983,
171
+ "max_over_subqueries_recall_at_3": 0.31863,
172
+ "max_over_subqueries_recall_at_5": 0.43604,
173
+ "max_over_subqueries_recall_at_10": 0.61794,
174
+ "max_over_subqueries_recall_at_20": 0.78406,
175
+ "max_over_subqueries_recall_at_100": 0.98996,
176
+ "max_over_subqueries_recall_at_1000": 0.99999,
177
+ "max_over_subqueries_accuracy": 0.12983,
178
+ "max_over_subqueries_precision_at_1": 0.15657,
179
+ "max_over_subqueries_precision_at_3": 0.12987,
180
+ "max_over_subqueries_precision_at_5": 0.10928,
181
+ "max_over_subqueries_precision_at_10": 0.08076,
182
+ "max_over_subqueries_precision_at_20": 0.05395,
183
+ "max_over_subqueries_precision_at_100": 0.01494,
184
+ "max_over_subqueries_precision_at_1000": 0.00152,
185
+ "max_over_subqueries_mrr_at_1_max": -0.083322,
186
+ "max_over_subqueries_mrr_at_1_std": 0.004621,
187
+ "max_over_subqueries_mrr_at_1_diff1": 0.129126,
188
+ "max_over_subqueries_mrr_at_3_max": -0.165724,
189
+ "max_over_subqueries_mrr_at_3_std": -0.036559,
190
+ "max_over_subqueries_mrr_at_3_diff1": 0.108393,
191
+ "max_over_subqueries_mrr_at_5_max": -0.174057,
192
+ "max_over_subqueries_mrr_at_5_std": -0.031077,
193
+ "max_over_subqueries_mrr_at_5_diff1": 0.083845,
194
+ "max_over_subqueries_mrr_at_10_max": -0.128178,
195
+ "max_over_subqueries_mrr_at_10_std": -0.003154,
196
+ "max_over_subqueries_mrr_at_10_diff1": 0.045313,
197
+ "max_over_subqueries_mrr_at_20_max": 0.026749,
198
+ "max_over_subqueries_mrr_at_20_std": 0.070114,
199
+ "max_over_subqueries_mrr_at_20_diff1": 0.007007,
200
+ "max_over_subqueries_mrr_at_100_max": 0.37139,
201
+ "max_over_subqueries_mrr_at_100_std": 0.179291,
202
+ "max_over_subqueries_mrr_at_100_diff1": -0.035877,
203
+ "max_over_subqueries_mrr_at_1000_max": 0.394305,
204
+ "max_over_subqueries_mrr_at_1000_std": 0.177673,
205
+ "max_over_subqueries_mrr_at_1000_diff1": -0.037114,
206
+ "max_over_subqueries_mrr_at_1": 0.156573,
207
+ "max_over_subqueries_mrr_at_3": 0.251198,
208
+ "max_over_subqueries_mrr_at_5": 0.281036,
209
+ "max_over_subqueries_mrr_at_10": 0.306447,
210
+ "max_over_subqueries_mrr_at_20": 0.317359,
211
+ "max_over_subqueries_mrr_at_100": 0.321859,
212
+ "max_over_subqueries_mrr_at_1000": 0.321894,
213
+ "max_over_subqueries_nauc_mrr_at_1_max": -0.083322,
214
+ "max_over_subqueries_nauc_mrr_at_1_std": 0.004621,
215
+ "max_over_subqueries_nauc_mrr_at_1_diff1": 0.129126,
216
+ "max_over_subqueries_nauc_mrr_at_3_max": -0.14095,
217
+ "max_over_subqueries_nauc_mrr_at_3_std": -0.026159,
218
+ "max_over_subqueries_nauc_mrr_at_3_diff1": 0.122392,
219
+ "max_over_subqueries_nauc_mrr_at_5_max": -0.15343,
220
+ "max_over_subqueries_nauc_mrr_at_5_std": -0.02986,
221
+ "max_over_subqueries_nauc_mrr_at_5_diff1": 0.118007,
222
+ "max_over_subqueries_nauc_mrr_at_10_max": -0.160839,
223
+ "max_over_subqueries_nauc_mrr_at_10_std": -0.031598,
224
+ "max_over_subqueries_nauc_mrr_at_10_diff1": 0.117091,
225
+ "max_over_subqueries_nauc_mrr_at_20_max": -0.15927,
226
+ "max_over_subqueries_nauc_mrr_at_20_std": -0.02991,
227
+ "max_over_subqueries_nauc_mrr_at_20_diff1": 0.118052,
228
+ "max_over_subqueries_nauc_mrr_at_100_max": -0.156134,
229
+ "max_over_subqueries_nauc_mrr_at_100_std": -0.028925,
230
+ "max_over_subqueries_nauc_mrr_at_100_diff1": 0.118473,
231
+ "max_over_subqueries_nauc_mrr_at_1000_max": -0.156066,
232
+ "max_over_subqueries_nauc_mrr_at_1000_std": -0.028929,
233
+ "max_over_subqueries_nauc_mrr_at_1000_diff1": 0.11848,
234
+ "max_over_subqueries_hit_rate_at_1": 0.15657,
235
+ "max_over_subqueries_hit_rate_at_3": 0.37454,
236
+ "max_over_subqueries_hit_rate_at_5": 0.50564,
237
+ "max_over_subqueries_hit_rate_at_10": 0.69456,
238
+ "max_over_subqueries_hit_rate_at_20": 0.84867,
239
+ "max_over_subqueries_hit_rate_at_100": 0.9956,
240
+ "max_over_subqueries_hit_rate_at_1000": 0.99999,
241
+ "main_score": 0.29675,
242
+ "hf_subset": "default",
243
+ "languages": [
244
+ "eng-Latn"
245
+ ]
246
+ }
247
+ ]
248
+ },
249
+ "evaluation_time": 3376.5412323474884,
250
+ "kg_co2_emissions": null,
251
+ "date": null
252
+ }
results/NFCorpus.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "ec0fa4fe99da2ff19ca1214b7966684033a58814",
3
+ "task_name": "NFCorpus",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.33901,
9
+ "ndcg_at_3": 0.28783,
10
+ "ndcg_at_5": 0.27909,
11
+ "ndcg_at_10": 0.25542,
12
+ "ndcg_at_20": 0.23886,
13
+ "ndcg_at_100": 0.23706,
14
+ "ndcg_at_1000": 0.32831,
15
+ "map_at_1": 0.03476,
16
+ "map_at_3": 0.05512,
17
+ "map_at_5": 0.06771,
18
+ "map_at_10": 0.08141,
19
+ "map_at_20": 0.09025,
20
+ "map_at_100": 0.1047,
21
+ "map_at_1000": 0.11766,
22
+ "recall_at_1": 0.03476,
23
+ "recall_at_3": 0.06453,
24
+ "recall_at_5": 0.09089,
25
+ "recall_at_10": 0.11996,
26
+ "recall_at_20": 0.15138,
27
+ "recall_at_100": 0.24539,
28
+ "recall_at_1000": 0.57056,
29
+ "accuracy": 0.03476,
30
+ "precision_at_1": 0.35294,
31
+ "precision_at_3": 0.26729,
32
+ "precision_at_5": 0.24396,
33
+ "precision_at_10": 0.19412,
34
+ "precision_at_20": 0.14768,
35
+ "precision_at_100": 0.06731,
36
+ "precision_at_1000": 0.01966,
37
+ "mrr_at_1": 0.352941,
38
+ "mrr_at_3": 0.411765,
39
+ "mrr_at_5": 0.429876,
40
+ "mrr_at_10": 0.436587,
41
+ "mrr_at_20": 0.441403,
42
+ "mrr_at_100": 0.444226,
43
+ "mrr_at_1000": 0.444762,
44
+ "nauc_ndcg_at_1_max": 0.473018,
45
+ "nauc_ndcg_at_1_std": 0.23826,
46
+ "nauc_ndcg_at_1_diff1": 0.363702,
47
+ "nauc_ndcg_at_3_max": 0.446305,
48
+ "nauc_ndcg_at_3_std": 0.272751,
49
+ "nauc_ndcg_at_3_diff1": 0.237897,
50
+ "nauc_ndcg_at_5_max": 0.451779,
51
+ "nauc_ndcg_at_5_std": 0.29229,
52
+ "nauc_ndcg_at_5_diff1": 0.221606,
53
+ "nauc_ndcg_at_10_max": 0.455615,
54
+ "nauc_ndcg_at_10_std": 0.307616,
55
+ "nauc_ndcg_at_10_diff1": 0.212164,
56
+ "nauc_ndcg_at_20_max": 0.447208,
57
+ "nauc_ndcg_at_20_std": 0.312728,
58
+ "nauc_ndcg_at_20_diff1": 0.230049,
59
+ "nauc_ndcg_at_100_max": 0.456691,
60
+ "nauc_ndcg_at_100_std": 0.341588,
61
+ "nauc_ndcg_at_100_diff1": 0.25517,
62
+ "nauc_ndcg_at_1000_max": 0.497793,
63
+ "nauc_ndcg_at_1000_std": 0.399852,
64
+ "nauc_ndcg_at_1000_diff1": 0.258261,
65
+ "nauc_map_at_1_max": 0.135152,
66
+ "nauc_map_at_1_std": -0.107361,
67
+ "nauc_map_at_1_diff1": 0.508121,
68
+ "nauc_map_at_3_max": 0.158681,
69
+ "nauc_map_at_3_std": -0.043871,
70
+ "nauc_map_at_3_diff1": 0.361052,
71
+ "nauc_map_at_5_max": 0.197573,
72
+ "nauc_map_at_5_std": -0.005074,
73
+ "nauc_map_at_5_diff1": 0.331139,
74
+ "nauc_map_at_10_max": 0.249288,
75
+ "nauc_map_at_10_std": 0.041982,
76
+ "nauc_map_at_10_diff1": 0.302192,
77
+ "nauc_map_at_20_max": 0.274278,
78
+ "nauc_map_at_20_std": 0.07899,
79
+ "nauc_map_at_20_diff1": 0.294492,
80
+ "nauc_map_at_100_max": 0.322589,
81
+ "nauc_map_at_100_std": 0.157334,
82
+ "nauc_map_at_100_diff1": 0.275475,
83
+ "nauc_map_at_1000_max": 0.346032,
84
+ "nauc_map_at_1000_std": 0.200471,
85
+ "nauc_map_at_1000_diff1": 0.266516,
86
+ "nauc_recall_at_1_max": 0.135152,
87
+ "nauc_recall_at_1_std": -0.107361,
88
+ "nauc_recall_at_1_diff1": 0.508121,
89
+ "nauc_recall_at_3_max": 0.118962,
90
+ "nauc_recall_at_3_std": -0.030595,
91
+ "nauc_recall_at_3_diff1": 0.259935,
92
+ "nauc_recall_at_5_max": 0.17374,
93
+ "nauc_recall_at_5_std": 0.022343,
94
+ "nauc_recall_at_5_diff1": 0.224317,
95
+ "nauc_recall_at_10_max": 0.227353,
96
+ "nauc_recall_at_10_std": 0.075131,
97
+ "nauc_recall_at_10_diff1": 0.200031,
98
+ "nauc_recall_at_20_max": 0.246794,
99
+ "nauc_recall_at_20_std": 0.114776,
100
+ "nauc_recall_at_20_diff1": 0.200156,
101
+ "nauc_recall_at_100_max": 0.295081,
102
+ "nauc_recall_at_100_std": 0.264133,
103
+ "nauc_recall_at_100_diff1": 0.173237,
104
+ "nauc_recall_at_1000_max": 0.269491,
105
+ "nauc_recall_at_1000_std": 0.296777,
106
+ "nauc_recall_at_1000_diff1": 0.09206,
107
+ "nauc_precision_at_1_max": 0.48847,
108
+ "nauc_precision_at_1_std": 0.242548,
109
+ "nauc_precision_at_1_diff1": 0.387197,
110
+ "nauc_precision_at_3_max": 0.451814,
111
+ "nauc_precision_at_3_std": 0.312406,
112
+ "nauc_precision_at_3_diff1": 0.176404,
113
+ "nauc_precision_at_5_max": 0.459187,
114
+ "nauc_precision_at_5_std": 0.356408,
115
+ "nauc_precision_at_5_diff1": 0.131303,
116
+ "nauc_precision_at_10_max": 0.463467,
117
+ "nauc_precision_at_10_std": 0.39319,
118
+ "nauc_precision_at_10_diff1": 0.097895,
119
+ "nauc_precision_at_20_max": 0.448171,
120
+ "nauc_precision_at_20_std": 0.440506,
121
+ "nauc_precision_at_20_diff1": 0.069694,
122
+ "nauc_precision_at_100_max": 0.415382,
123
+ "nauc_precision_at_100_std": 0.508307,
124
+ "nauc_precision_at_100_diff1": 0.010946,
125
+ "nauc_precision_at_1000_max": 0.321652,
126
+ "nauc_precision_at_1000_std": 0.414879,
127
+ "nauc_precision_at_1000_diff1": 0.024123,
128
+ "nauc_mrr_at_1_max": 0.48847,
129
+ "nauc_mrr_at_1_std": 0.242548,
130
+ "nauc_mrr_at_1_diff1": 0.387197,
131
+ "nauc_mrr_at_3_max": 0.49905,
132
+ "nauc_mrr_at_3_std": 0.283516,
133
+ "nauc_mrr_at_3_diff1": 0.340901,
134
+ "nauc_mrr_at_5_max": 0.512803,
135
+ "nauc_mrr_at_5_std": 0.298777,
136
+ "nauc_mrr_at_5_diff1": 0.35082,
137
+ "nauc_mrr_at_10_max": 0.511988,
138
+ "nauc_mrr_at_10_std": 0.299783,
139
+ "nauc_mrr_at_10_diff1": 0.353306,
140
+ "nauc_mrr_at_20_max": 0.514002,
141
+ "nauc_mrr_at_20_std": 0.298811,
142
+ "nauc_mrr_at_20_diff1": 0.35712,
143
+ "nauc_mrr_at_100_max": 0.515489,
144
+ "nauc_mrr_at_100_std": 0.301856,
145
+ "nauc_mrr_at_100_diff1": 0.356113,
146
+ "nauc_mrr_at_1000_max": 0.515166,
147
+ "nauc_mrr_at_1000_std": 0.301406,
148
+ "nauc_mrr_at_1000_diff1": 0.356091,
149
+ "hit_rate_at_1": 0.35294,
150
+ "hit_rate_at_3": 0.48297,
151
+ "hit_rate_at_5": 0.56347,
152
+ "hit_rate_at_10": 0.613,
153
+ "hit_rate_at_20": 0.68421,
154
+ "hit_rate_at_100": 0.80186,
155
+ "hit_rate_at_1000": 0.94427,
156
+ "main_score": 0.25542,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 6.685624837875366,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }
results/NQ.json ADDED
@@ -0,0 +1,167 @@
1
+ {
2
+ "dataset_revision": "b774495ed302d8c44a3a7ea25c90dbce03968f31",
3
+ "task_name": "NQ",
4
+ "mteb_version": "2.10.7",
5
+ "scores": {
6
+ "test": [
7
+ {
8
+ "ndcg_at_1": 0.21118,
9
+ "ndcg_at_3": 0.27942,
10
+ "ndcg_at_5": 0.31026,
11
+ "ndcg_at_10": 0.3393,
12
+ "ndcg_at_20": 0.36381,
13
+ "ndcg_at_100": 0.39738,
14
+ "ndcg_at_1000": 0.41611,
15
+ "map_at_1": 0.18451,
16
+ "map_at_3": 0.25175,
17
+ "map_at_5": 0.26956,
18
+ "map_at_10": 0.28255,
19
+ "map_at_20": 0.28999,
20
+ "map_at_100": 0.29503,
21
+ "map_at_1000": 0.29575,
22
+ "recall_at_1": 0.18451,
23
+ "recall_at_3": 0.33157,
24
+ "recall_at_5": 0.40346,
25
+ "recall_at_10": 0.48795,
26
+ "recall_at_20": 0.57918,
27
+ "recall_at_100": 0.75135,
28
+ "recall_at_1000": 0.89424,
29
+ "accuracy": 0.18451,
30
+ "precision_at_1": 0.21118,
31
+ "precision_at_3": 0.12852,
32
+ "precision_at_5": 0.09421,
33
+ "precision_at_10": 0.05771,
34
+ "precision_at_20": 0.03457,
35
+ "precision_at_100": 0.00906,
36
+ "precision_at_1000": 0.00108,
37
+ "mrr_at_1": 0.211182,
38
+ "mrr_at_3": 0.280562,
39
+ "mrr_at_5": 0.297451,
40
+ "mrr_at_10": 0.308598,
41
+ "mrr_at_20": 0.314703,
42
+ "mrr_at_100": 0.318792,
43
+ "mrr_at_1000": 0.319378,
44
+ "nauc_ndcg_at_1_max": 0.227658,
45
+ "nauc_ndcg_at_1_std": 0.042089,
46
+ "nauc_ndcg_at_1_diff1": 0.334397,
47
+ "nauc_ndcg_at_3_max": 0.233831,
48
+ "nauc_ndcg_at_3_std": 0.047104,
49
+ "nauc_ndcg_at_3_diff1": 0.280007,
50
+ "nauc_ndcg_at_5_max": 0.247242,
51
+ "nauc_ndcg_at_5_std": 0.05021,
52
+ "nauc_ndcg_at_5_diff1": 0.279189,
53
+ "nauc_ndcg_at_10_max": 0.261999,
54
+ "nauc_ndcg_at_10_std": 0.06917,
55
+ "nauc_ndcg_at_10_diff1": 0.270863,
56
+ "nauc_ndcg_at_20_max": 0.273322,
57
+ "nauc_ndcg_at_20_std": 0.0828,
58
+ "nauc_ndcg_at_20_diff1": 0.270226,
59
+ "nauc_ndcg_at_100_max": 0.281507,
60
+ "nauc_ndcg_at_100_std": 0.102193,
61
+ "nauc_ndcg_at_100_diff1": 0.268713,
62
+ "nauc_ndcg_at_1000_max": 0.278689,
63
+ "nauc_ndcg_at_1000_std": 0.098306,
64
+ "nauc_ndcg_at_1000_diff1": 0.271378,
65
+ "nauc_map_at_1_max": 0.206711,
66
+ "nauc_map_at_1_std": 0.02038,
67
+ "nauc_map_at_1_diff1": 0.335062,
68
+ "nauc_map_at_3_max": 0.226581,
69
+ "nauc_map_at_3_std": 0.037068,
70
+ "nauc_map_at_3_diff1": 0.291937,
71
+ "nauc_map_at_5_max": 0.235828,
72
+ "nauc_map_at_5_std": 0.039344,
73
+ "nauc_map_at_5_diff1": 0.291742,
74
+ "nauc_map_at_10_max": 0.243334,
75
+ "nauc_map_at_10_std": 0.04915,
76
+ "nauc_map_at_10_diff1": 0.287985,
77
+ "nauc_map_at_20_max": 0.247297,
78
+ "nauc_map_at_20_std": 0.053609,
79
+ "nauc_map_at_20_diff1": 0.287867,
80
+ "nauc_map_at_100_max": 0.24853,
81
+ "nauc_map_at_100_std": 0.056405,
82
+ "nauc_map_at_100_diff1": 0.287758,
83
+ "nauc_map_at_1000_max": 0.248418,
84
+ "nauc_map_at_1000_std": 0.056289,
85
+ "nauc_map_at_1000_diff1": 0.287838,
86
+ "nauc_recall_at_1_max": 0.206711,
87
+ "nauc_recall_at_1_std": 0.02038,
88
+ "nauc_recall_at_1_diff1": 0.335062,
89
+ "nauc_recall_at_3_max": 0.228506,
90
+ "nauc_recall_at_3_std": 0.05318,
91
+ "nauc_recall_at_3_diff1": 0.236679,
92
+ "nauc_recall_at_5_max": 0.256474,
93
+ "nauc_recall_at_5_std": 0.058588,
94
+ "nauc_recall_at_5_diff1": 0.236771,
95
+ "nauc_recall_at_10_max": 0.294287,
96
+ "nauc_recall_at_10_std": 0.106912,
97
+ "nauc_recall_at_10_diff1": 0.213575,
98
+ "nauc_recall_at_20_max": 0.337796,
99
+ "nauc_recall_at_20_std": 0.157417,
100
+ "nauc_recall_at_20_diff1": 0.210217,
101
+ "nauc_recall_at_100_max": 0.42019,
102
+ "nauc_recall_at_100_std": 0.318116,
103
+ "nauc_recall_at_100_diff1": 0.17753,
104
+ "nauc_recall_at_1000_max": 0.522955,
105
+ "nauc_recall_at_1000_std": 0.4669,
106
+ "nauc_recall_at_1000_diff1": 0.152906,
107
+ "nauc_precision_at_1_max": 0.227658,
108
+ "nauc_precision_at_1_std": 0.042089,
109
+ "nauc_precision_at_1_diff1": 0.334397,
110
+ "nauc_precision_at_3_max": 0.265897,
111
+ "nauc_precision_at_3_std": 0.081893,
112
+ "nauc_precision_at_3_diff1": 0.247919,
113
+ "nauc_precision_at_5_max": 0.288929,
114
+ "nauc_precision_at_5_std": 0.090039,
115
+ "nauc_precision_at_5_diff1": 0.240097,
116
+ "nauc_precision_at_10_max": 0.314827,
117
+ "nauc_precision_at_10_std": 0.145096,
118
+ "nauc_precision_at_10_diff1": 0.200927,
119
+ "nauc_precision_at_20_max": 0.331121,
120
+ "nauc_precision_at_20_std": 0.187749,
121
+ "nauc_precision_at_20_diff1": 0.171022,
122
+ "nauc_precision_at_100_max": 0.319739,
123
+ "nauc_precision_at_100_std": 0.275434,
124
+ "nauc_precision_at_100_diff1": 0.100659,
125
+ "nauc_precision_at_1000_max": 0.254236,
126
+ "nauc_precision_at_1000_std": 0.262596,
127
+ "nauc_precision_at_1000_diff1": 0.031093,
128
+ "nauc_mrr_at_1_max": 0.227658,
129
+ "nauc_mrr_at_1_std": 0.042089,
130
+ "nauc_mrr_at_1_diff1": 0.334397,
131
+ "nauc_mrr_at_3_max": 0.241555,
132
+ "nauc_mrr_at_3_std": 0.057749,
133
+ "nauc_mrr_at_3_diff1": 0.292952,
134
+ "nauc_mrr_at_5_max": 0.246783,
135
+ "nauc_mrr_at_5_std": 0.058835,
136
+ "nauc_mrr_at_5_diff1": 0.291366,
137
+ "nauc_mrr_at_10_max": 0.251387,
138
+ "nauc_mrr_at_10_std": 0.06434,
139
+ "nauc_mrr_at_10_diff1": 0.288275,
140
+ "nauc_mrr_at_20_max": 0.253639,
141
+ "nauc_mrr_at_20_std": 0.067249,
142
+ "nauc_mrr_at_20_diff1": 0.288323,
143
+ "nauc_mrr_at_100_max": 0.254222,
144
+ "nauc_mrr_at_100_std": 0.06871,
145
+ "nauc_mrr_at_100_diff1": 0.288576,
146
+ "nauc_mrr_at_1000_max": 0.254156,
147
+ "nauc_mrr_at_1000_std": 0.068587,
148
+ "nauc_mrr_at_1000_diff1": 0.288706,
149
+ "hit_rate_at_1": 0.21118,
150
+ "hit_rate_at_3": 0.36993,
151
+ "hit_rate_at_5": 0.4438,
152
+ "hit_rate_at_10": 0.52636,
153
+ "hit_rate_at_20": 0.61327,
154
+ "hit_rate_at_100": 0.77375,
155
+ "hit_rate_at_1000": 0.9073,
156
+ "main_score": 0.3393,
157
+ "hf_subset": "default",
158
+ "languages": [
159
+ "eng-Latn"
160
+ ]
161
+ }
162
+ ]
163
+ },
164
+ "evaluation_time": 2799.932240962982,
165
+ "kg_co2_emissions": null,
166
+ "date": null
167
+ }