Haon-Chen committed on
Commit
cc52c73
·
1 Parent(s): c590de1

mteb results

Files changed (1)
  1. README.md +1728 -0
README.md CHANGED
@@ -1,3 +1,1731 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
+ tags:
3
+ - mteb
4
+ - transformers
5
+ model-index:
6
+ - name: speed-embedding-7b-instruct
7
+ results:
8
+ - task:
9
+ type: Classification
10
+ dataset:
11
+ type: mteb/amazon_counterfactual
12
+ name: MTEB AmazonCounterfactualClassification (en)
13
+ config: en
14
+ split: test
15
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
16
+ metrics:
17
+ - type: accuracy
18
+ value: 76.67164179104478
19
+ - type: ap
20
+ value: 39.07181577576136
21
+ - type: f1
22
+ value: 70.25085237742982
23
+ - task:
24
+ type: Classification
25
+ dataset:
26
+ type: mteb/amazon_polarity
27
+ name: MTEB AmazonPolarityClassification
28
+ config: default
29
+ split: test
30
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
31
+ metrics:
32
+ - type: accuracy
33
+ value: 96.1775
34
+ - type: ap
35
+ value: 94.84308844303422
36
+ - type: f1
37
+ value: 96.17546959843244
38
+ - task:
39
+ type: Classification
40
+ dataset:
41
+ type: mteb/amazon_reviews_multi
42
+ name: MTEB AmazonReviewsClassification (en)
43
+ config: en
44
+ split: test
45
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
46
+ metrics:
47
+ - type: accuracy
48
+ value: 56.278000000000006
49
+ - type: f1
50
+ value: 55.45101875980304
51
+ - task:
52
+ type: Retrieval
53
+ dataset:
54
+ type: arguana
55
+ name: MTEB ArguAna
56
+ config: default
57
+ split: test
58
+ revision: None
59
+ metrics:
60
+ - type: ndcg_at_1
61
+ value: 33.642
62
+ - type: ndcg_at_3
63
+ value: 49.399
64
+ - type: ndcg_at_5
65
+ value: 54.108999999999995
66
+ - type: ndcg_at_10
67
+ value: 59.294999999999995
68
+ - type: ndcg_at_100
69
+ value: 62.015
70
+ - type: map_at_1
71
+ value: 33.642
72
+ - type: map_at_3
73
+ value: 45.507
74
+ - type: map_at_5
75
+ value: 48.1
76
+ - type: map_at_10
77
+ value: 50.248000000000005
78
+ - type: map_at_100
79
+ value: 50.954
80
+ - type: recall_at_1
81
+ value: 33.642
82
+ - type: recall_at_3
83
+ value: 60.669
84
+ - type: recall_at_5
85
+ value: 72.191
86
+ - type: recall_at_10
87
+ value: 88.193
88
+ - type: recall_at_100
89
+ value: 99.431
90
+ - type: precision_at_1
91
+ value: 33.642
92
+ - type: precision_at_3
93
+ value: 20.223
94
+ - type: precision_at_5
95
+ value: 14.438
96
+ - type: precision_at_10
97
+ value: 8.819
98
+ - type: precision_at_100
99
+ value: 0.9939999999999999
100
+ - type: mrr_at_1
101
+ value: 33.997
102
+ - type: mrr_at_3
103
+ value: 45.614
104
+ - type: mrr_at_5
105
+ value: 48.263
106
+ - type: mrr_at_10
107
+ value: 50.388999999999996
108
+ - type: mrr_at_100
109
+ value: 51.102000000000004
110
+ - task:
111
+ type: Clustering
112
+ dataset:
113
+ type: mteb/arxiv-clustering-p2p
114
+ name: MTEB ArxivClusteringP2P
115
+ config: default
116
+ split: test
117
+ revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
118
+ metrics:
119
+ - type: v_measure
120
+ value: 51.1249344529392
121
+ - task:
122
+ type: Clustering
123
+ dataset:
124
+ type: mteb/arxiv-clustering-s2s
125
+ name: MTEB ArxivClusteringS2S
126
+ config: default
127
+ split: test
128
+ revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
129
+ metrics:
130
+ - type: v_measure
131
+ value: 47.01575217563573
132
+ - task:
133
+ type: Reranking
134
+ dataset:
135
+ type: mteb/askubuntudupquestions-reranking
136
+ name: MTEB AskUbuntuDupQuestions
137
+ config: default
138
+ split: test
139
+ revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
140
+ metrics:
141
+ - type: map
142
+ value: 67.2259454062751
143
+ - type: mrr
144
+ value: 79.37508244294948
145
+ - task:
146
+ type: STS
147
+ dataset:
148
+ type: mteb/biosses-sts
149
+ name: MTEB BIOSSES
150
+ config: default
151
+ split: test
152
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
153
+ metrics:
154
+ - type: cos_sim_pearson
155
+ value: 89.5312396547344
156
+ - type: cos_sim_spearman
157
+ value: 87.1447567367366
158
+ - type: euclidean_pearson
159
+ value: 88.67110804544821
160
+ - type: euclidean_spearman
161
+ value: 87.1447567367366
162
+ - type: manhattan_pearson
163
+ value: 89.06983994154335
164
+ - type: manhattan_spearman
165
+ value: 87.59115245033443
166
+ - task:
167
+ type: Classification
168
+ dataset:
169
+ type: mteb/banking77
170
+ name: MTEB Banking77Classification
171
+ config: default
172
+ split: test
173
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
174
+ metrics:
175
+ - type: accuracy
176
+ value: 88.63636363636364
177
+ - type: f1
178
+ value: 88.58740097633193
179
+ - task:
180
+ type: Clustering
181
+ dataset:
182
+ type: mteb/biorxiv-clustering-p2p
183
+ name: MTEB BiorxivClusteringP2P
184
+ config: default
185
+ split: test
186
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
187
+ metrics:
188
+ - type: v_measure
189
+ value: 41.99753263006505
190
+ - task:
191
+ type: Clustering
192
+ dataset:
193
+ type: mteb/biorxiv-clustering-s2s
194
+ name: MTEB BiorxivClusteringS2S
195
+ config: default
196
+ split: test
197
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
198
+ metrics:
199
+ - type: v_measure
200
+ value: 39.623067884052666
201
+ - task:
202
+ type: Retrieval
203
+ dataset:
204
+ type: BeIR/cqadupstack
205
+ name: MTEB CQADupstackRetrieval
206
+ config: default
207
+ split: test
208
+ revision: None
209
+ metrics:
210
+ - type: ndcg_at_1
211
+ value: 30.904666666666664
212
+ - type: ndcg_at_3
213
+ value: 36.32808333333333
214
+ - type: ndcg_at_5
215
+ value: 38.767250000000004
216
+ - type: ndcg_at_10
217
+ value: 41.62008333333333
218
+ - type: ndcg_at_100
219
+ value: 47.118083333333324
220
+ - type: map_at_1
221
+ value: 25.7645
222
+ - type: map_at_3
223
+ value: 32.6235
224
+ - type: map_at_5
225
+ value: 34.347
226
+ - type: map_at_10
227
+ value: 35.79658333333333
228
+ - type: map_at_100
229
+ value: 37.10391666666666
230
+ - type: recall_at_1
231
+ value: 25.7645
232
+ - type: recall_at_3
233
+ value: 39.622666666666674
234
+ - type: recall_at_5
235
+ value: 45.938750000000006
236
+ - type: recall_at_10
237
+ value: 54.43816666666667
238
+ - type: recall_at_100
239
+ value: 78.66183333333333
240
+ - type: precision_at_1
241
+ value: 30.904666666666664
242
+ - type: precision_at_3
243
+ value: 17.099083333333333
244
+ - type: precision_at_5
245
+ value: 12.278416666666669
246
+ - type: precision_at_10
247
+ value: 7.573083333333335
248
+ - type: precision_at_100
249
+ value: 1.22275
250
+ - type: mrr_at_1
251
+ value: 30.904666666666664
252
+ - type: mrr_at_3
253
+ value: 37.458333333333336
254
+ - type: mrr_at_5
255
+ value: 38.97333333333333
256
+ - type: mrr_at_10
257
+ value: 40.10316666666666
258
+ - type: mrr_at_100
259
+ value: 41.004250000000006
260
+ - task:
261
+ type: Retrieval
262
+ dataset:
263
+ type: climate-fever
264
+ name: MTEB ClimateFEVER
265
+ config: default
266
+ split: test
267
+ revision: None
268
+ metrics:
269
+ - type: ndcg_at_1
270
+ value: 38.046
271
+ - type: ndcg_at_3
272
+ value: 31.842
273
+ - type: ndcg_at_5
274
+ value: 33.698
275
+ - type: ndcg_at_10
276
+ value: 37.765
277
+ - type: ndcg_at_100
278
+ value: 44.998
279
+ - type: map_at_1
280
+ value: 16.682
281
+ - type: map_at_3
282
+ value: 23.624000000000002
283
+ - type: map_at_5
284
+ value: 25.812
285
+ - type: map_at_10
286
+ value: 28.017999999999997
287
+ - type: map_at_100
288
+ value: 30.064999999999998
289
+ - type: recall_at_1
290
+ value: 16.682
291
+ - type: recall_at_3
292
+ value: 28.338
293
+ - type: recall_at_5
294
+ value: 34.486
295
+ - type: recall_at_10
296
+ value: 43.474000000000004
297
+ - type: recall_at_100
298
+ value: 67.984
299
+ - type: precision_at_1
300
+ value: 38.046
301
+ - type: precision_at_3
302
+ value: 23.779
303
+ - type: precision_at_5
304
+ value: 17.849999999999998
305
+ - type: precision_at_10
306
+ value: 11.642
307
+ - type: precision_at_100
308
+ value: 1.9429999999999998
309
+ - type: mrr_at_1
310
+ value: 38.046
311
+ - type: mrr_at_3
312
+ value: 46.764
313
+ - type: mrr_at_5
314
+ value: 48.722
315
+ - type: mrr_at_10
316
+ value: 49.976
317
+ - type: mrr_at_100
318
+ value: 50.693999999999996
319
+ - task:
320
+ type: Retrieval
321
+ dataset:
322
+ type: dbpedia-entity
323
+ name: MTEB DBPedia
324
+ config: default
325
+ split: test
326
+ revision: None
327
+ metrics:
328
+ - type: ndcg_at_1
329
+ value: 63.24999999999999
330
+ - type: ndcg_at_3
331
+ value: 54.005
332
+ - type: ndcg_at_5
333
+ value: 51.504000000000005
334
+ - type: ndcg_at_10
335
+ value: 49.738
336
+ - type: ndcg_at_100
337
+ value: 54.754000000000005
338
+ - type: map_at_1
339
+ value: 10.639
340
+ - type: map_at_3
341
+ value: 16.726
342
+ - type: map_at_5
343
+ value: 20.101
344
+ - type: map_at_10
345
+ value: 24.569
346
+ - type: map_at_100
347
+ value: 35.221999999999994
348
+ - type: recall_at_1
349
+ value: 10.639
350
+ - type: recall_at_3
351
+ value: 17.861
352
+ - type: recall_at_5
353
+ value: 22.642
354
+ - type: recall_at_10
355
+ value: 30.105999999999998
356
+ - type: recall_at_100
357
+ value: 60.92999999999999
358
+ - type: precision_at_1
359
+ value: 75.0
360
+ - type: precision_at_3
361
+ value: 58.083
362
+ - type: precision_at_5
363
+ value: 50.0
364
+ - type: precision_at_10
365
+ value: 40.35
366
+ - type: precision_at_100
367
+ value: 12.659999999999998
368
+ - type: mrr_at_1
369
+ value: 75.0
370
+ - type: mrr_at_3
371
+ value: 80.042
372
+ - type: mrr_at_5
373
+ value: 80.779
374
+ - type: mrr_at_10
375
+ value: 81.355
376
+ - type: mrr_at_100
377
+ value: 81.58
378
+ - task:
379
+ type: Classification
380
+ dataset:
381
+ type: mteb/emotion
382
+ name: MTEB EmotionClassification
383
+ config: default
384
+ split: test
385
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
386
+ metrics:
387
+ - type: accuracy
388
+ value: 51.025
389
+ - type: f1
390
+ value: 47.08253474922065
391
+ - task:
392
+ type: Retrieval
393
+ dataset:
394
+ type: fever
395
+ name: MTEB FEVER
396
+ config: default
397
+ split: test
398
+ revision: None
399
+ metrics:
400
+ - type: ndcg_at_1
401
+ value: 82.163
402
+ - type: ndcg_at_3
403
+ value: 86.835
404
+ - type: ndcg_at_5
405
+ value: 87.802
406
+ - type: ndcg_at_10
407
+ value: 88.529
408
+ - type: ndcg_at_100
409
+ value: 89.17
410
+ - type: map_at_1
411
+ value: 76.335
412
+ - type: map_at_3
413
+ value: 83.91499999999999
414
+ - type: map_at_5
415
+ value: 84.64500000000001
416
+ - type: map_at_10
417
+ value: 85.058
418
+ - type: map_at_100
419
+ value: 85.257
420
+ - type: recall_at_1
421
+ value: 76.335
422
+ - type: recall_at_3
423
+ value: 90.608
424
+ - type: recall_at_5
425
+ value: 93.098
426
+ - type: recall_at_10
427
+ value: 95.173
428
+ - type: recall_at_100
429
+ value: 97.59299999999999
430
+ - type: precision_at_1
431
+ value: 82.163
432
+ - type: precision_at_3
433
+ value: 33.257999999999996
434
+ - type: precision_at_5
435
+ value: 20.654
436
+ - type: precision_at_10
437
+ value: 10.674999999999999
438
+ - type: precision_at_100
439
+ value: 1.122
440
+ - type: mrr_at_1
441
+ value: 82.163
442
+ - type: mrr_at_3
443
+ value: 88.346
444
+ - type: mrr_at_5
445
+ value: 88.791
446
+ - type: mrr_at_10
447
+ value: 88.97699999999999
448
+ - type: mrr_at_100
449
+ value: 89.031
450
+ - task:
451
+ type: Retrieval
452
+ dataset:
453
+ type: fiqa
454
+ name: MTEB FiQA2018
455
+ config: default
456
+ split: test
457
+ revision: None
458
+ metrics:
459
+ - type: ndcg_at_1
460
+ value: 55.093
461
+ - type: ndcg_at_3
462
+ value: 52.481
463
+ - type: ndcg_at_5
464
+ value: 53.545
465
+ - type: ndcg_at_10
466
+ value: 56.053
467
+ - type: ndcg_at_100
468
+ value: 62.53999999999999
469
+ - type: map_at_1
470
+ value: 29.189999999999998
471
+ - type: map_at_3
472
+ value: 42.603
473
+ - type: map_at_5
474
+ value: 45.855000000000004
475
+ - type: map_at_10
476
+ value: 48.241
477
+ - type: map_at_100
478
+ value: 50.300999999999995
479
+ - type: recall_at_1
480
+ value: 29.189999999999998
481
+ - type: recall_at_3
482
+ value: 47.471999999999994
483
+ - type: recall_at_5
484
+ value: 54.384
485
+ - type: recall_at_10
486
+ value: 62.731
487
+ - type: recall_at_100
488
+ value: 86.02300000000001
489
+ - type: precision_at_1
490
+ value: 55.093
491
+ - type: precision_at_3
492
+ value: 34.979
493
+ - type: precision_at_5
494
+ value: 25.278
495
+ - type: precision_at_10
496
+ value: 15.231
497
+ - type: precision_at_100
498
+ value: 2.2190000000000003
499
+ - type: mrr_at_1
500
+ value: 55.093
501
+ - type: mrr_at_3
502
+ value: 61.317
503
+ - type: mrr_at_5
504
+ value: 62.358999999999995
505
+ - type: mrr_at_10
506
+ value: 63.165000000000006
507
+ - type: mrr_at_100
508
+ value: 63.81
509
+ - task:
510
+ type: Retrieval
511
+ dataset:
512
+ type: hotpotqa
513
+ name: MTEB HotpotQA
514
+ config: default
515
+ split: test
516
+ revision: None
517
+ metrics:
518
+ - type: ndcg_at_1
519
+ value: 78.866
520
+ - type: ndcg_at_3
521
+ value: 70.128
522
+ - type: ndcg_at_5
523
+ value: 73.017
524
+ - type: ndcg_at_10
525
+ value: 75.166
526
+ - type: ndcg_at_100
527
+ value: 77.97500000000001
528
+ - type: map_at_1
529
+ value: 39.433
530
+ - type: map_at_3
531
+ value: 64.165
532
+ - type: map_at_5
533
+ value: 66.503
534
+ - type: map_at_10
535
+ value: 67.822
536
+ - type: map_at_100
537
+ value: 68.675
538
+ - type: recall_at_1
539
+ value: 39.433
540
+ - type: recall_at_3
541
+ value: 69.03399999999999
542
+ - type: recall_at_5
543
+ value: 74.74
544
+ - type: recall_at_10
545
+ value: 80.108
546
+ - type: recall_at_100
547
+ value: 90.81700000000001
548
+ - type: precision_at_1
549
+ value: 78.866
550
+ - type: precision_at_3
551
+ value: 46.022999999999996
552
+ - type: precision_at_5
553
+ value: 29.896
554
+ - type: precision_at_10
555
+ value: 16.022
556
+ - type: precision_at_100
557
+ value: 1.8159999999999998
558
+ - type: mrr_at_1
559
+ value: 78.866
560
+ - type: mrr_at_3
561
+ value: 83.91
562
+ - type: mrr_at_5
563
+ value: 84.473
564
+ - type: mrr_at_10
565
+ value: 84.769
566
+ - type: mrr_at_100
567
+ value: 84.953
568
+ - task:
569
+ type: Classification
570
+ dataset:
571
+ type: mteb/imdb
572
+ name: MTEB ImdbClassification
573
+ config: default
574
+ split: test
575
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
576
+ metrics:
577
+ - type: accuracy
578
+ value: 94.87799999999999
579
+ - type: ap
580
+ value: 92.5831019543702
581
+ - type: f1
582
+ value: 94.87675087619891
583
+ - task:
584
+ type: Retrieval
585
+ dataset:
586
+ type: msmarco
587
+ name: MTEB MSMARCO
588
+ config: default
589
+ split: test
590
+ revision: None
591
+ metrics:
592
+ - type: ndcg_at_1
593
+ value: 23.195
594
+ - type: ndcg_at_3
595
+ value: 34.419
596
+ - type: ndcg_at_5
597
+ value: 38.665
598
+ - type: ndcg_at_10
599
+ value: 42.549
600
+ - type: ndcg_at_100
601
+ value: 48.256
602
+ - type: map_at_1
603
+ value: 22.508
604
+ - type: map_at_3
605
+ value: 31.346
606
+ - type: map_at_5
607
+ value: 33.73
608
+ - type: map_at_10
609
+ value: 35.365
610
+ - type: map_at_100
611
+ value: 36.568
612
+ - type: recall_at_1
613
+ value: 22.508
614
+ - type: recall_at_3
615
+ value: 42.63
616
+ - type: recall_at_5
617
+ value: 52.827999999999996
618
+ - type: recall_at_10
619
+ value: 64.645
620
+ - type: recall_at_100
621
+ value: 90.852
622
+ - type: precision_at_1
623
+ value: 23.195
624
+ - type: precision_at_3
625
+ value: 14.752
626
+ - type: precision_at_5
627
+ value: 11.0
628
+ - type: precision_at_10
629
+ value: 6.755
630
+ - type: precision_at_100
631
+ value: 0.96
632
+ - type: mrr_at_1
633
+ value: 23.195
634
+ - type: mrr_at_3
635
+ value: 32.042
636
+ - type: mrr_at_5
637
+ value: 34.388000000000005
638
+ - type: mrr_at_10
639
+ value: 35.974000000000004
640
+ - type: mrr_at_100
641
+ value: 37.114000000000004
642
+ - task:
643
+ type: Classification
644
+ dataset:
645
+ type: mteb/mtop_domain
646
+ name: MTEB MTOPDomainClassification (en)
647
+ config: en
648
+ split: test
649
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
650
+ metrics:
651
+ - type: accuracy
652
+ value: 95.84587323301413
653
+ - type: f1
654
+ value: 95.69948889844318
655
+ - task:
656
+ type: Classification
657
+ dataset:
658
+ type: mteb/mtop_intent
659
+ name: MTEB MTOPIntentClassification (en)
660
+ config: en
661
+ split: test
662
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
663
+ metrics:
664
+ - type: accuracy
665
+ value: 87.08162334701322
666
+ - type: f1
667
+ value: 72.237783326283
668
+ - task:
669
+ type: Classification
670
+ dataset:
671
+ type: mteb/amazon_massive_intent
672
+ name: MTEB MassiveIntentClassification (en)
673
+ config: en
674
+ split: test
675
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
676
+ metrics:
677
+ - type: accuracy
678
+ value: 80.19502353732346
679
+ - type: f1
680
+ value: 77.732184986995
681
+ - task:
682
+ type: Classification
683
+ dataset:
684
+ type: mteb/amazon_massive_scenario
685
+ name: MTEB MassiveScenarioClassification (en)
686
+ config: en
687
+ split: test
688
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
689
+ metrics:
690
+ - type: accuracy
691
+ value: 82.26630800268998
692
+ - type: f1
693
+ value: 82.12747916248556
694
+ - task:
695
+ type: Clustering
696
+ dataset:
697
+ type: mteb/medrxiv-clustering-p2p
698
+ name: MTEB MedrxivClusteringP2P
699
+ config: default
700
+ split: test
701
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
702
+ metrics:
703
+ - type: v_measure
704
+ value: 36.95240450167033
705
+ - task:
706
+ type: Clustering
707
+ dataset:
708
+ type: mteb/medrxiv-clustering-s2s
709
+ name: MTEB MedrxivClusteringS2S
710
+ config: default
711
+ split: test
712
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
713
+ metrics:
714
+ - type: v_measure
715
+ value: 36.27758530931266
716
+ - task:
717
+ type: Reranking
718
+ dataset:
719
+ type: mteb/mind_small
720
+ name: MTEB MindSmallReranking
721
+ config: default
722
+ split: test
723
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
724
+ metrics:
725
+ - type: map
726
+ value: 33.35707665482982
727
+ - type: mrr
728
+ value: 34.60987842278547
729
+ - task:
730
+ type: Retrieval
731
+ dataset:
732
+ type: nfcorpus
733
+ name: MTEB NFCorpus
734
+ config: default
735
+ split: test
736
+ revision: None
737
+ metrics:
738
+ - type: ndcg_at_1
739
+ value: 47.522999999999996
740
+ - type: ndcg_at_3
741
+ value: 44.489000000000004
742
+ - type: ndcg_at_5
743
+ value: 41.92
744
+ - type: ndcg_at_10
745
+ value: 38.738
746
+ - type: ndcg_at_100
747
+ value: 35.46
748
+ - type: map_at_1
749
+ value: 5.357
750
+ - type: map_at_3
751
+ value: 10.537
752
+ - type: map_at_5
753
+ value: 12.062000000000001
754
+ - type: map_at_10
755
+ value: 14.264
756
+ - type: map_at_100
757
+ value: 18.442
758
+ - type: recall_at_1
759
+ value: 5.357
760
+ - type: recall_at_3
761
+ value: 12.499
762
+ - type: recall_at_5
763
+ value: 14.809
764
+ - type: recall_at_10
765
+ value: 18.765
766
+ - type: recall_at_100
767
+ value: 36.779
768
+ - type: precision_at_1
769
+ value: 49.226
770
+ - type: precision_at_3
771
+ value: 41.899
772
+ - type: precision_at_5
773
+ value: 36.718
774
+ - type: precision_at_10
775
+ value: 29.287999999999997
776
+ - type: precision_at_100
777
+ value: 9.22
778
+ - type: mrr_at_1
779
+ value: 49.845
780
+ - type: mrr_at_3
781
+ value: 57.121
782
+ - type: mrr_at_5
783
+ value: 58.172999999999995
784
+ - type: mrr_at_10
785
+ value: 58.906000000000006
786
+ - type: mrr_at_100
787
+ value: 59.486000000000004
788
+ - task:
789
+ type: Retrieval
790
+ dataset:
791
+ type: nq
792
+ name: MTEB NQ
793
+ config: default
794
+ split: test
795
+ revision: None
796
+ metrics:
797
+ - type: ndcg_at_1
798
+ value: 42.815999999999995
799
+ - type: ndcg_at_3
800
+ value: 53.766999999999996
801
+ - type: ndcg_at_5
802
+ value: 57.957
803
+ - type: ndcg_at_10
804
+ value: 61.661
805
+ - type: ndcg_at_100
806
+ value: 65.218
807
+ - type: map_at_1
808
+ value: 38.364
809
+ - type: map_at_3
810
+ value: 49.782
811
+ - type: map_at_5
812
+ value: 52.319
813
+ - type: map_at_10
814
+ value: 54.07300000000001
815
+ - type: map_at_100
816
+ value: 54.983000000000004
817
+ - type: recall_at_1
818
+ value: 38.364
819
+ - type: recall_at_3
820
+ value: 61.744
821
+ - type: recall_at_5
822
+ value: 71.32300000000001
823
+ - type: recall_at_10
824
+ value: 82.015
825
+ - type: recall_at_100
826
+ value: 96.978
827
+ - type: precision_at_1
828
+ value: 42.815999999999995
829
+ - type: precision_at_3
830
+ value: 23.976
831
+ - type: precision_at_5
832
+ value: 16.866
833
+ - type: precision_at_10
834
+ value: 9.806
835
+ - type: precision_at_100
836
+ value: 1.1769999999999998
837
+ - type: mrr_at_1
838
+ value: 42.845
839
+ - type: mrr_at_3
840
+ value: 53.307
841
+ - type: mrr_at_5
842
+ value: 55.434000000000005
843
+ - type: mrr_at_10
844
+ value: 56.702
845
+ - type: mrr_at_100
846
+ value: 57.342000000000006
847
+ - task:
848
+ type: Retrieval
849
+ dataset:
850
+ type: quora
851
+ name: MTEB QuoraRetrieval
852
+ config: default
853
+ split: test
854
+ revision: None
855
+ metrics:
856
+ - type: ndcg_at_1
857
+ value: 82.46
858
+ - type: ndcg_at_3
859
+ value: 86.774
860
+ - type: ndcg_at_5
861
+ value: 88.256
862
+ - type: ndcg_at_10
863
+ value: 89.35
864
+ - type: ndcg_at_100
865
+ value: 90.46499999999999
866
+ - type: map_at_1
867
+ value: 71.562
868
+ - type: map_at_3
869
+ value: 82.948
870
+ - type: map_at_5
871
+ value: 84.786
872
+ - type: map_at_10
873
+ value: 85.82300000000001
874
+ - type: map_at_100
875
+ value: 86.453
876
+ - type: recall_at_1
877
+ value: 71.562
878
+ - type: recall_at_3
879
+ value: 88.51
880
+ - type: recall_at_5
881
+ value: 92.795
882
+ - type: recall_at_10
883
+ value: 95.998
884
+ - type: recall_at_100
885
+ value: 99.701
886
+ - type: precision_at_1
887
+ value: 82.46
888
+ - type: precision_at_3
889
+ value: 38.1
890
+ - type: precision_at_5
891
+ value: 24.990000000000002
892
+ - type: precision_at_10
893
+ value: 13.553999999999998
894
+ - type: precision_at_100
895
+ value: 1.539
896
+ - type: mrr_at_1
897
+ value: 82.43
898
+ - type: mrr_at_3
899
+ value: 87.653
900
+ - type: mrr_at_5
901
+ value: 88.26899999999999
902
+ - type: mrr_at_10
903
+ value: 88.505
904
+ - type: mrr_at_100
905
+ value: 88.601
906
+ - task:
907
+ type: Clustering
908
+ dataset:
909
+ type: mteb/reddit-clustering
910
+ name: MTEB RedditClustering
911
+ config: default
912
+ split: test
913
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
914
+ metrics:
915
+ - type: v_measure
916
+ value: 57.928338007609256
917
+ - task:
918
+ type: Clustering
919
+ dataset:
920
+ type: mteb/reddit-clustering-p2p
921
+ name: MTEB RedditClusteringP2P
922
+ config: default
923
+ split: test
924
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
925
+ metrics:
926
+ - type: v_measure
927
+ value: 65.28915417473826
928
+ - task:
929
+ type: Retrieval
930
+ dataset:
931
+ type: scidocs
932
+ name: MTEB SCIDOCS
933
+ config: default
934
+ split: test
935
+ revision: None
936
+ metrics:
937
+ - type: ndcg_at_1
938
+ value: 17.2
939
+ - type: ndcg_at_3
940
+ value: 15.856
941
+ - type: ndcg_at_5
942
+ value: 13.983
943
+ - type: ndcg_at_10
944
+ value: 16.628999999999998
945
+ - type: ndcg_at_100
946
+ value: 23.845
947
+ - type: map_at_1
948
+ value: 3.4750000000000005
949
+ - type: map_at_3
950
+ value: 6.905
951
+ - type: map_at_5
952
+ value: 8.254
953
+ - type: map_at_10
954
+ value: 9.474
955
+ - type: map_at_100
956
+ value: 11.242
957
+ - type: recall_at_1
958
+ value: 3.4750000000000005
959
+ - type: recall_at_3
960
+ value: 9.298
961
+ - type: recall_at_5
962
+ value: 12.817
963
+ - type: recall_at_10
964
+ value: 17.675
965
+ - type: recall_at_100
966
+ value: 38.678000000000004
967
+ - type: precision_at_1
968
+ value: 17.2
969
+ - type: precision_at_3
970
+ value: 15.299999999999999
971
+ - type: precision_at_5
972
+ value: 12.64
973
+ - type: precision_at_10
974
+ value: 8.72
975
+ - type: precision_at_100
976
+ value: 1.907
977
+ - type: mrr_at_1
978
+ value: 17.2
979
+ - type: mrr_at_3
980
+ value: 25.55
981
+ - type: mrr_at_5
982
+ value: 27.485
983
+ - type: mrr_at_10
984
+ value: 28.809
985
+ - type: mrr_at_100
986
+ value: 29.964000000000002
987
+ - task:
988
+ type: STS
989
+ dataset:
990
+ type: mteb/sickr-sts
991
+ name: MTEB SICK-R
992
+ config: default
993
+ split: test
994
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
995
+ metrics:
996
+ - type: cos_sim_pearson
997
+ value: 86.10434430387332
998
+ - type: cos_sim_spearman
999
+ value: 82.46041161692649
1000
+ - type: euclidean_pearson
1001
+ value: 83.4010092798136
1002
+ - type: euclidean_spearman
1003
+ value: 82.46040715308601
1004
+ - type: manhattan_pearson
1005
+ value: 83.6702316837156
1006
+ - type: manhattan_spearman
1007
+ value: 82.72271392303014
1008
+ - task:
1009
+ type: STS
1010
+ dataset:
1011
+ type: mteb/sts12-sts
1012
+ name: MTEB STS12
1013
+ config: default
1014
+ split: test
1015
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
1016
+ metrics:
1017
+ - type: cos_sim_pearson
1018
+ value: 87.3179771524676
1019
+ - type: cos_sim_spearman
1020
+ value: 80.15194914870666
1021
+ - type: euclidean_pearson
1022
+ value: 84.54005271342946
1023
+ - type: euclidean_spearman
1024
+ value: 80.15194914870666
1025
+ - type: manhattan_pearson
1026
+ value: 85.24410357734307
1027
+ - type: manhattan_spearman
1028
+ value: 80.78274673604562
1029
+ - task:
1030
+ type: STS
1031
+ dataset:
1032
+ type: mteb/sts13-sts
1033
+ name: MTEB STS13
1034
+ config: default
1035
+ split: test
1036
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1037
+ metrics:
1038
+ - type: cos_sim_pearson
1039
+ value: 89.2691354894402
1040
+ - type: cos_sim_spearman
1041
+ value: 89.94300436293618
1042
+ - type: euclidean_pearson
1043
+ value: 89.5600067781475
1044
+ - type: euclidean_spearman
1045
+ value: 89.942989691344
1046
+ - type: manhattan_pearson
1047
+ value: 89.80327997794308
1048
+ - type: manhattan_spearman
1049
+ value: 90.3964860275568
1050
+ - task:
1051
+ type: STS
1052
+ dataset:
1053
+ type: mteb/sts14-sts
1054
+ name: MTEB STS14
1055
+ config: default
1056
+ split: test
1057
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
1058
+ metrics:
1059
+ - type: cos_sim_pearson
1060
+ value: 87.68003396295498
1061
+ - type: cos_sim_spearman
1062
+ value: 86.23848649310362
1063
+ - type: euclidean_pearson
1064
+ value: 87.0702308813695
1065
+ - type: euclidean_spearman
1066
+ value: 86.23848649310362
1067
+ - type: manhattan_pearson
1068
+ value: 87.24495415360472
1069
+ - type: manhattan_spearman
1070
+ value: 86.58198464997109
1071
+ - task:
1072
+ type: STS
1073
+ dataset:
1074
+ type: mteb/sts15-sts
1075
+ name: MTEB STS15
1076
+ config: default
1077
+ split: test
1078
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
1079
+ metrics:
1080
+ - type: cos_sim_pearson
1081
+ value: 90.25643329096215
1082
+ - type: cos_sim_spearman
1083
+ value: 91.19520084590636
1084
+ - type: euclidean_pearson
1085
+ value: 90.68579446788728
1086
+ - type: euclidean_spearman
1087
+ value: 91.19519611831312
1088
+ - type: manhattan_pearson
1089
+ value: 90.83476867273104
1090
+ - type: manhattan_spearman
1091
+ value: 91.4569817842705
1092
+ - task:
1093
+ type: STS
1094
+ dataset:
1095
+ type: mteb/sts16-sts
1096
+ name: MTEB STS16
1097
+ config: default
1098
+ split: test
1099
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
1100
+ metrics:
1101
+ - type: cos_sim_pearson
1102
+ value: 86.41175694023282
1103
+ - type: cos_sim_spearman
1104
+ value: 88.18744495989392
1105
+ - type: euclidean_pearson
1106
+ value: 87.60085709987156
1107
+ - type: euclidean_spearman
1108
+ value: 88.18773792681107
1109
+ - type: manhattan_pearson
1110
+ value: 87.83199472909764
1111
+ - type: manhattan_spearman
1112
+ value: 88.45824161471776
1113
+ - task:
1114
+ type: STS
1115
+ dataset:
1116
+ type: mteb/sts17-crosslingual-sts
1117
+ name: MTEB STS17 (en-en)
1118
+ config: en-en
1119
+ split: test
1120
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
1121
+ metrics:
1122
+ - type: cos_sim_pearson
1123
+ value: 91.78311335565503
1124
+ - type: cos_sim_spearman
1125
+ value: 91.93416269793802
1126
+ - type: euclidean_pearson
1127
+ value: 91.84163160890154
1128
+ - type: euclidean_spearman
1129
+ value: 91.93416269793802
1130
+ - type: manhattan_pearson
1131
+ value: 91.77053255749301
1132
+ - type: manhattan_spearman
1133
+ value: 91.67392623286098
1134
+ - task:
1135
+ type: STS
1136
+ dataset:
1137
+ type: mteb/sts22-crosslingual-sts
1138
+ name: MTEB STS22 (en)
1139
+ config: en
1140
+ split: test
1141
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
1142
+ metrics:
1143
+ - type: cos_sim_pearson
1144
+ value: 68.2137857919086
1145
+ - type: cos_sim_spearman
1146
+ value: 68.31928639693375
1147
+ - type: euclidean_pearson
1148
+ value: 69.96072053688385
1149
+ - type: euclidean_spearman
1150
+ value: 68.31928639693375
1151
+ - type: manhattan_pearson
1152
+ value: 70.47736299273389
1153
+ - type: manhattan_spearman
1154
+ value: 68.72439259356818
1155
+ - task:
1156
+ type: STS
1157
+ dataset:
1158
+ type: mteb/stsbenchmark-sts
1159
+ name: MTEB STSBenchmark
1160
+ config: default
1161
+ split: test
1162
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
1163
+ metrics:
1164
+ - type: cos_sim_pearson
1165
+ value: 88.16092476703817
1166
+ - type: cos_sim_spearman
1167
+ value: 89.20507562822989
1168
+ - type: euclidean_pearson
1169
+ value: 88.91358225424611
1170
+ - type: euclidean_spearman
1171
+ value: 89.20505548241839
1172
+ - type: manhattan_pearson
1173
+ value: 88.98787306839809
1174
+ - type: manhattan_spearman
1175
+ value: 89.37338458483269
1176
+ - task:
1177
+ type: Reranking
1178
+ dataset:
1179
+ type: mteb/scidocs-reranking
1180
+ name: MTEB SciDocsRR
1181
+ config: default
1182
+ split: test
1183
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
1184
+ metrics:
1185
+ - type: map
1186
+ value: 87.29108971888714
1187
+ - type: mrr
1188
+ value: 96.62042024787124
1189
+ - task:
1190
+ type: Retrieval
1191
+ dataset:
1192
+ type: scifact
1193
+ name: MTEB SciFact
1194
+ config: default
1195
+ split: test
1196
+ revision: None
1197
+ metrics:
1198
+ - type: ndcg_at_1
1199
+ value: 63.333
1200
+ - type: ndcg_at_3
1201
+ value: 72.768
1202
+ - type: ndcg_at_5
1203
+ value: 75.124
1204
+ - type: ndcg_at_10
1205
+ value: 77.178
1206
+ - type: ndcg_at_100
1207
+ value: 78.769
1208
+ - type: map_at_1
1209
+ value: 60.9
1210
+ - type: map_at_3
1211
+ value: 69.69999999999999
1212
+ - type: map_at_5
1213
+ value: 71.345
1214
+ - type: map_at_10
1215
+ value: 72.36200000000001
1216
+ - type: map_at_100
1217
+ value: 72.783
1218
+ - type: recall_at_1
1219
+ value: 60.9
1220
+ - type: recall_at_3
1221
+ value: 79.172
1222
+ - type: recall_at_5
1223
+ value: 84.917
1224
+ - type: recall_at_10
1225
+ value: 90.756
1226
+ - type: recall_at_100
1227
+ value: 97.667
1228
+ - type: precision_at_1
1229
+ value: 63.333
1230
+ - type: precision_at_3
1231
+ value: 28.555999999999997
1232
+ - type: precision_at_5
1233
+ value: 18.8
1234
+ - type: precision_at_10
1235
+ value: 10.233
1236
+ - type: precision_at_100
1237
+ value: 1.107
1238
+ - type: mrr_at_1
1239
+ value: 63.333
1240
+ - type: mrr_at_3
1241
+ value: 71.27799999999999
1242
+ - type: mrr_at_5
1243
+ value: 72.478
1244
+ - type: mrr_at_10
1245
+ value: 73.163
1246
+ - type: mrr_at_100
1247
+ value: 73.457
1248
+ - task:
1249
+ type: PairClassification
1250
+ dataset:
1251
+ type: mteb/sprintduplicatequestions-pairclassification
1252
+ name: MTEB SprintDuplicateQuestions
1253
+ config: default
1254
+ split: test
1255
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
1256
+ metrics:
1257
+ - type: cos_sim_accuracy
1258
+ value: 99.8009900990099
1259
+ - type: cos_sim_ap
1260
+ value: 95.46920445134404
1261
+ - type: cos_sim_f1
1262
+ value: 89.70814132104455
1263
+ - type: cos_sim_precision
1264
+ value: 91.9202518363064
1265
+ - type: cos_sim_recall
1266
+ value: 87.6
1267
+ - type: dot_accuracy
1268
+ value: 99.8009900990099
1269
+ - type: dot_ap
1270
+ value: 95.46920445134404
1271
+ - type: dot_f1
1272
+ value: 89.70814132104455
1273
+ - type: dot_precision
1274
+ value: 91.9202518363064
1275
+ - type: dot_recall
1276
+ value: 87.6
1277
+ - type: euclidean_accuracy
1278
+ value: 99.8009900990099
1279
+ - type: euclidean_ap
1280
+ value: 95.46924273007079
1281
+ - type: euclidean_f1
1282
+ value: 89.70814132104455
1283
+ - type: euclidean_precision
1284
+ value: 91.9202518363064
1285
+ - type: euclidean_recall
1286
+ value: 87.6
1287
+ - type: manhattan_accuracy
1288
+ value: 99.81188118811882
1289
+ - type: manhattan_ap
1290
+ value: 95.77631677784113
1291
+ - type: manhattan_f1
1292
+ value: 90.26639344262296
1293
+ - type: manhattan_precision
1294
+ value: 92.5420168067227
1295
+ - type: manhattan_recall
1296
+ value: 88.1
1297
+ - type: max_accuracy
1298
+ value: 99.81188118811882
1299
+ - type: max_ap
1300
+ value: 95.77631677784113
1301
+ - type: max_f1
1302
+ value: 90.26639344262296
1303
+ - task:
1304
+ type: Clustering
1305
+ dataset:
1306
+ type: mteb/stackexchange-clustering
1307
+ name: MTEB StackExchangeClustering
1308
+ config: default
1309
+ split: test
1310
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
1311
+ metrics:
1312
+ - type: v_measure
1313
+ value: 71.59238280333025
1314
+ - task:
1315
+ type: Clustering
1316
+ dataset:
1317
+ type: mteb/stackexchange-clustering-p2p
1318
+ name: MTEB StackExchangeClusteringP2P
1319
+ config: default
1320
+ split: test
1321
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
1322
+ metrics:
1323
+ - type: v_measure
1324
+ value: 39.012562075214035
1325
+ - task:
1326
+ type: Reranking
1327
+ dataset:
1328
+ type: mteb/stackoverflowdupquestions-reranking
1329
+ name: MTEB StackOverflowDupQuestions
1330
+ config: default
1331
+ split: test
1332
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
1333
+ metrics:
1334
+ - type: map
1335
+ value: 55.16521497700657
1336
+ - type: mrr
1337
+ value: 56.1779427680163
1338
+ - task:
1339
+ type: Summarization
1340
+ dataset:
1341
+ type: mteb/summeval
1342
+ name: MTEB SummEval
1343
+ config: default
1344
+ split: test
1345
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
1346
+ metrics:
1347
+ - type: cos_sim_pearson
1348
+ value: 31.04402552863106
1349
+ - type: cos_sim_spearman
1350
+ value: 31.05558230938988
1351
+ - type: dot_pearson
1352
+ value: 31.04400838015153
1353
+ - type: dot_spearman
1354
+ value: 31.05558230938988
1355
+ - task:
1356
+ type: Retrieval
1357
+ dataset:
1358
+ type: trec-covid
1359
+ name: MTEB TRECCOVID
1360
+ config: default
1361
+ split: test
1362
+ revision: None
1363
+ metrics:
1364
+ - type: ndcg_at_1
1365
+ value: 91.0
1366
+ - type: ndcg_at_3
1367
+ value: 92.34599999999999
1368
+ - type: ndcg_at_5
1369
+ value: 90.89399999999999
1370
+ - type: ndcg_at_10
1371
+ value: 87.433
1372
+ - type: ndcg_at_100
1373
+ value: 67.06400000000001
1374
+ - type: map_at_1
1375
+ value: 0.241
1376
+ - type: map_at_3
1377
+ value: 0.735
1378
+ - type: map_at_5
1379
+ value: 1.216
1380
+ - type: map_at_10
1381
+ value: 2.317
1382
+ - type: map_at_100
1383
+ value: 14.151
1384
+ - type: recall_at_1
1385
+ value: 0.241
1386
+ - type: recall_at_3
1387
+ value: 0.76
1388
+ - type: recall_at_5
1389
+ value: 1.254
1390
+ - type: recall_at_10
1391
+ value: 2.421
1392
+ - type: recall_at_100
1393
+ value: 16.715
1394
+ - type: precision_at_1
1395
+ value: 94.0
1396
+ - type: precision_at_3
1397
+ value: 96.0
1398
+ - type: precision_at_5
1399
+ value: 94.8
1400
+ - type: precision_at_10
1401
+ value: 91.4
1402
+ - type: precision_at_100
1403
+ value: 68.24
1404
+ - type: mrr_at_1
1405
+ value: 94.0
1406
+ - type: mrr_at_3
1407
+ value: 96.667
1408
+ - type: mrr_at_5
1409
+ value: 96.667
1410
+ - type: mrr_at_10
1411
+ value: 96.667
1412
+ - type: mrr_at_100
1413
+ value: 96.667
1414
+ - task:
1415
+ type: Retrieval
1416
+ dataset:
1417
+ type: webis-touche2020
1418
+ name: MTEB Touche2020
1419
+ config: default
1420
+ split: test
1421
+ revision: None
1422
+ metrics:
1423
+ - type: ndcg_at_1
1424
+ value: 26.531
1425
+ - type: ndcg_at_3
1426
+ value: 27.728
1427
+ - type: ndcg_at_5
1428
+ value: 25.668000000000003
1429
+ - type: ndcg_at_10
1430
+ value: 25.785999999999998
1431
+ - type: ndcg_at_100
1432
+ value: 35.623
1433
+ - type: map_at_1
1434
+ value: 2.076
1435
+ - type: map_at_3
1436
+ value: 5.29
1437
+ - type: map_at_5
1438
+ value: 7.292999999999999
1439
+ - type: map_at_10
1440
+ value: 9.81
1441
+ - type: map_at_100
1442
+ value: 15.461
1443
+ - type: recall_at_1
1444
+ value: 2.076
1445
+ - type: recall_at_3
1446
+ value: 6.7250000000000005
1447
+ - type: recall_at_5
1448
+ value: 9.808
1449
+ - type: recall_at_10
1450
+ value: 16.467000000000002
1451
+ - type: recall_at_100
1452
+ value: 45.109
1453
+ - type: precision_at_1
1454
+ value: 28.571
1455
+ - type: precision_at_3
1456
+ value: 29.252
1457
+ - type: precision_at_5
1458
+ value: 25.714
1459
+ - type: precision_at_10
1460
+ value: 23.265
1461
+ - type: precision_at_100
1462
+ value: 7.184
1463
+ - type: mrr_at_1
1464
+ value: 28.571
1465
+ - type: mrr_at_3
1466
+ value: 42.857
1467
+ - type: mrr_at_5
1468
+ value: 44.184
1469
+ - type: mrr_at_10
1470
+ value: 47.564
1471
+ - type: mrr_at_100
1472
+ value: 48.142
1473
+ - task:
1474
+ type: Classification
1475
+ dataset:
1476
+ type: mteb/toxic_conversations_50k
1477
+ name: MTEB ToxicConversationsClassification
1478
+ config: default
1479
+ split: test
1480
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
1481
+ metrics:
1482
+ - type: accuracy
1483
+ value: 68.43159999999999
1484
+ - type: ap
1485
+ value: 14.08119146524032
1486
+ - type: f1
1487
+ value: 53.26032318755336
1488
+ - task:
1489
+ type: Classification
1490
+ dataset:
1491
+ type: mteb/tweet_sentiment_extraction
1492
+ name: MTEB TweetSentimentExtractionClassification
1493
+ config: default
1494
+ split: test
1495
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
1496
+ metrics:
1497
+ - type: accuracy
1498
+ value: 63.82852292020373
1499
+ - type: f1
1500
+ value: 64.14509521870399
1501
+ - task:
1502
+ type: Clustering
1503
+ dataset:
1504
+ type: mteb/twentynewsgroups-clustering
1505
+ name: MTEB TwentyNewsgroupsClustering
1506
+ config: default
1507
+ split: test
1508
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
1509
+ metrics:
1510
+ - type: v_measure
1511
+ value: 55.252554461698566
1512
+ - task:
1513
+ type: PairClassification
1514
+ dataset:
1515
+ type: mteb/twittersemeval2015-pairclassification
1516
+ name: MTEB TwitterSemEval2015
1517
+ config: default
1518
+ split: test
1519
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
1520
+ metrics:
1521
+ - type: cos_sim_accuracy
1522
+ value: 88.54383978065208
1523
+ - type: cos_sim_ap
1524
+ value: 81.67495128150328
1525
+ - type: cos_sim_f1
1526
+ value: 74.58161532864419
1527
+ - type: cos_sim_precision
1528
+ value: 69.00807899461401
1529
+ - type: cos_sim_recall
1530
+ value: 81.13456464379946
1531
+ - type: dot_accuracy
1532
+ value: 88.54383978065208
1533
+ - type: dot_ap
1534
+ value: 81.6748330747088
1535
+ - type: dot_f1
1536
+ value: 74.58161532864419
1537
+ - type: dot_precision
1538
+ value: 69.00807899461401
1539
+ - type: dot_recall
1540
+ value: 81.13456464379946
1541
+ - type: euclidean_accuracy
1542
+ value: 88.54383978065208
1543
+ - type: euclidean_ap
1544
+ value: 81.67496006818212
1545
+ - type: euclidean_f1
1546
+ value: 74.58161532864419
1547
+ - type: euclidean_precision
1548
+ value: 69.00807899461401
1549
+ - type: euclidean_recall
1550
+ value: 81.13456464379946
1551
+ - type: manhattan_accuracy
1552
+ value: 88.40674733265782
1553
+ - type: manhattan_ap
1554
+ value: 81.56036996969941
1555
+ - type: manhattan_f1
1556
+ value: 74.33063129452223
1557
+ - type: manhattan_precision
1558
+ value: 69.53125
1559
+ - type: manhattan_recall
1560
+ value: 79.84168865435356
1561
+ - type: max_accuracy
1562
+ value: 88.54383978065208
1563
+ - type: max_ap
1564
+ value: 81.67496006818212
1565
+ - type: max_f1
1566
+ value: 74.58161532864419
1567
+ - task:
1568
+ type: PairClassification
1569
+ dataset:
1570
+ type: mteb/twitterurlcorpus-pairclassification
1571
+ name: MTEB TwitterURLCorpus
1572
+ config: default
1573
+ split: test
1574
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
1575
+ metrics:
1576
+ - type: cos_sim_accuracy
1577
+ value: 89.75627740908915
1578
+ - type: cos_sim_ap
1579
+ value: 87.41911504007292
1580
+ - type: cos_sim_f1
1581
+ value: 79.91742008969888
1582
+ - type: cos_sim_precision
1583
+ value: 74.31484178472131
1584
+ - type: cos_sim_recall
1585
+ value: 86.43363104404065
1586
+ - type: dot_accuracy
1587
+ value: 89.75627740908915
1588
+ - type: dot_ap
1589
+ value: 87.41910845717851
1590
+ - type: dot_f1
1591
+ value: 79.91742008969888
1592
+ - type: dot_precision
1593
+ value: 74.31484178472131
1594
+ - type: dot_recall
1595
+ value: 86.43363104404065
1596
+ - type: euclidean_accuracy
1597
+ value: 89.75627740908915
1598
+ - type: euclidean_ap
1599
+ value: 87.41912150448005
1600
+ - type: euclidean_f1
1601
+ value: 79.91742008969888
1602
+ - type: euclidean_precision
1603
+ value: 74.31484178472131
1604
+ - type: euclidean_recall
1605
+ value: 86.43363104404065
1606
+ - type: manhattan_accuracy
1607
+ value: 89.76597974152986
1608
+ - type: manhattan_ap
1609
+ value: 87.49835162128704
1610
+ - type: manhattan_f1
1611
+ value: 80.05401656994779
1612
+ - type: manhattan_precision
1613
+ value: 76.10158906390951
1614
+ - type: manhattan_recall
1615
+ value: 84.43948259932245
1616
+ - type: max_accuracy
1617
+ value: 89.76597974152986
1618
+ - type: max_ap
1619
+ value: 87.49835162128704
1620
+ - type: max_f1
1621
+ value: 80.05401656994779
1622
+ language:
1623
+ - en
1624
  license: mit
1625
  ---
1626
+
1627
+ ## SPEED-embedding-7b-instruct
1628
+
1629
+ [Little Giants: Synthesizing High-Quality Embedding Data at Scale](https://arxiv.org/pdf/2410.18634.pdf). Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou, arXiv 2024
1630
+
1631
+ This model has 32 layers and the embedding size is 4096.
1632
+
1633
+ ## Usage
1634
+
1635
+ The example below encodes queries and passages from the MS-MARCO passage ranking dataset.
1636
+
1637
+ ### Transformers
1638
+
1639
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ from torch import Tensor
+ from transformers import AutoTokenizer, AutoModel
+
+
+ def last_token_pool(last_hidden_states: Tensor,
+                     attention_mask: Tensor) -> Tensor:
+     # With left padding, the final position holds a real token for every sequence
+     left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
+     if left_padding:
+         return last_hidden_states[:, -1]
+     else:
+         # With right padding, index each sequence at its last non-padding token
+         sequence_lengths = attention_mask.sum(dim=1) - 1
+         batch_size = last_hidden_states.shape[0]
+         return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]
+
+
+ def get_detailed_instruct(task_description: str, query: str) -> str:
+     return f'Instruct: {task_description}\nQuery: {query}'
+
+
+ # Each query must come with a one-sentence instruction that describes the task
+ task = 'Given a web search query, retrieve relevant passages that answer the query'
+ queries = [
+     get_detailed_instruct(task, 'how much protein should a female eat'),
+     get_detailed_instruct(task, 'summit define')
+ ]
+ # No need to add instruction for retrieval documents
+ documents = [
+     "As a general guideline, the CDC's average requirement of protein for women ages 19 to 70 is 46 grams per day. But, as you can see from this chart, you'll need to increase that if you're expecting or training for a marathon. Check out the chart below to see how much protein you should be eating each day.",
+     "Definition of summit for English Language Learners. : 1 the highest point of a mountain : the top of a mountain. : 2 the highest level. : 3 a meeting or series of meetings between the leaders of two or more governments."
+ ]
+ input_texts = queries + documents
+
+ tokenizer = AutoTokenizer.from_pretrained('Haon-Chen/speed-embedding-7b-instruct')
+ model = AutoModel.from_pretrained('Haon-Chen/speed-embedding-7b-instruct')
+
+ max_length = 4096
+ # Tokenize the input texts
+ batch_dict = tokenizer(input_texts, max_length=max_length, padding=True, truncation=True, return_tensors='pt')
+
+ outputs = model(**batch_dict)
+ embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])
+
+ # normalize embeddings
+ embeddings = F.normalize(embeddings, p=2, dim=1)
+ # Rows are queries, columns are documents; scores are cosine similarities scaled by 100
+ scores = (embeddings[:2] @ embeddings[2:].T) * 100
+ print(scores.tolist())
+ ```
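To make the pooling step concrete, here is a minimal, dependency-free sketch of what `last_token_pool` selects under right and left padding. It uses nested Python lists instead of tensors, and the name `last_token_pool_lists` and the toy values are illustrative assumptions, not part of the released code.

```python
from typing import List

Vector = List[float]


def last_token_pool_lists(hidden: List[List[Vector]], mask: List[List[int]]) -> List[Vector]:
    """Pick, for each sequence, the hidden state of its last non-padding token."""
    # Left padding: every sequence has a real token in the final position.
    if all(row[-1] == 1 for row in mask):
        return [seq[-1] for seq in hidden]
    # Right padding: the last real token sits at index (number of real tokens - 1).
    return [seq[sum(row) - 1] for seq, row in zip(hidden, mask)]


# Two sequences of length 3 with 1-dimensional "hidden states" for readability.
hidden = [[[0.1], [0.2], [0.3]],
          [[0.4], [0.5], [0.6]]]

right_padded_mask = [[1, 1, 1], [1, 1, 0]]  # second sequence ends with one pad
left_padded_mask = [[1, 1, 1], [0, 1, 1]]   # second sequence starts with one pad

print(last_token_pool_lists(hidden, right_padded_mask))  # [[0.3], [0.5]]
print(last_token_pool_lists(hidden, left_padded_mask))   # [[0.3], [0.6]]
```

In both cases the pooled vector comes from a real token and never from a padding position, which is why the attention mask is passed alongside the hidden states.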
1690
+
1691
+ ## MTEB Benchmark Evaluation
1692
+
1693
+ See [unilm/e5](https://github.com/microsoft/unilm/tree/master/e5) to reproduce the evaluation results
+ on the [BEIR](https://arxiv.org/abs/2104.08663) and [MTEB](https://arxiv.org/abs/2210.07316) benchmarks.
1695
+
1696
+ ## FAQ
1697
+
1698
+ **1. Do I need to add instructions to the query?**
1699
+
1700
+ Yes. The model was trained with an instruction on the query side, so omitting it degrades performance.
+ The task definition should be a one-sentence instruction that describes the task.
+ This lets you customize the text embeddings for different scenarios through natural language instructions.
1703
+
1704
+ Please check out [unilm/e5/utils.py](https://github.com/microsoft/unilm/blob/9c0f1ff7ca53431fe47d2637dfe253643d94185b/e5/utils.py#L106) for the instructions we used for evaluation.
1705
+
1706
+ On the other hand, there is no need to add instructions to the document side.
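As a rough sketch of how this might look in practice, the snippet below prefixes queries with a task instruction and leaves documents untouched. The `TASK_INSTRUCTIONS` registry and the `build_inputs` helper are hypothetical; only the `web_search` instruction is taken from this model card, and the `duplicate_question` wording is an illustrative placeholder, not an instruction used for evaluation.

```python
from typing import Dict, List, Sequence


def get_detailed_instruct(task_description: str, query: str) -> str:
    # Same query template as in the usage example
    return f'Instruct: {task_description}\nQuery: {query}'


# Hypothetical registry of one-sentence task instructions.
TASK_INSTRUCTIONS: Dict[str, str] = {
    'web_search': 'Given a web search query, retrieve relevant passages that answer the query',
    'duplicate_question': 'Retrieve questions that are duplicates of the given question',  # placeholder wording
}


def build_inputs(task_name: str, queries: Sequence[str], documents: Sequence[str]) -> List[str]:
    """Return instructed queries followed by raw documents, ready for tokenization."""
    task = TASK_INSTRUCTIONS[task_name]
    instructed = [get_detailed_instruct(task, q) for q in queries]
    # Documents are passed through unchanged: no instruction on the document side.
    return instructed + list(documents)


inputs = build_inputs('web_search', ['summit define'], ['Definition of summit ...'])
print(inputs[0])
```

Keeping the instructions in one place makes it harder to encode a query without its prefix by accident.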
1707
+
1708
+ **2. Why are my reproduced results slightly different from those reported in the model card?**
1709
+
1710
+ Different versions of `transformers` and `pytorch` could cause negligible but non-zero performance differences.
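When checking a reproduction, comparing scores within a tolerance is more robust than expecting exact equality. The sketch below assumes an absolute tolerance of 0.1 score points, which is an arbitrary illustrative threshold rather than a value stated by the authors; the reproduced score is hypothetical, while the reported one is the STS15 `cos_sim_spearman` from this card's metadata.

```python
import math


def matches_reported(reproduced: float, reported: float, abs_tol: float = 0.1) -> bool:
    """Return True if two scores agree within an absolute tolerance (in score points)."""
    return math.isclose(reproduced, reported, rel_tol=0.0, abs_tol=abs_tol)


reported_sts15_spearman = 91.19520084590636  # cos_sim_spearman for MTEB STS15 in this card
reproduced = 91.17                           # hypothetical value from a local run

print(matches_reported(reproduced, reported_sts15_spearman))        # True: differs by about 0.025
print(matches_reported(reproduced, reported_sts15_spearman, 0.01))  # False under a stricter tolerance
```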
1711
+
1712
+ **3. Where are the LoRA-only weights?**
1713
+
1714
+ You can find the LoRA-only weights at [https://huggingface.co/Haon-Chen/speed-embedding-7b-instruct/tree/main/lora](https://huggingface.co/Haon-Chen/speed-embedding-7b-instruct/tree/main/lora).
1715
+
1716
+ ## Citation
1717
+
1718
+ If you find our paper or models helpful, please consider citing it as follows:
1719
+
1720
+ ```bibtex
+ @article{chen2024little,
+   title={Little Giants: Synthesizing High-Quality Embedding Data at Scale},
+   author={Chen, Haonan and Wang, Liang and Yang, Nan and Zhu, Yutao and Zhao, Ziliang and Wei, Furu and Dou, Zhicheng},
+   journal={arXiv preprint arXiv:2410.18634},
+   year={2024}
+ }
+ ```
1728
+
1729
+ ## Limitations
1730
+
1731
+ Using this model for inputs longer than 4096 tokens is not recommended.