infgrad committed on
Commit
c2d4fc7
1 Parent(s): b8e19c8

Update README.md

Files changed (1)
  1. README.md +1209 -1
README.md CHANGED
@@ -1,3 +1,1211 @@
1
  ---
2
- license: mit
3
  ---
1
  ---
2
+ pipeline_tag: sentence-similarity
3
+ tags:
4
+ - sentence-transformers
5
+ - feature-extraction
6
+ - sentence-similarity
7
+ - mteb
8
+ model-index:
9
+ - name: stella-large-zh-v3-1792d
10
+ results:
11
+ - task:
12
+ type: STS
13
+ dataset:
14
+ type: C-MTEB/AFQMC
15
+ name: MTEB AFQMC
16
+ config: default
17
+ split: validation
18
+ revision: None
19
+ metrics:
20
+ - type: cos_sim_pearson
21
+ value: 54.48093298255762
22
+ - type: cos_sim_spearman
23
+ value: 59.105354109068685
24
+ - type: euclidean_pearson
25
+ value: 57.761189988643444
26
+ - type: euclidean_spearman
27
+ value: 59.10537421115596
28
+ - type: manhattan_pearson
29
+ value: 56.94359297051431
30
+ - type: manhattan_spearman
31
+ value: 58.37611109821567
32
+ - task:
33
+ type: STS
34
+ dataset:
35
+ type: C-MTEB/ATEC
36
+ name: MTEB ATEC
37
+ config: default
38
+ split: test
39
+ revision: None
40
+ metrics:
41
+ - type: cos_sim_pearson
42
+ value: 54.39711127600595
43
+ - type: cos_sim_spearman
44
+ value: 58.190191920824454
45
+ - type: euclidean_pearson
46
+ value: 61.80082379352729
47
+ - type: euclidean_spearman
48
+ value: 58.19018966860797
49
+ - type: manhattan_pearson
50
+ value: 60.927601060396206
51
+ - type: manhattan_spearman
52
+ value: 57.78832902694192
53
+ - task:
54
+ type: Classification
55
+ dataset:
56
+ type: mteb/amazon_reviews_multi
57
+ name: MTEB AmazonReviewsClassification (zh)
58
+ config: zh
59
+ split: test
60
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
61
+ metrics:
62
+ - type: accuracy
63
+ value: 46.31600000000001
64
+ - type: f1
65
+ value: 44.45281663598873
66
+ - task:
67
+ type: STS
68
+ dataset:
69
+ type: C-MTEB/BQ
70
+ name: MTEB BQ
71
+ config: default
72
+ split: test
73
+ revision: None
74
+ metrics:
75
+ - type: cos_sim_pearson
76
+ value: 69.12211326097868
77
+ - type: cos_sim_spearman
78
+ value: 71.0741302039443
79
+ - type: euclidean_pearson
80
+ value: 69.89070483887852
81
+ - type: euclidean_spearman
82
+ value: 71.07413020351787
83
+ - type: manhattan_pearson
84
+ value: 69.62345441260962
85
+ - type: manhattan_spearman
86
+ value: 70.8517591280618
87
+ - task:
88
+ type: Clustering
89
+ dataset:
90
+ type: C-MTEB/CLSClusteringP2P
91
+ name: MTEB CLSClusteringP2P
92
+ config: default
93
+ split: test
94
+ revision: None
95
+ metrics:
96
+ - type: v_measure
97
+ value: 41.937723608805314
98
+ - task:
99
+ type: Clustering
100
+ dataset:
101
+ type: C-MTEB/CLSClusteringS2S
102
+ name: MTEB CLSClusteringS2S
103
+ config: default
104
+ split: test
105
+ revision: None
106
+ metrics:
107
+ - type: v_measure
108
+ value: 40.34373057675427
109
+ - task:
110
+ type: Reranking
111
+ dataset:
112
+ type: C-MTEB/CMedQAv1-reranking
113
+ name: MTEB CMedQAv1
114
+ config: default
115
+ split: test
116
+ revision: None
117
+ metrics:
118
+ - type: map
119
+ value: 88.98896401788376
120
+ - type: mrr
121
+ value: 90.97119047619047
122
+ - task:
123
+ type: Reranking
124
+ dataset:
125
+ type: C-MTEB/CMedQAv2-reranking
126
+ name: MTEB CMedQAv2
127
+ config: default
128
+ split: test
129
+ revision: None
130
+ metrics:
131
+ - type: map
132
+ value: 89.59718540244556
133
+ - type: mrr
134
+ value: 91.41246031746032
135
+ - task:
136
+ type: Retrieval
137
+ dataset:
138
+ type: C-MTEB/CmedqaRetrieval
139
+ name: MTEB CmedqaRetrieval
140
+ config: default
141
+ split: dev
142
+ revision: None
143
+ metrics:
144
+ - type: map_at_1
145
+ value: 26.954
146
+ - type: map_at_10
147
+ value: 40.144999999999996
148
+ - type: map_at_100
149
+ value: 42.083999999999996
150
+ - type: map_at_1000
151
+ value: 42.181000000000004
152
+ - type: map_at_3
153
+ value: 35.709
154
+ - type: map_at_5
155
+ value: 38.141000000000005
156
+ - type: mrr_at_1
157
+ value: 40.71
158
+ - type: mrr_at_10
159
+ value: 48.93
160
+ - type: mrr_at_100
161
+ value: 49.921
162
+ - type: mrr_at_1000
163
+ value: 49.958999999999996
164
+ - type: mrr_at_3
165
+ value: 46.32
166
+ - type: mrr_at_5
167
+ value: 47.769
168
+ - type: ndcg_at_1
169
+ value: 40.71
170
+ - type: ndcg_at_10
171
+ value: 46.869
172
+ - type: ndcg_at_100
173
+ value: 54.234
174
+ - type: ndcg_at_1000
175
+ value: 55.854000000000006
176
+ - type: ndcg_at_3
177
+ value: 41.339
178
+ - type: ndcg_at_5
179
+ value: 43.594
180
+ - type: precision_at_1
181
+ value: 40.71
182
+ - type: precision_at_10
183
+ value: 10.408000000000001
184
+ - type: precision_at_100
185
+ value: 1.635
186
+ - type: precision_at_1000
187
+ value: 0.184
188
+ - type: precision_at_3
189
+ value: 23.348
190
+ - type: precision_at_5
191
+ value: 16.929
192
+ - type: recall_at_1
193
+ value: 26.954
194
+ - type: recall_at_10
195
+ value: 57.821999999999996
196
+ - type: recall_at_100
197
+ value: 88.08200000000001
198
+ - type: recall_at_1000
199
+ value: 98.83800000000001
200
+ - type: recall_at_3
201
+ value: 41.221999999999994
202
+ - type: recall_at_5
203
+ value: 48.241
204
+ - task:
205
+ type: PairClassification
206
+ dataset:
207
+ type: C-MTEB/CMNLI
208
+ name: MTEB Cmnli
209
+ config: default
210
+ split: validation
211
+ revision: None
212
+ metrics:
213
+ - type: cos_sim_accuracy
214
+ value: 83.6680697534576
215
+ - type: cos_sim_ap
216
+ value: 90.77401562455269
217
+ - type: cos_sim_f1
218
+ value: 84.68266427450101
219
+ - type: cos_sim_precision
220
+ value: 81.36177547942253
221
+ - type: cos_sim_recall
222
+ value: 88.28618190320317
223
+ - type: dot_accuracy
224
+ value: 83.6680697534576
225
+ - type: dot_ap
226
+ value: 90.76429465198817
227
+ - type: dot_f1
228
+ value: 84.68266427450101
229
+ - type: dot_precision
230
+ value: 81.36177547942253
231
+ - type: dot_recall
232
+ value: 88.28618190320317
233
+ - type: euclidean_accuracy
234
+ value: 83.6680697534576
235
+ - type: euclidean_ap
236
+ value: 90.77401909305344
237
+ - type: euclidean_f1
238
+ value: 84.68266427450101
239
+ - type: euclidean_precision
240
+ value: 81.36177547942253
241
+ - type: euclidean_recall
242
+ value: 88.28618190320317
243
+ - type: manhattan_accuracy
244
+ value: 83.40348767288035
245
+ - type: manhattan_ap
246
+ value: 90.57002020310819
247
+ - type: manhattan_f1
248
+ value: 84.51526032315978
249
+ - type: manhattan_precision
250
+ value: 81.25134843581445
251
+ - type: manhattan_recall
252
+ value: 88.05237315875614
253
+ - type: max_accuracy
254
+ value: 83.6680697534576
255
+ - type: max_ap
256
+ value: 90.77401909305344
257
+ - type: max_f1
258
+ value: 84.68266427450101
259
+ - task:
260
+ type: Retrieval
261
+ dataset:
262
+ type: C-MTEB/CovidRetrieval
263
+ name: MTEB CovidRetrieval
264
+ config: default
265
+ split: dev
266
+ revision: None
267
+ metrics:
268
+ - type: map_at_1
269
+ value: 69.705
270
+ - type: map_at_10
271
+ value: 78.648
272
+ - type: map_at_100
273
+ value: 78.888
274
+ - type: map_at_1000
275
+ value: 78.89399999999999
276
+ - type: map_at_3
277
+ value: 77.151
278
+ - type: map_at_5
279
+ value: 77.98
280
+ - type: mrr_at_1
281
+ value: 69.863
282
+ - type: mrr_at_10
283
+ value: 78.62599999999999
284
+ - type: mrr_at_100
285
+ value: 78.861
286
+ - type: mrr_at_1000
287
+ value: 78.867
288
+ - type: mrr_at_3
289
+ value: 77.204
290
+ - type: mrr_at_5
291
+ value: 78.005
292
+ - type: ndcg_at_1
293
+ value: 69.968
294
+ - type: ndcg_at_10
295
+ value: 82.44399999999999
296
+ - type: ndcg_at_100
297
+ value: 83.499
298
+ - type: ndcg_at_1000
299
+ value: 83.647
300
+ - type: ndcg_at_3
301
+ value: 79.393
302
+ - type: ndcg_at_5
303
+ value: 80.855
304
+ - type: precision_at_1
305
+ value: 69.968
306
+ - type: precision_at_10
307
+ value: 9.515
308
+ - type: precision_at_100
309
+ value: 0.9990000000000001
310
+ - type: precision_at_1000
311
+ value: 0.101
312
+ - type: precision_at_3
313
+ value: 28.802
314
+ - type: precision_at_5
315
+ value: 18.019
316
+ - type: recall_at_1
317
+ value: 69.705
318
+ - type: recall_at_10
319
+ value: 94.152
320
+ - type: recall_at_100
321
+ value: 98.84100000000001
322
+ - type: recall_at_1000
323
+ value: 100.0
324
+ - type: recall_at_3
325
+ value: 85.774
326
+ - type: recall_at_5
327
+ value: 89.252
328
+ - task:
329
+ type: Retrieval
330
+ dataset:
331
+ type: C-MTEB/DuRetrieval
332
+ name: MTEB DuRetrieval
333
+ config: default
334
+ split: dev
335
+ revision: None
336
+ metrics:
337
+ - type: map_at_1
338
+ value: 25.88
339
+ - type: map_at_10
340
+ value: 79.857
341
+ - type: map_at_100
342
+ value: 82.636
343
+ - type: map_at_1000
344
+ value: 82.672
345
+ - type: map_at_3
346
+ value: 55.184
347
+ - type: map_at_5
348
+ value: 70.009
349
+ - type: mrr_at_1
350
+ value: 89.64999999999999
351
+ - type: mrr_at_10
352
+ value: 92.967
353
+ - type: mrr_at_100
354
+ value: 93.039
355
+ - type: mrr_at_1000
356
+ value: 93.041
357
+ - type: mrr_at_3
358
+ value: 92.65
359
+ - type: mrr_at_5
360
+ value: 92.86
361
+ - type: ndcg_at_1
362
+ value: 89.64999999999999
363
+ - type: ndcg_at_10
364
+ value: 87.126
365
+ - type: ndcg_at_100
366
+ value: 89.898
367
+ - type: ndcg_at_1000
368
+ value: 90.253
369
+ - type: ndcg_at_3
370
+ value: 86.012
371
+ - type: ndcg_at_5
372
+ value: 85.124
373
+ - type: precision_at_1
374
+ value: 89.64999999999999
375
+ - type: precision_at_10
376
+ value: 41.735
377
+ - type: precision_at_100
378
+ value: 4.797
379
+ - type: precision_at_1000
380
+ value: 0.488
381
+ - type: precision_at_3
382
+ value: 77.267
383
+ - type: precision_at_5
384
+ value: 65.48
385
+ - type: recall_at_1
386
+ value: 25.88
387
+ - type: recall_at_10
388
+ value: 88.28399999999999
389
+ - type: recall_at_100
390
+ value: 97.407
391
+ - type: recall_at_1000
392
+ value: 99.29299999999999
393
+ - type: recall_at_3
394
+ value: 57.38799999999999
395
+ - type: recall_at_5
396
+ value: 74.736
397
+ - task:
398
+ type: Retrieval
399
+ dataset:
400
+ type: C-MTEB/EcomRetrieval
401
+ name: MTEB EcomRetrieval
402
+ config: default
403
+ split: dev
404
+ revision: None
405
+ metrics:
406
+ - type: map_at_1
407
+ value: 53.2
408
+ - type: map_at_10
409
+ value: 63.556000000000004
410
+ - type: map_at_100
411
+ value: 64.033
412
+ - type: map_at_1000
413
+ value: 64.044
414
+ - type: map_at_3
415
+ value: 60.983
416
+ - type: map_at_5
417
+ value: 62.588
418
+ - type: mrr_at_1
419
+ value: 53.2
420
+ - type: mrr_at_10
421
+ value: 63.556000000000004
422
+ - type: mrr_at_100
423
+ value: 64.033
424
+ - type: mrr_at_1000
425
+ value: 64.044
426
+ - type: mrr_at_3
427
+ value: 60.983
428
+ - type: mrr_at_5
429
+ value: 62.588
430
+ - type: ndcg_at_1
431
+ value: 53.2
432
+ - type: ndcg_at_10
433
+ value: 68.61699999999999
434
+ - type: ndcg_at_100
435
+ value: 70.88499999999999
436
+ - type: ndcg_at_1000
437
+ value: 71.15899999999999
438
+ - type: ndcg_at_3
439
+ value: 63.434000000000005
440
+ - type: ndcg_at_5
441
+ value: 66.301
442
+ - type: precision_at_1
443
+ value: 53.2
444
+ - type: precision_at_10
445
+ value: 8.450000000000001
446
+ - type: precision_at_100
447
+ value: 0.95
448
+ - type: precision_at_1000
449
+ value: 0.097
450
+ - type: precision_at_3
451
+ value: 23.5
452
+ - type: precision_at_5
453
+ value: 15.479999999999999
454
+ - type: recall_at_1
455
+ value: 53.2
456
+ - type: recall_at_10
457
+ value: 84.5
458
+ - type: recall_at_100
459
+ value: 95.0
460
+ - type: recall_at_1000
461
+ value: 97.1
462
+ - type: recall_at_3
463
+ value: 70.5
464
+ - type: recall_at_5
465
+ value: 77.4
466
+ - task:
467
+ type: Classification
468
+ dataset:
469
+ type: C-MTEB/IFlyTek-classification
470
+ name: MTEB IFlyTek
471
+ config: default
472
+ split: validation
473
+ revision: None
474
+ metrics:
475
+ - type: accuracy
476
+ value: 50.63485956136976
477
+ - type: f1
478
+ value: 38.286307407751266
479
+ - task:
480
+ type: Classification
481
+ dataset:
482
+ type: C-MTEB/JDReview-classification
483
+ name: MTEB JDReview
484
+ config: default
485
+ split: test
486
+ revision: None
487
+ metrics:
488
+ - type: accuracy
489
+ value: 86.11632270168855
490
+ - type: ap
491
+ value: 54.43932599806482
492
+ - type: f1
493
+ value: 80.85485110996076
494
+ - task:
495
+ type: STS
496
+ dataset:
497
+ type: C-MTEB/LCQMC
498
+ name: MTEB LCQMC
499
+ config: default
500
+ split: test
501
+ revision: None
502
+ metrics:
503
+ - type: cos_sim_pearson
504
+ value: 72.47315152994804
505
+ - type: cos_sim_spearman
506
+ value: 78.26531600908152
507
+ - type: euclidean_pearson
508
+ value: 77.8560788714531
509
+ - type: euclidean_spearman
510
+ value: 78.26531157334841
511
+ - type: manhattan_pearson
512
+ value: 77.70593783974188
513
+ - type: manhattan_spearman
514
+ value: 78.13880812439999
515
+ - task:
516
+ type: Reranking
517
+ dataset:
518
+ type: C-MTEB/Mmarco-reranking
519
+ name: MTEB MMarcoReranking
520
+ config: default
521
+ split: dev
522
+ revision: None
523
+ metrics:
524
+ - type: map
525
+ value: 28.088177976572222
526
+ - type: mrr
527
+ value: 27.125
528
+ - task:
529
+ type: Retrieval
530
+ dataset:
531
+ type: C-MTEB/MMarcoRetrieval
532
+ name: MTEB MMarcoRetrieval
533
+ config: default
534
+ split: dev
535
+ revision: None
536
+ metrics:
537
+ - type: map_at_1
538
+ value: 66.428
539
+ - type: map_at_10
540
+ value: 75.5
541
+ - type: map_at_100
542
+ value: 75.82600000000001
543
+ - type: map_at_1000
544
+ value: 75.837
545
+ - type: map_at_3
546
+ value: 73.74300000000001
547
+ - type: map_at_5
548
+ value: 74.87
549
+ - type: mrr_at_1
550
+ value: 68.754
551
+ - type: mrr_at_10
552
+ value: 76.145
553
+ - type: mrr_at_100
554
+ value: 76.432
555
+ - type: mrr_at_1000
556
+ value: 76.442
557
+ - type: mrr_at_3
558
+ value: 74.628
559
+ - type: mrr_at_5
560
+ value: 75.612
561
+ - type: ndcg_at_1
562
+ value: 68.754
563
+ - type: ndcg_at_10
564
+ value: 79.144
565
+ - type: ndcg_at_100
566
+ value: 80.60199999999999
567
+ - type: ndcg_at_1000
568
+ value: 80.886
569
+ - type: ndcg_at_3
570
+ value: 75.81599999999999
571
+ - type: ndcg_at_5
572
+ value: 77.729
573
+ - type: precision_at_1
574
+ value: 68.754
575
+ - type: precision_at_10
576
+ value: 9.544
577
+ - type: precision_at_100
578
+ value: 1.026
579
+ - type: precision_at_1000
580
+ value: 0.105
581
+ - type: precision_at_3
582
+ value: 28.534
583
+ - type: precision_at_5
584
+ value: 18.138
585
+ - type: recall_at_1
586
+ value: 66.428
587
+ - type: recall_at_10
588
+ value: 89.716
589
+ - type: recall_at_100
590
+ value: 96.313
591
+ - type: recall_at_1000
592
+ value: 98.541
593
+ - type: recall_at_3
594
+ value: 80.923
595
+ - type: recall_at_5
596
+ value: 85.48
597
+ - task:
598
+ type: Classification
599
+ dataset:
600
+ type: mteb/amazon_massive_intent
601
+ name: MTEB MassiveIntentClassification (zh-CN)
602
+ config: zh-CN
603
+ split: test
604
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
605
+ metrics:
606
+ - type: accuracy
607
+ value: 73.27841291190316
608
+ - type: f1
609
+ value: 70.65529957574735
610
+ - task:
611
+ type: Classification
612
+ dataset:
613
+ type: mteb/amazon_massive_scenario
614
+ name: MTEB MassiveScenarioClassification (zh-CN)
615
+ config: zh-CN
616
+ split: test
617
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
618
+ metrics:
619
+ - type: accuracy
620
+ value: 76.30127774041695
621
+ - type: f1
622
+ value: 76.10358226518304
623
+ - task:
624
+ type: Retrieval
625
+ dataset:
626
+ type: C-MTEB/MedicalRetrieval
627
+ name: MTEB MedicalRetrieval
628
+ config: default
629
+ split: dev
630
+ revision: None
631
+ metrics:
632
+ - type: map_at_1
633
+ value: 56.3
634
+ - type: map_at_10
635
+ value: 62.193
636
+ - type: map_at_100
637
+ value: 62.722
638
+ - type: map_at_1000
639
+ value: 62.765
640
+ - type: map_at_3
641
+ value: 60.633
642
+ - type: map_at_5
643
+ value: 61.617999999999995
644
+ - type: mrr_at_1
645
+ value: 56.3
646
+ - type: mrr_at_10
647
+ value: 62.193
648
+ - type: mrr_at_100
649
+ value: 62.722
650
+ - type: mrr_at_1000
651
+ value: 62.765
652
+ - type: mrr_at_3
653
+ value: 60.633
654
+ - type: mrr_at_5
655
+ value: 61.617999999999995
656
+ - type: ndcg_at_1
657
+ value: 56.3
658
+ - type: ndcg_at_10
659
+ value: 65.176
660
+ - type: ndcg_at_100
661
+ value: 67.989
662
+ - type: ndcg_at_1000
663
+ value: 69.219
664
+ - type: ndcg_at_3
665
+ value: 62.014
666
+ - type: ndcg_at_5
667
+ value: 63.766
668
+ - type: precision_at_1
669
+ value: 56.3
670
+ - type: precision_at_10
671
+ value: 7.46
672
+ - type: precision_at_100
673
+ value: 0.8829999999999999
674
+ - type: precision_at_1000
675
+ value: 0.098
676
+ - type: precision_at_3
677
+ value: 22.0
678
+ - type: precision_at_5
679
+ value: 14.04
680
+ - type: recall_at_1
681
+ value: 56.3
682
+ - type: recall_at_10
683
+ value: 74.6
684
+ - type: recall_at_100
685
+ value: 88.3
686
+ - type: recall_at_1000
687
+ value: 98.1
688
+ - type: recall_at_3
689
+ value: 66.0
690
+ - type: recall_at_5
691
+ value: 70.19999999999999
692
+ - task:
693
+ type: Classification
694
+ dataset:
695
+ type: C-MTEB/MultilingualSentiment-classification
696
+ name: MTEB MultilingualSentiment
697
+ config: default
698
+ split: validation
699
+ revision: None
700
+ metrics:
701
+ - type: accuracy
702
+ value: 76.44666666666666
703
+ - type: f1
704
+ value: 76.34548655475949
705
+ - task:
706
+ type: PairClassification
707
+ dataset:
708
+ type: C-MTEB/OCNLI
709
+ name: MTEB Ocnli
710
+ config: default
711
+ split: validation
712
+ revision: None
713
+ metrics:
714
+ - type: cos_sim_accuracy
715
+ value: 82.34975636166757
716
+ - type: cos_sim_ap
717
+ value: 85.44149338593267
718
+ - type: cos_sim_f1
719
+ value: 83.68654509610647
720
+ - type: cos_sim_precision
721
+ value: 78.46580406654344
722
+ - type: cos_sim_recall
723
+ value: 89.65153115100317
724
+ - type: dot_accuracy
725
+ value: 82.34975636166757
726
+ - type: dot_ap
727
+ value: 85.4415701376729
728
+ - type: dot_f1
729
+ value: 83.68654509610647
730
+ - type: dot_precision
731
+ value: 78.46580406654344
732
+ - type: dot_recall
733
+ value: 89.65153115100317
734
+ - type: euclidean_accuracy
735
+ value: 82.34975636166757
736
+ - type: euclidean_ap
737
+ value: 85.4415701376729
738
+ - type: euclidean_f1
739
+ value: 83.68654509610647
740
+ - type: euclidean_precision
741
+ value: 78.46580406654344
742
+ - type: euclidean_recall
743
+ value: 89.65153115100317
744
+ - type: manhattan_accuracy
745
+ value: 81.97076340010828
746
+ - type: manhattan_ap
747
+ value: 84.83614660756733
748
+ - type: manhattan_f1
749
+ value: 83.34167083541772
750
+ - type: manhattan_precision
751
+ value: 79.18250950570342
752
+ - type: manhattan_recall
753
+ value: 87.96198521647307
754
+ - type: max_accuracy
755
+ value: 82.34975636166757
756
+ - type: max_ap
757
+ value: 85.4415701376729
758
+ - type: max_f1
759
+ value: 83.68654509610647
760
+ - task:
761
+ type: Classification
762
+ dataset:
763
+ type: C-MTEB/OnlineShopping-classification
764
+ name: MTEB OnlineShopping
765
+ config: default
766
+ split: test
767
+ revision: None
768
+ metrics:
769
+ - type: accuracy
770
+ value: 93.24
771
+ - type: ap
772
+ value: 91.3586656455605
773
+ - type: f1
774
+ value: 93.22999314249503
775
+ - task:
776
+ type: STS
777
+ dataset:
778
+ type: C-MTEB/PAWSX
779
+ name: MTEB PAWSX
780
+ config: default
781
+ split: test
782
+ revision: None
783
+ metrics:
784
+ - type: cos_sim_pearson
785
+ value: 39.05676042449009
786
+ - type: cos_sim_spearman
787
+ value: 44.996534098358545
788
+ - type: euclidean_pearson
789
+ value: 44.42418609172825
790
+ - type: euclidean_spearman
791
+ value: 44.995941361058996
792
+ - type: manhattan_pearson
793
+ value: 43.98118203238076
794
+ - type: manhattan_spearman
795
+ value: 44.51414152788784
796
+ - task:
797
+ type: STS
798
+ dataset:
799
+ type: C-MTEB/QBQTC
800
+ name: MTEB QBQTC
801
+ config: default
802
+ split: test
803
+ revision: None
804
+ metrics:
805
+ - type: cos_sim_pearson
806
+ value: 36.694269474438045
807
+ - type: cos_sim_spearman
808
+ value: 38.686738967031616
809
+ - type: euclidean_pearson
810
+ value: 36.822540068407235
811
+ - type: euclidean_spearman
812
+ value: 38.68690745429757
813
+ - type: manhattan_pearson
814
+ value: 36.77180703308932
815
+ - type: manhattan_spearman
816
+ value: 38.45414914148094
817
+ - task:
818
+ type: STS
819
+ dataset:
820
+ type: mteb/sts22-crosslingual-sts
821
+ name: MTEB STS22 (zh)
822
+ config: zh
823
+ split: test
824
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
825
+ metrics:
826
+ - type: cos_sim_pearson
827
+ value: 65.81209017614124
828
+ - type: cos_sim_spearman
829
+ value: 66.5255285833172
830
+ - type: euclidean_pearson
831
+ value: 66.01848701752732
832
+ - type: euclidean_spearman
833
+ value: 66.5255285833172
834
+ - type: manhattan_pearson
835
+ value: 66.66433676370542
836
+ - type: manhattan_spearman
837
+ value: 67.07086311480214
838
+ - task:
839
+ type: STS
840
+ dataset:
841
+ type: C-MTEB/STSB
842
+ name: MTEB STSB
843
+ config: default
844
+ split: test
845
+ revision: None
846
+ metrics:
847
+ - type: cos_sim_pearson
848
+ value: 80.60785761283502
849
+ - type: cos_sim_spearman
850
+ value: 82.80278693241074
851
+ - type: euclidean_pearson
852
+ value: 82.47573315938638
853
+ - type: euclidean_spearman
854
+ value: 82.80290808593806
855
+ - type: manhattan_pearson
856
+ value: 82.49682028989669
857
+ - type: manhattan_spearman
858
+ value: 82.84565039346022
859
+ - task:
860
+ type: Reranking
861
+ dataset:
862
+ type: C-MTEB/T2Reranking
863
+ name: MTEB T2Reranking
864
+ config: default
865
+ split: dev
866
+ revision: None
867
+ metrics:
868
+ - type: map
869
+ value: 66.37886004738723
870
+ - type: mrr
871
+ value: 76.08501655006394
872
+ - task:
873
+ type: Retrieval
874
+ dataset:
875
+ type: C-MTEB/T2Retrieval
876
+ name: MTEB T2Retrieval
877
+ config: default
878
+ split: dev
879
+ revision: None
880
+ metrics:
881
+ - type: map_at_1
882
+ value: 28.102
883
+ - type: map_at_10
884
+ value: 78.071
885
+ - type: map_at_100
886
+ value: 81.71000000000001
887
+ - type: map_at_1000
888
+ value: 81.773
889
+ - type: map_at_3
890
+ value: 55.142
891
+ - type: map_at_5
892
+ value: 67.669
893
+ - type: mrr_at_1
894
+ value: 90.9
895
+ - type: mrr_at_10
896
+ value: 93.29499999999999
897
+ - type: mrr_at_100
898
+ value: 93.377
899
+ - type: mrr_at_1000
900
+ value: 93.379
901
+ - type: mrr_at_3
902
+ value: 92.901
903
+ - type: mrr_at_5
904
+ value: 93.152
905
+ - type: ndcg_at_1
906
+ value: 90.9
907
+ - type: ndcg_at_10
908
+ value: 85.564
909
+ - type: ndcg_at_100
910
+ value: 89.11200000000001
911
+ - type: ndcg_at_1000
912
+ value: 89.693
913
+ - type: ndcg_at_3
914
+ value: 87.024
915
+ - type: ndcg_at_5
916
+ value: 85.66
917
+ - type: precision_at_1
918
+ value: 90.9
919
+ - type: precision_at_10
920
+ value: 42.208
921
+ - type: precision_at_100
922
+ value: 5.027
923
+ - type: precision_at_1000
924
+ value: 0.517
925
+ - type: precision_at_3
926
+ value: 75.872
927
+ - type: precision_at_5
928
+ value: 63.566
929
+ - type: recall_at_1
930
+ value: 28.102
931
+ - type: recall_at_10
932
+ value: 84.44500000000001
933
+ - type: recall_at_100
934
+ value: 95.91300000000001
935
+ - type: recall_at_1000
936
+ value: 98.80799999999999
937
+ - type: recall_at_3
938
+ value: 56.772999999999996
939
+ - type: recall_at_5
940
+ value: 70.99499999999999
941
+ - task:
942
+ type: Classification
943
+ dataset:
944
+ type: C-MTEB/TNews-classification
945
+ name: MTEB TNews
946
+ config: default
947
+ split: validation
948
+ revision: None
949
+ metrics:
950
+ - type: accuracy
951
+ value: 53.10599999999999
952
+ - type: f1
953
+ value: 51.40415523558322
954
+ - task:
955
+ type: Clustering
956
+ dataset:
957
+ type: C-MTEB/ThuNewsClusteringP2P
958
+ name: MTEB ThuNewsClusteringP2P
959
+ config: default
960
+ split: test
961
+ revision: None
962
+ metrics:
963
+ - type: v_measure
964
+ value: 69.6145576098232
965
+ - task:
966
+ type: Clustering
967
+ dataset:
968
+ type: C-MTEB/ThuNewsClusteringS2S
969
+ name: MTEB ThuNewsClusteringS2S
970
+ config: default
971
+ split: test
972
+ revision: None
973
+ metrics:
974
+ - type: v_measure
975
+ value: 63.7129548775017
976
+ - task:
977
+ type: Retrieval
978
+ dataset:
979
+ type: C-MTEB/VideoRetrieval
980
+ name: MTEB VideoRetrieval
981
+ config: default
982
+ split: dev
983
+ revision: None
984
+ metrics:
985
+ - type: map_at_1
986
+ value: 60.199999999999996
987
+ - type: map_at_10
988
+ value: 69.724
989
+ - type: map_at_100
990
+ value: 70.185
991
+ - type: map_at_1000
992
+ value: 70.196
993
+ - type: map_at_3
994
+ value: 67.95
995
+ - type: map_at_5
996
+ value: 69.155
997
+ - type: mrr_at_1
998
+ value: 60.199999999999996
999
+ - type: mrr_at_10
1000
+ value: 69.724
1001
+ - type: mrr_at_100
1002
+ value: 70.185
1003
+ - type: mrr_at_1000
1004
+ value: 70.196
1005
+ - type: mrr_at_3
1006
+ value: 67.95
1007
+ - type: mrr_at_5
1008
+ value: 69.155
1009
+ - type: ndcg_at_1
1010
+ value: 60.199999999999996
1011
+ - type: ndcg_at_10
1012
+ value: 73.888
1013
+ - type: ndcg_at_100
1014
+ value: 76.02799999999999
1015
+ - type: ndcg_at_1000
1016
+ value: 76.344
1017
+ - type: ndcg_at_3
1018
+ value: 70.384
1019
+ - type: ndcg_at_5
1020
+ value: 72.541
1021
+ - type: precision_at_1
1022
+ value: 60.199999999999996
1023
+ - type: precision_at_10
1024
+ value: 8.67
1025
+ - type: precision_at_100
1026
+ value: 0.9650000000000001
1027
+ - type: precision_at_1000
1028
+ value: 0.099
1029
+ - type: precision_at_3
1030
+ value: 25.8
1031
+ - type: precision_at_5
1032
+ value: 16.520000000000003
1033
+ - type: recall_at_1
1034
+ value: 60.199999999999996
1035
+ - type: recall_at_10
1036
+ value: 86.7
1037
+ - type: recall_at_100
1038
+ value: 96.5
1039
+ - type: recall_at_1000
1040
+ value: 99.0
1041
+ - type: recall_at_3
1042
+ value: 77.4
1043
+ - type: recall_at_5
1044
+ value: 82.6
1045
+ - task:
1046
+ type: Classification
1047
+ dataset:
1048
+ type: C-MTEB/waimai-classification
1049
+ name: MTEB Waimai
1050
+ config: default
1051
+ split: test
1052
+ revision: None
1053
+ metrics:
1054
+ - type: accuracy
1055
+ value: 88.08
1056
+ - type: ap
1057
+ value: 72.66435456846166
1058
+ - type: f1
1059
+ value: 86.55995793551286
1060
  ---
1061
+
1062
+ # 1 Open-source release list
1063
+
1064
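+ This release open-sources two general-purpose text embedding models and one embedding model for encoding dialogues, together with the full dialogue-rewrite dataset (about 1.6 million samples) and a retrieval dataset with hard negatives (about 200,000 samples).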
1065
+
1066
+ **Open-source models:**
1067
+
1068
+ | ModelName | ModelSize | MaxTokens | EmbeddingDimensions | Language | Scenario | C-MTEB Score |
1069
+ |---------------------------------------------------------------------------------------------------------------|-----------|-----------|---------------------|----------|----------|--------------|
1070
+ | [infgrad/stella-base-zh-v3-1792d](https://huggingface.co/infgrad/stella-base-zh-v3-1792d) | 0.4GB | 512 | 1792 | zh-CN | General text | 67.96 |
1071
+ | [infgrad/stella-large-zh-v3-1792d](https://huggingface.co/infgrad/stella-large-zh-v3-1792d) | 1.3GB | 512 | 1792 | zh-CN | General text | 68.48 |
1072
+ | [infgrad/stella-dialogue-large-zh-v3-1792d](https://huggingface.co/infgrad/stella-dialogue-large-zh-v3-1792d) | 1.3GB | 512 | 1792 | zh-CN | **Dialogue text** | N/A |
1073
+
1074
+ **Open-source data:**
1075
+
1076
+ 1. [Full dialogue-rewrite dataset](https://huggingface.co/datasets/infgrad/dialogue_rewrite_llm), about 1.6 million samples
1077
+ 2. [Partial retrieval dataset with hard negatives](https://huggingface.co/datasets/infgrad/retrieval_data_llm), about 200,000 samples
1078
+
1079
+ Both datasets were constructed with LLMs. Dataset contributions are welcome.
1080
+
1081
+ # 2 Usage
1082
+
1083
+ ## 2.1 Using the general-purpose embedding models
1084
+
1085
+ Load the model directly with SentenceTransformer:
1086
+
1087
+ ```python
1088
+ from sentence_transformers import SentenceTransformer
1089
+
1090
+ model = SentenceTransformer("infgrad/stella-base-zh-v3-1792d")
1091
+ # model = SentenceTransformer("infgrad/stella-large-zh-v3-1792d")
1092
+ vectors = model.encode(["text1", "text2"])
1093
+ ```
1094
+
1095
+ ## 2.2 Using the dialogue embedding model
1096
+
1097
+ **Use case:**
1098
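+ **In a conversation, you need to retrieve relevant text based on a user utterance, but user utterances in dialogue contain many coreferences and omissions, so using a general-purpose embedding model directly performs poorly.
1099
+ In that case, use this project's dedicated dialogue embedding model to encode the dialogue.**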
1100
+
1101
+ **Key points:**
1102
+
1103
+ 1. When encoding a dialogue, format each utterance as `"{ROLE}: {TEXT}"`, then join the utterances with `[SEP]`.
1104
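+ 2. Feed the whole dialogue into the model. If it exceeds the model's length limit, drop the earliest turns. **The resulting vector is essentially the embedding of a rewritten version of the last utterance in the dialogue!**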
1105
+ 3. Encode the dialogue with stella-dialogue-large-zh-v3-1792d and the texts to be retrieved with stella-large-zh-v3-1792d, so this scenario requires two embedding models.
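
The first two key points can be sketched as a small helper. This is a minimal sketch, not the author's code: `build_dialogue_input` and `max_chars` are hypothetical names, and the character budget is only a rough stand-in for the model's 512-token limit (a real implementation would count tokens with the model's tokenizer).

```python
from typing import List, Tuple


def build_dialogue_input(turns: List[Tuple[str, str]], max_chars: int = 512) -> str:
    """Format (role, text) turns as "{ROLE}: {TEXT}" joined by "[SEP]".

    max_chars is a hypothetical character budget standing in for the
    model's token limit. The earliest turns are dropped until the joined
    string fits; the last utterance is always kept.
    """
    utterances = [f"{role}: {text}" for role, text in turns]
    while len(utterances) > 1 and len("[SEP]".join(utterances)) > max_chars:
        utterances.pop(0)  # drop the earliest turn first
    return "[SEP]".join(utterances)


turns = [("A", "最近去打篮球了吗"), ("B", "没有")]
print(build_dialogue_input(turns))
```

The returned string is what would be passed to the dialogue model's `encode` call.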
1106
+
1107
+ If the usage is still unclear, see the sections below on how this model was trained.
1108
+
1109
+ Usage example:
1110
+
1111
+ ```python
1112
+ from sentence_transformers import SentenceTransformer
1113
+
1114
+ dial_model = SentenceTransformer("infgrad/stella-dialogue-large-zh-v3-1792d")
1115
+ general_model = SentenceTransformer("infgrad/stella-large-zh-v3-1792d")
1116
+ # dialogue = ["张三: 吃饭吗", "李四: 等会去"]
1117
+ dialogue = ["A: 最近去打篮球了吗", "B: 没有"]
1118
+ corpus = ["B没打篮球是因为受伤了。", "B没有打乒乓球"]
1119
+ last_utterance_vector = dial_model.encode(["[SEP]".join(dialogue)], normalize_embeddings=True)
1120
+ corpus_vectors = general_model.encode(corpus, normalize_embeddings=True)
1121
+ # Compute similarity (the dot product equals cosine similarity because both sides are normalized)
1122
+ sims = (last_utterance_vector * corpus_vectors).sum(axis=1)
1123
+ print(sims)
1124
+ ```
1125
+
1126
+ # 3 Training tricks for the general-purpose embedding models
1127
+
1128
+ ## hard negative
1129
+
1130
+ Hard negative mining is a classic trick that almost always improves results.
1131
+
1132
+ ## dropout-1d
1133
+
1134
+ Dropout is standard in deep learning; with a small modification it becomes better suited to training sentence embeddings.
1135
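+ During training, we try to make every token embedding able to represent the whole sentence, and at inference time we use mean_pooling, which achieves an effect similar to model ensembling.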
1136
+ Concretely, dropout_1d is applied during mean_pooling. The torch code is as follows:
1137
+
1138
+ ```python
1139
+ vector_dropout = nn.Dropout1d(0.3)  # compute was limited: only 0.3 and 0.5 were tried, and 0.3 worked better
1140
+ last_hidden_state = bert_model(...)[0]
1141
+ last_hidden = last_hidden_state.masked_fill(~attention_mask[..., None].bool(), 0.0)
1142
+ last_hidden = vector_dropout(last_hidden)
1143
+ vectors = last_hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]
1144
+ ```
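
To make the effect of the snippet above concrete, here is a dependency-free sketch of the same pooling for a single sentence, written on plain Python lists. It assumes, as `nn.Dropout1d` does for a `(batch, seq_len, hidden)` input, that whole token vectors are zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`. The function name `dropout_mean_pool` and its arguments are hypothetical.

```python
import random
from typing import List, Optional


def dropout_mean_pool(
    token_vectors: List[List[float]],
    attention_mask: List[int],
    p: float = 0.3,
    training: bool = True,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Masked mean pooling with token-level dropout for one sentence."""
    rng = rng or random.Random()
    dim = len(token_vectors[0])
    pooled = [0.0] * dim
    for vec, keep in zip(token_vectors, attention_mask):
        if not keep:
            continue  # padding token: contributes nothing
        scale = 1.0
        if training:
            if rng.random() < p:
                continue  # drop the whole token vector
            scale = 1.0 / (1.0 - p)  # rescale survivors, as dropout does
        for i, x in enumerate(vec):
            pooled[i] += scale * x
    n_tokens = sum(attention_mask)  # divide by all real tokens, dropped or not
    return [x / n_tokens for x in pooled]


tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]  # the last position is padding
print(dropout_mean_pool(tokens, mask, training=False))
```

With `training=False` this is ordinary masked mean pooling. With `training=True`, each call pools over a random subset of tokens, which is what pushes individual token embeddings to carry sentence-level information.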
1145
+
1146
+ # 4 Dialogue embedding model details
1147
+
1148
+ ## 4.1 Why is a dialogue embedding model needed?
1149
+
1150
+ See my earlier post: https://www.zhihu.com/pin/1674913544847077376
1151
+
1152
+ ## 4.2 Training data
1153
+
1154
+ Example of a single record:
1155
+
1156
+ ```json
1157
+ {
1158
+ "dialogue": [
1159
+ "A: 最近去打篮球了吗",
1160
+ "B: 没有"
1161
+ ],
1162
+ "last_utterance_rewrite": "B: 我最近没有去打篮球"
1163
+ }
1164
+ ```
1165
+
1166
+ ## 4.3 Training loss
1167
+
1168
+ ```
1169
+ loss = cosine_loss( dial_model.encode(dialogue), existing_model.encode(last_utterance_rewrite) )
1170
+ ```
1171
+
1172
+ dial_model is the model being trained; I continued training from stella-large-zh-v3-1792d as the base model.
1173
+
1174
+ existing_model is an already-trained **general-purpose embedding model**; I used stella-large-zh-v3-1792d.
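
The loss line above is pseudocode. One common reading of `cosine_loss` is `1 - cosine_similarity`; the README does not state the exact form, so that definition is an assumption here, and the two vectors below are made-up stand-ins for the models' outputs.

```python
import math
from typing import Sequence


def cosine_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """1 - cosine similarity: 0 when the vectors point the same way."""
    dot = sum(a * b for a, b in zip(pred, target))
    norm_pred = math.sqrt(sum(a * a for a in pred))
    norm_target = math.sqrt(sum(b * b for b in target))
    return 1.0 - dot / (norm_pred * norm_target)


# Stand-ins for dial_model.encode(dialogue) and
# existing_model.encode(last_utterance_rewrite).
dialogue_vec = [0.6, 0.8]
rewrite_vec = [0.8, 0.6]
print(round(cosine_loss(dialogue_vec, rewrite_vec), 4))
```

Since only dial_model is described as being trained, the vector from existing_model would act as a fixed target that the dialogue vector is pulled toward.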
1175
+
1176
+ The full dialogue-embedding training data has been open-sourced, so in principle this model's results can be reproduced.
1177
+
1178
+ Loss curve:
1179
+
1180
+ <div align="center">
1181
+ <img src="dial_loss.png" alt="icon" width="2000px"/>
1182
+ </div>
1183
+
1184
+ ## 4.4 Results
1185
+
1186
+ There is no dedicated test set yet. In my own informal tests the model is effective; some test results are in the file `dial_retrieval_test.xlsx`.
1187
+
1188
+ # 5 TODO
1189
+
1190
+ 1. More dial-rewrite data
1191
+ 2. Embedding models with different EmbeddingDimensions
1192
+
1193
+ # 6 FAQ
1194
+
1195
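+ Q: Why is the embedding dimension 1792?\
1196
+ A: I originally considered releasing 768, 1024, 768+768, 1024+1024, and 1024+768 dimensions, but time was limited; 1792 was finished first, so only the 1792-dimensional models are released. In theory, higher dimensions give better results.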
1197
+
1198
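+ Q: How can the CMTEB results be reproduced?\
1199
+ A: Load the model with SentenceTransformer and run the official evaluation script directly. Note that for Classification tasks the vectors must be normalized first.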
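
Normalizing here presumably means L2 normalization, that is, scaling each vector to unit length. A dependency-free sketch, with the hypothetical helper name `l2_normalize`:

```python
import math
from typing import List


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit L2 norm; all-zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


print(l2_normalize([3.0, 4.0]))
```

With sentence-transformers, passing `normalize_embeddings=True` to `encode`, as the dialogue example in section 2.2 does, has the same effect.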
1200
+
1201
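+ Q: My reproduced CMTEB results differ from the ones reported here?\
1202
+ A: Differences on the clustering tasks are expected because the official evaluation code does not set a seed. For other discrepancies, check your code or contact me.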
1203
+
1204
+ Q: How should I choose an embedding model?\
1205
+ A: There is no free lunch: try the candidates on your own test set. I recommend bge, e5, and stella.
1206
+
1207
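+ Q: Why is the maximum length only 512? Can it be longer?\
1208
+ A: It can, but it is not worthwhile: longer inputs generally perform worse. This comes from the current training methods and data and is very hard to fix, so chunking is still recommended for long texts.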
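
One simple way to chunk long text is a sliding window with overlap. The sketch below is an illustration under stated assumptions, not the author's pipeline: `chunk_text`, `chunk_size`, and `overlap` are hypothetical, and lengths are counted in characters as a rough proxy for the 512-token limit.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 400, overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

Each chunk can then be encoded separately with the general-purpose model and indexed as its own retrieval unit.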
1209
+
1210
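+ Q: What training resources and compute were needed?\
1211
+ A: With data on the order of hundreds of millions of samples, a single A100 needs at least one month.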