barisaydin committed on
Commit 6999a59
1 Parent(s): 485ec9c

Upload folder using huggingface_hub

.DS_Store ADDED
Binary file (6.15 kB).
 
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
+ unigram.json filter=lfs diff=lfs merge=lfs -text
+ all.jsonl filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
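Note on the pooling config above: with only `pooling_mode_mean_tokens` set to true, the sentence vector is presumably the average of the token embeddings, ignoring padding positions. The sketch below illustrates that idea in plain Python; the function name `mean_pool` and the list-based inputs are assumptions made for illustration, not code from this repository or from any library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of non-padding tokens (hypothetical helper).

    token_embeddings: one vector per token (384 values each for this config).
    attention_mask:   1 for real tokens, 0 for padding.
    """
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("token_embeddings and attention_mask must have the same length")

    dim = len(token_embeddings[0]) if token_embeddings else 0
    total = [0.0] * dim
    count = 0

    # Sum only the vectors whose mask entry is non-zero.
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value

    # With no real tokens, return the zero vector instead of dividing by zero.
    if count == 0:
        return total
    return [t / count for t in total]


# Two real tokens and one padding token, using 2-dimensional toy vectors.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]], [1, 1, 0])
print(pooled)
```

The padding vector is excluded from both the sum and the divisor, which is why the mask is needed in addition to the embeddings.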
README.md ADDED
@@ -0,0 +1,3085 @@
+ ---
+ pipeline_tag: sentence-similarity
+ license: apache-2.0
+ tags:
+ - text2vec
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+ - mteb
+ datasets:
+ - >-
+   https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/text2vec-base-multilingual-dataset
+ language:
+ - zh
+ - en
+ - de
+ - fr
+ - it
+ - nl
+ - pt
+ - pl
+ - ru
+ metrics:
+ - spearmanr
+ library_name: transformers
+ model-index:
+ - name: text2vec-base-multilingual
+   results:
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_counterfactual
+       name: MTEB AmazonCounterfactualClassification (en)
+       config: en
+       split: test
+       revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+     metrics:
+     - type: accuracy
+       value: 70.97014925373134
+     - type: ap
+       value: 33.95151328318672
+     - type: f1
+       value: 65.14740155705596
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_counterfactual
+       name: MTEB AmazonCounterfactualClassification (de)
+       config: de
+       split: test
+       revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+     metrics:
+     - type: accuracy
+       value: 68.69379014989293
+     - type: ap
+       value: 79.68277579733802
+     - type: f1
+       value: 66.54960052336921
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_counterfactual
+       name: MTEB AmazonCounterfactualClassification (en-ext)
+       config: en-ext
+       split: test
+       revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+     metrics:
+     - type: accuracy
+       value: 70.90704647676162
+     - type: ap
+       value: 20.747518928580437
+     - type: f1
+       value: 58.64365465884924
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_counterfactual
+       name: MTEB AmazonCounterfactualClassification (ja)
+       config: ja
+       split: test
+       revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+     metrics:
+     - type: accuracy
+       value: 61.605995717344754
+     - type: ap
+       value: 14.135974879487028
+     - type: f1
+       value: 49.980224800472136
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_polarity
+       name: MTEB AmazonPolarityClassification
+       config: default
+       split: test
+       revision: e2d317d38cd51312af73b3d32a06d1a08b442046
+     metrics:
+     - type: accuracy
+       value: 66.103375
+     - type: ap
+       value: 61.10087197664471
+     - type: f1
+       value: 65.75198509894145
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (en)
+       config: en
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 33.134
+     - type: f1
+       value: 32.7905397597083
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (de)
+       config: de
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 33.388
+     - type: f1
+       value: 33.190561196873084
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (es)
+       config: es
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 34.824
+     - type: f1
+       value: 34.297290157740726
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (fr)
+       config: fr
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 33.449999999999996
+     - type: f1
+       value: 33.08017234412433
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (ja)
+       config: ja
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 30.046
+     - type: f1
+       value: 29.857141661482228
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (zh)
+       config: zh
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 32.522
+     - type: f1
+       value: 31.854699911472174
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/arxiv-clustering-p2p
+       name: MTEB ArxivClusteringP2P
+       config: default
+       split: test
+       revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
+     metrics:
+     - type: v_measure
+       value: 32.31918856561886
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/arxiv-clustering-s2s
+       name: MTEB ArxivClusteringS2S
+       config: default
+       split: test
+       revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
+     metrics:
+     - type: v_measure
+       value: 25.503481615956137
+   - task:
+       type: Reranking
+     dataset:
+       type: mteb/askubuntudupquestions-reranking
+       name: MTEB AskUbuntuDupQuestions
+       config: default
+       split: test
+       revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
+     metrics:
+     - type: map
+       value: 57.91471462820568
+     - type: mrr
+       value: 71.82990370663501
+   - task:
+       type: STS
+     dataset:
+       type: mteb/biosses-sts
+       name: MTEB BIOSSES
+       config: default
+       split: test
+       revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
+     metrics:
+     - type: cos_sim_pearson
+       value: 68.83853315193127
+     - type: cos_sim_spearman
+       value: 66.16174850417771
+     - type: euclidean_pearson
+       value: 56.65313897263153
+     - type: euclidean_spearman
+       value: 52.69156205876939
+     - type: manhattan_pearson
+       value: 56.97282154658304
+     - type: manhattan_spearman
+       value: 53.167476517261015
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/banking77
+       name: MTEB Banking77Classification
+       config: default
+       split: test
+       revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
+     metrics:
+     - type: accuracy
+       value: 78.08441558441558
+     - type: f1
+       value: 77.99825264827898
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/biorxiv-clustering-p2p
+       name: MTEB BiorxivClusteringP2P
+       config: default
+       split: test
+       revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
+     metrics:
+     - type: v_measure
+       value: 28.98583420521256
+   - task:
+       type: Clustering
+     dataset:
+       type: mteb/biorxiv-clustering-s2s
+       name: MTEB BiorxivClusteringS2S
+       config: default
+       split: test
+       revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
+     metrics:
+     - type: v_measure
+       value: 23.195091778460892
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/emotion
+       name: MTEB EmotionClassification
+       config: default
+       split: test
+       revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
+     metrics:
+     - type: accuracy
+       value: 43.35
+     - type: f1
+       value: 38.80269436557695
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/imdb
+       name: MTEB ImdbClassification
+       config: default
+       split: test
+       revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
+     metrics:
+     - type: accuracy
+       value: 59.348
+     - type: ap
+       value: 55.75065220262251
+     - type: f1
+       value: 58.72117519082607
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_domain
+       name: MTEB MTOPDomainClassification (en)
+       config: en
+       split: test
+       revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+     metrics:
+     - type: accuracy
+       value: 81.04879160966712
+     - type: f1
+       value: 80.86889779192701
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_domain
+       name: MTEB MTOPDomainClassification (de)
+       config: de
+       split: test
+       revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+     metrics:
+     - type: accuracy
+       value: 78.59397013243168
+     - type: f1
+       value: 77.09902761555972
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_domain
+       name: MTEB MTOPDomainClassification (es)
+       config: es
+       split: test
+       revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+     metrics:
+     - type: accuracy
+       value: 79.24282855236824
+     - type: f1
+       value: 78.75883867079015
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_domain
+       name: MTEB MTOPDomainClassification (fr)
+       config: fr
+       split: test
+       revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+     metrics:
+     - type: accuracy
+       value: 76.16661446915127
+     - type: f1
+       value: 76.30204722831901
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_domain
+       name: MTEB MTOPDomainClassification (hi)
+       config: hi
+       split: test
+       revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+     metrics:
+     - type: accuracy
+       value: 78.74506991753317
+     - type: f1
+       value: 77.50560442779701
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_domain
+       name: MTEB MTOPDomainClassification (th)
+       config: th
+       split: test
+       revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
+     metrics:
+     - type: accuracy
+       value: 77.67088607594937
+     - type: f1
+       value: 77.21442956887493
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_intent
+       name: MTEB MTOPIntentClassification (en)
+       config: en
+       split: test
+       revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+     metrics:
+     - type: accuracy
+       value: 62.786137710898316
+     - type: f1
+       value: 46.23474201126368
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_intent
+       name: MTEB MTOPIntentClassification (de)
+       config: de
+       split: test
+       revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+     metrics:
+     - type: accuracy
+       value: 55.285996055226825
+     - type: f1
+       value: 37.98039513682919
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_intent
+       name: MTEB MTOPIntentClassification (es)
+       config: es
+       split: test
+       revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+     metrics:
+     - type: accuracy
+       value: 58.67911941294196
+     - type: f1
+       value: 40.541410807124954
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_intent
+       name: MTEB MTOPIntentClassification (fr)
+       config: fr
+       split: test
+       revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+     metrics:
+     - type: accuracy
+       value: 53.257124960851854
+     - type: f1
+       value: 38.42982319259366
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_intent
+       name: MTEB MTOPIntentClassification (hi)
+       config: hi
+       split: test
+       revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+     metrics:
+     - type: accuracy
+       value: 59.62352097525995
+     - type: f1
+       value: 41.28886486568534
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/mtop_intent
+       name: MTEB MTOPIntentClassification (th)
+       config: th
+       split: test
+       revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
+     metrics:
+     - type: accuracy
+       value: 58.799276672694404
+     - type: f1
+       value: 43.68379466247341
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (af)
+       config: af
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 45.42030934767989
+     - type: f1
+       value: 44.12201543566376
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (am)
+       config: am
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 37.67652992602556
+     - type: f1
+       value: 35.422091900843164
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ar)
+       config: ar
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 45.02353732347007
+     - type: f1
+       value: 41.852484084738194
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (az)
+       config: az
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 48.70880968392737
+     - type: f1
+       value: 46.904360615435046
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (bn)
+       config: bn
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 43.78950907868191
+     - type: f1
+       value: 41.58872353920405
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (cy)
+       config: cy
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 28.759246805648957
+     - type: f1
+       value: 27.41182001374226
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (da)
+       config: da
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 56.74176193678547
+     - type: f1
+       value: 53.82727354182497
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (de)
+       config: de
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 51.55682582380632
+     - type: f1
+       value: 49.41963627941866
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (el)
+       config: el
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 56.46940147948891
+     - type: f1
+       value: 55.28178711367465
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (en)
+       config: en
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 63.83322125084063
+     - type: f1
+       value: 61.836172900845554
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (es)
+       config: es
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.27505043712172
+     - type: f1
+       value: 57.642436374361154
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (fa)
+       config: fa
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 59.05178211163417
+     - type: f1
+       value: 56.858998820504056
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (fi)
+       config: fi
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 57.357094821788834
+     - type: f1
+       value: 54.79711189260453
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (fr)
+       config: fr
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.79959650302623
+     - type: f1
+       value: 57.59158671719513
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (he)
+       config: he
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 51.1768661735037
+     - type: f1
+       value: 48.886397276270515
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (hi)
+       config: hi
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 57.06455951580362
+     - type: f1
+       value: 55.01530952684585
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (hu)
+       config: hu
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.3591123066577
+     - type: f1
+       value: 55.9277783370191
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (hy)
+       config: hy
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 52.108271687962336
+     - type: f1
+       value: 51.195023400664596
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (id)
+       config: id
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.26832548755883
+     - type: f1
+       value: 56.60774065423401
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (is)
+       config: is
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 35.806993947545394
+     - type: f1
+       value: 34.290418953173294
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (it)
+       config: it
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.27841291190315
+     - type: f1
+       value: 56.9438998642419
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ja)
+       config: ja
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 60.78009414929389
+     - type: f1
+       value: 59.15780842483667
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (jv)
+       config: jv
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 31.153328850033624
+     - type: f1
+       value: 30.11004596099605
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ka)
+       config: ka
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 44.50235373234701
+     - type: f1
+       value: 44.040585262624745
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (km)
+       config: km
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 40.99193006052455
+     - type: f1
+       value: 39.505480119272484
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (kn)
+       config: kn
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 46.95696032279758
+     - type: f1
+       value: 43.093638940785326
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ko)
+       config: ko
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 54.73100201748486
+     - type: f1
+       value: 52.79750744404114
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (lv)
+       config: lv
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 54.865501008742434
+     - type: f1
+       value: 53.64798408964839
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ml)
+       config: ml
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 47.891728312037664
+     - type: f1
+       value: 45.261229414636055
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (mn)
+       config: mn
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 52.2259583053127
+     - type: f1
+       value: 50.5903419246987
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ms)
+       config: ms
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 54.277067921990586
+     - type: f1
+       value: 52.472042479965886
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (my)
+       config: my
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 51.95696032279757
+     - type: f1
+       value: 49.79330411854258
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (nb)
+       config: nb
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 54.63685272360457
+     - type: f1
+       value: 52.81267480650003
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (nl)
+       config: nl
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 59.451916610625425
+     - type: f1
+       value: 57.34790386645091
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (pl)
+       config: pl
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.91055817081372
+     - type: f1
+       value: 56.39195048528157
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (pt)
+       config: pt
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 59.84196368527236
+     - type: f1
+       value: 58.72244763127063
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ro)
+       config: ro
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 57.04102219233354
+     - type: f1
+       value: 55.67040186148946
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ru)
+       config: ru
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.01613987895091
+     - type: f1
+       value: 57.203949825484855
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (sl)
+       config: sl
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 56.35843981170141
+     - type: f1
+       value: 54.18656338999773
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (sq)
+       config: sq
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 56.47948890383322
+     - type: f1
+       value: 54.772224557130954
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (sv)
+       config: sv
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 58.43981170141224
+     - type: f1
+       value: 56.09260971364242
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (sw)
+       config: sw
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 33.9609952925353
+     - type: f1
+       value: 33.18853392353405
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (ta)
1008
+ config: ta
1009
+ split: test
1010
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1011
+ metrics:
1012
+ - type: accuracy
1013
+ value: 44.29388029589778
1014
+ - type: f1
1015
+ value: 41.51986533284474
1016
+ - task:
1017
+ type: Classification
1018
+ dataset:
1019
+ type: mteb/amazon_massive_intent
1020
+ name: MTEB MassiveIntentClassification (te)
1021
+ config: te
1022
+ split: test
1023
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1024
+ metrics:
1025
+ - type: accuracy
1026
+ value: 47.13517148621385
1027
+ - type: f1
1028
+ value: 43.94784138379624
1029
+ - task:
1030
+ type: Classification
1031
+ dataset:
1032
+ type: mteb/amazon_massive_intent
1033
+ name: MTEB MassiveIntentClassification (th)
1034
+ config: th
1035
+ split: test
1036
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1037
+ metrics:
1038
+ - type: accuracy
1039
+ value: 56.856086079354405
1040
+ - type: f1
1041
+ value: 56.618177384748456
1042
+ - task:
1043
+ type: Classification
1044
+ dataset:
1045
+ type: mteb/amazon_massive_intent
1046
+ name: MTEB MassiveIntentClassification (tl)
1047
+ config: tl
1048
+ split: test
1049
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1050
+ metrics:
1051
+ - type: accuracy
1052
+ value: 35.35978480161398
1053
+ - type: f1
1054
+ value: 34.060680080365046
1055
+ - task:
1056
+ type: Classification
1057
+ dataset:
1058
+ type: mteb/amazon_massive_intent
1059
+ name: MTEB MassiveIntentClassification (tr)
1060
+ config: tr
1061
+ split: test
1062
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1063
+ metrics:
1064
+ - type: accuracy
1065
+ value: 59.630127774041696
1066
+ - type: f1
1067
+ value: 57.46288652988266
1068
+ - task:
1069
+ type: Classification
1070
+ dataset:
1071
+ type: mteb/amazon_massive_intent
1072
+ name: MTEB MassiveIntentClassification (ur)
1073
+ config: ur
1074
+ split: test
1075
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1076
+ metrics:
1077
+ - type: accuracy
1078
+ value: 52.7908540685945
1079
+ - type: f1
1080
+ value: 51.46934239116157
1081
+ - task:
1082
+ type: Classification
1083
+ dataset:
1084
+ type: mteb/amazon_massive_intent
1085
+ name: MTEB MassiveIntentClassification (vi)
1086
+ config: vi
1087
+ split: test
1088
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1089
+ metrics:
1090
+ - type: accuracy
1091
+ value: 54.6469401479489
1092
+ - type: f1
1093
+ value: 53.9903066185816
1094
+ - task:
1095
+ type: Classification
1096
+ dataset:
1097
+ type: mteb/amazon_massive_intent
1098
+ name: MTEB MassiveIntentClassification (zh-CN)
1099
+ config: zh-CN
1100
+ split: test
1101
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1102
+ metrics:
1103
+ - type: accuracy
1104
+ value: 60.85743106926698
1105
+ - type: f1
1106
+ value: 59.31579548450755
1107
+ - task:
1108
+ type: Classification
1109
+ dataset:
1110
+ type: mteb/amazon_massive_intent
1111
+ name: MTEB MassiveIntentClassification (zh-TW)
1112
+ config: zh-TW
1113
+ split: test
1114
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1115
+ metrics:
1116
+ - type: accuracy
1117
+ value: 57.46805648957633
1118
+ - type: f1
1119
+ value: 57.48469733657326
1120
+ - task:
1121
+ type: Classification
1122
+ dataset:
1123
+ type: mteb/amazon_massive_scenario
1124
+ name: MTEB MassiveScenarioClassification (af)
1125
+ config: af
1126
+ split: test
1127
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1128
+ metrics:
1129
+ - type: accuracy
1130
+ value: 50.86415601882985
1131
+ - type: f1
1132
+ value: 49.41696672602645
1133
+ - task:
1134
+ type: Classification
1135
+ dataset:
1136
+ type: mteb/amazon_massive_scenario
1137
+ name: MTEB MassiveScenarioClassification (am)
1138
+ config: am
1139
+ split: test
1140
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1141
+ metrics:
1142
+ - type: accuracy
1143
+ value: 41.183591123066584
1144
+ - type: f1
1145
+ value: 40.04563865770774
1146
+ - task:
1147
+ type: Classification
1148
+ dataset:
1149
+ type: mteb/amazon_massive_scenario
1150
+ name: MTEB MassiveScenarioClassification (ar)
1151
+ config: ar
1152
+ split: test
1153
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1154
+ metrics:
1155
+ - type: accuracy
1156
+ value: 50.08069939475455
1157
+ - type: f1
1158
+ value: 50.724800165846126
1159
+ - task:
1160
+ type: Classification
1161
+ dataset:
1162
+ type: mteb/amazon_massive_scenario
1163
+ name: MTEB MassiveScenarioClassification (az)
1164
+ config: az
1165
+ split: test
1166
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1167
+ metrics:
1168
+ - type: accuracy
1169
+ value: 51.287827841291204
1170
+ - type: f1
1171
+ value: 50.72873776739851
1172
+ - task:
1173
+ type: Classification
1174
+ dataset:
1175
+ type: mteb/amazon_massive_scenario
1176
+ name: MTEB MassiveScenarioClassification (bn)
1177
+ config: bn
1178
+ split: test
1179
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1180
+ metrics:
1181
+ - type: accuracy
1182
+ value: 46.53328850033624
1183
+ - type: f1
1184
+ value: 45.93317866639667
1185
+ - task:
1186
+ type: Classification
1187
+ dataset:
1188
+ type: mteb/amazon_massive_scenario
1189
+ name: MTEB MassiveScenarioClassification (cy)
1190
+ config: cy
1191
+ split: test
1192
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1193
+ metrics:
1194
+ - type: accuracy
1195
+ value: 34.347679892400805
1196
+ - type: f1
1197
+ value: 31.941581141280828
1198
+ - task:
1199
+ type: Classification
1200
+ dataset:
1201
+ type: mteb/amazon_massive_scenario
1202
+ name: MTEB MassiveScenarioClassification (da)
1203
+ config: da
1204
+ split: test
1205
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1206
+ metrics:
1207
+ - type: accuracy
1208
+ value: 63.073301950235376
1209
+ - type: f1
1210
+ value: 62.228728940111054
1211
+ - task:
1212
+ type: Classification
1213
+ dataset:
1214
+ type: mteb/amazon_massive_scenario
1215
+ name: MTEB MassiveScenarioClassification (de)
1216
+ config: de
1217
+ split: test
1218
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1219
+ metrics:
1220
+ - type: accuracy
1221
+ value: 56.398789509078675
1222
+ - type: f1
1223
+ value: 54.80778341609032
1224
+ - task:
1225
+ type: Classification
1226
+ dataset:
1227
+ type: mteb/amazon_massive_scenario
1228
+ name: MTEB MassiveScenarioClassification (el)
1229
+ config: el
1230
+ split: test
1231
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1232
+ metrics:
1233
+ - type: accuracy
1234
+ value: 61.79892400806993
1235
+ - type: f1
1236
+ value: 60.69430756982446
1237
+ - task:
1238
+ type: Classification
1239
+ dataset:
1240
+ type: mteb/amazon_massive_scenario
1241
+ name: MTEB MassiveScenarioClassification (en)
1242
+ config: en
1243
+ split: test
1244
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1245
+ metrics:
1246
+ - type: accuracy
1247
+ value: 66.96368527236046
1248
+ - type: f1
1249
+ value: 66.5893927997656
1250
+ - task:
1251
+ type: Classification
1252
+ dataset:
1253
+ type: mteb/amazon_massive_scenario
1254
+ name: MTEB MassiveScenarioClassification (es)
1255
+ config: es
1256
+ split: test
1257
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1258
+ metrics:
1259
+ - type: accuracy
1260
+ value: 62.21250840618695
1261
+ - type: f1
1262
+ value: 62.347177794128925
1263
+ - task:
1264
+ type: Classification
1265
+ dataset:
1266
+ type: mteb/amazon_massive_scenario
1267
+ name: MTEB MassiveScenarioClassification (fa)
1268
+ config: fa
1269
+ split: test
1270
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1271
+ metrics:
1272
+ - type: accuracy
1273
+ value: 62.43779421654339
1274
+ - type: f1
1275
+ value: 61.307701312085605
1276
+ - task:
1277
+ type: Classification
1278
+ dataset:
1279
+ type: mteb/amazon_massive_scenario
1280
+ name: MTEB MassiveScenarioClassification (fi)
1281
+ config: fi
1282
+ split: test
1283
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1284
+ metrics:
1285
+ - type: accuracy
1286
+ value: 61.09952925353059
1287
+ - type: f1
1288
+ value: 60.313907927386914
1289
+ - task:
1290
+ type: Classification
1291
+ dataset:
1292
+ type: mteb/amazon_massive_scenario
1293
+ name: MTEB MassiveScenarioClassification (fr)
1294
+ config: fr
1295
+ split: test
1296
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1297
+ metrics:
1298
+ - type: accuracy
1299
+ value: 63.38601210490922
1300
+ - type: f1
1301
+ value: 63.05968938353488
1302
+ - task:
1303
+ type: Classification
1304
+ dataset:
1305
+ type: mteb/amazon_massive_scenario
1306
+ name: MTEB MassiveScenarioClassification (he)
1307
+ config: he
1308
+ split: test
1309
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1310
+ metrics:
1311
+ - type: accuracy
1312
+ value: 56.2878278412912
1313
+ - type: f1
1314
+ value: 55.92927644838597
1315
+ - task:
1316
+ type: Classification
1317
+ dataset:
1318
+ type: mteb/amazon_massive_scenario
1319
+ name: MTEB MassiveScenarioClassification (hi)
1320
+ config: hi
1321
+ split: test
1322
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1323
+ metrics:
1324
+ - type: accuracy
1325
+ value: 60.62878278412912
1326
+ - type: f1
1327
+ value: 60.25299253652635
1328
+ - task:
1329
+ type: Classification
1330
+ dataset:
1331
+ type: mteb/amazon_massive_scenario
1332
+ name: MTEB MassiveScenarioClassification (hu)
1333
+ config: hu
1334
+ split: test
1335
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1336
+ metrics:
1337
+ - type: accuracy
1338
+ value: 63.28850033624748
1339
+ - type: f1
1340
+ value: 62.77053246337031
1341
+ - task:
1342
+ type: Classification
1343
+ dataset:
1344
+ type: mteb/amazon_massive_scenario
1345
+ name: MTEB MassiveScenarioClassification (hy)
1346
+ config: hy
1347
+ split: test
1348
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1349
+ metrics:
1350
+ - type: accuracy
1351
+ value: 54.875588433086754
1352
+ - type: f1
1353
+ value: 54.30717357279134
1354
+ - task:
1355
+ type: Classification
1356
+ dataset:
1357
+ type: mteb/amazon_massive_scenario
1358
+ name: MTEB MassiveScenarioClassification (id)
1359
+ config: id
1360
+ split: test
1361
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1362
+ metrics:
1363
+ - type: accuracy
1364
+ value: 61.99394754539341
1365
+ - type: f1
1366
+ value: 61.73085530883037
1367
+ - task:
1368
+ type: Classification
1369
+ dataset:
1370
+ type: mteb/amazon_massive_scenario
1371
+ name: MTEB MassiveScenarioClassification (is)
1372
+ config: is
1373
+ split: test
1374
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1375
+ metrics:
1376
+ - type: accuracy
1377
+ value: 38.581035642232685
1378
+ - type: f1
1379
+ value: 36.96287269695893
1380
+ - task:
1381
+ type: Classification
1382
+ dataset:
1383
+ type: mteb/amazon_massive_scenario
1384
+ name: MTEB MassiveScenarioClassification (it)
1385
+ config: it
1386
+ split: test
1387
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1388
+ metrics:
1389
+ - type: accuracy
1390
+ value: 62.350369872225976
1391
+ - type: f1
1392
+ value: 61.807327324823966
1393
+ - task:
1394
+ type: Classification
1395
+ dataset:
1396
+ type: mteb/amazon_massive_scenario
1397
+ name: MTEB MassiveScenarioClassification (ja)
1398
+ config: ja
1399
+ split: test
1400
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1401
+ metrics:
1402
+ - type: accuracy
1403
+ value: 65.17148621385338
1404
+ - type: f1
1405
+ value: 65.29620144656751
1406
+ - task:
1407
+ type: Classification
1408
+ dataset:
1409
+ type: mteb/amazon_massive_scenario
1410
+ name: MTEB MassiveScenarioClassification (jv)
1411
+ config: jv
1412
+ split: test
1413
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1414
+ metrics:
1415
+ - type: accuracy
1416
+ value: 36.12642905178212
1417
+ - type: f1
1418
+ value: 35.334393048479484
1419
+ - task:
1420
+ type: Classification
1421
+ dataset:
1422
+ type: mteb/amazon_massive_scenario
1423
+ name: MTEB MassiveScenarioClassification (ka)
1424
+ config: ka
1425
+ split: test
1426
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1427
+ metrics:
1428
+ - type: accuracy
1429
+ value: 50.26899798251513
1430
+ - type: f1
1431
+ value: 49.041065960139434
1432
+ - task:
1433
+ type: Classification
1434
+ dataset:
1435
+ type: mteb/amazon_massive_scenario
1436
+ name: MTEB MassiveScenarioClassification (km)
1437
+ config: km
1438
+ split: test
1439
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1440
+ metrics:
1441
+ - type: accuracy
1442
+ value: 44.24344317417619
1443
+ - type: f1
1444
+ value: 42.42177854872125
1445
+ - task:
1446
+ type: Classification
1447
+ dataset:
1448
+ type: mteb/amazon_massive_scenario
1449
+ name: MTEB MassiveScenarioClassification (kn)
1450
+ config: kn
1451
+ split: test
1452
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1453
+ metrics:
1454
+ - type: accuracy
1455
+ value: 47.370544720914594
1456
+ - type: f1
1457
+ value: 46.589722581465324
1458
+ - task:
1459
+ type: Classification
1460
+ dataset:
1461
+ type: mteb/amazon_massive_scenario
1462
+ name: MTEB MassiveScenarioClassification (ko)
1463
+ config: ko
1464
+ split: test
1465
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1466
+ metrics:
1467
+ - type: accuracy
1468
+ value: 58.89038332212508
1469
+ - type: f1
1470
+ value: 57.753607921990394
1471
+ - task:
1472
+ type: Classification
1473
+ dataset:
1474
+ type: mteb/amazon_massive_scenario
1475
+ name: MTEB MassiveScenarioClassification (lv)
1476
+ config: lv
1477
+ split: test
1478
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1479
+ metrics:
1480
+ - type: accuracy
1481
+ value: 56.506388702084756
1482
+ - type: f1
1483
+ value: 56.0485860423295
1484
+ - task:
1485
+ type: Classification
1486
+ dataset:
1487
+ type: mteb/amazon_massive_scenario
1488
+ name: MTEB MassiveScenarioClassification (ml)
1489
+ config: ml
1490
+ split: test
1491
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1492
+ metrics:
1493
+ - type: accuracy
1494
+ value: 50.06388702084734
1495
+ - type: f1
1496
+ value: 50.109364641824584
1497
+ - task:
1498
+ type: Classification
1499
+ dataset:
1500
+ type: mteb/amazon_massive_scenario
1501
+ name: MTEB MassiveScenarioClassification (mn)
1502
+ config: mn
1503
+ split: test
1504
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1505
+ metrics:
1506
+ - type: accuracy
1507
+ value: 55.053799596503026
1508
+ - type: f1
1509
+ value: 54.490665705666686
1510
+ - task:
1511
+ type: Classification
1512
+ dataset:
1513
+ type: mteb/amazon_massive_scenario
1514
+ name: MTEB MassiveScenarioClassification (ms)
1515
+ config: ms
1516
+ split: test
1517
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1518
+ metrics:
1519
+ - type: accuracy
1520
+ value: 59.77135171486213
1521
+ - type: f1
1522
+ value: 58.2808650158803
1523
+ - task:
1524
+ type: Classification
1525
+ dataset:
1526
+ type: mteb/amazon_massive_scenario
1527
+ name: MTEB MassiveScenarioClassification (my)
1528
+ config: my
1529
+ split: test
1530
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1531
+ metrics:
1532
+ - type: accuracy
1533
+ value: 55.71620712844654
1534
+ - type: f1
1535
+ value: 53.863034882475304
1536
+ - task:
1537
+ type: Classification
1538
+ dataset:
1539
+ type: mteb/amazon_massive_scenario
1540
+ name: MTEB MassiveScenarioClassification (nb)
1541
+ config: nb
1542
+ split: test
1543
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1544
+ metrics:
1545
+ - type: accuracy
1546
+ value: 60.26227303295225
1547
+ - type: f1
1548
+ value: 59.86604657147016
1549
+ - task:
1550
+ type: Classification
1551
+ dataset:
1552
+ type: mteb/amazon_massive_scenario
1553
+ name: MTEB MassiveScenarioClassification (nl)
1554
+ config: nl
1555
+ split: test
1556
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1557
+ metrics:
1558
+ - type: accuracy
1559
+ value: 63.3759246805649
1560
+ - type: f1
1561
+ value: 62.45257339288533
1562
+ - task:
1563
+ type: Classification
1564
+ dataset:
1565
+ type: mteb/amazon_massive_scenario
1566
+ name: MTEB MassiveScenarioClassification (pl)
1567
+ config: pl
1568
+ split: test
1569
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1570
+ metrics:
1571
+ - type: accuracy
1572
+ value: 62.552118359112306
1573
+ - type: f1
1574
+ value: 61.354449605776765
1575
+ - task:
1576
+ type: Classification
1577
+ dataset:
1578
+ type: mteb/amazon_massive_scenario
1579
+ name: MTEB MassiveScenarioClassification (pt)
1580
+ config: pt
1581
+ split: test
1582
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1583
+ metrics:
1584
+ - type: accuracy
1585
+ value: 62.40753194351043
1586
+ - type: f1
1587
+ value: 61.98779889528889
1588
+ - task:
1589
+ type: Classification
1590
+ dataset:
1591
+ type: mteb/amazon_massive_scenario
1592
+ name: MTEB MassiveScenarioClassification (ro)
1593
+ config: ro
1594
+ split: test
1595
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1596
+ metrics:
1597
+ - type: accuracy
1598
+ value: 60.68258238063214
1599
+ - type: f1
1600
+ value: 60.59973978976571
1601
+ - task:
1602
+ type: Classification
1603
+ dataset:
1604
+ type: mteb/amazon_massive_scenario
1605
+ name: MTEB MassiveScenarioClassification (ru)
1606
+ config: ru
1607
+ split: test
1608
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1609
+ metrics:
1610
+ - type: accuracy
1611
+ value: 62.31002017484868
1612
+ - type: f1
1613
+ value: 62.412312268503655
1614
+ - task:
1615
+ type: Classification
1616
+ dataset:
1617
+ type: mteb/amazon_massive_scenario
1618
+ name: MTEB MassiveScenarioClassification (sl)
1619
+ config: sl
1620
+ split: test
1621
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1622
+ metrics:
1623
+ - type: accuracy
1624
+ value: 61.429051782111635
1625
+ - type: f1
1626
+ value: 61.60095590401424
1627
+ - task:
1628
+ type: Classification
1629
+ dataset:
1630
+ type: mteb/amazon_massive_scenario
1631
+ name: MTEB MassiveScenarioClassification (sq)
1632
+ config: sq
1633
+ split: test
1634
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1635
+ metrics:
1636
+ - type: accuracy
1637
+ value: 62.229320780094156
1638
+ - type: f1
1639
+ value: 61.02251426747547
1640
+ - task:
1641
+ type: Classification
1642
+ dataset:
1643
+ type: mteb/amazon_massive_scenario
1644
+ name: MTEB MassiveScenarioClassification (sv)
1645
+ config: sv
1646
+ split: test
1647
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1648
+ metrics:
1649
+ - type: accuracy
1650
+ value: 64.42501681237391
1651
+ - type: f1
1652
+ value: 63.461494430605235
1653
+ - task:
1654
+ type: Classification
1655
+ dataset:
1656
+ type: mteb/amazon_massive_scenario
1657
+ name: MTEB MassiveScenarioClassification (sw)
1658
+ config: sw
1659
+ split: test
1660
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1661
+ metrics:
1662
+ - type: accuracy
1663
+ value: 38.51714862138534
1664
+ - type: f1
1665
+ value: 37.12466722986362
1666
+ - task:
1667
+ type: Classification
1668
+ dataset:
1669
+ type: mteb/amazon_massive_scenario
1670
+ name: MTEB MassiveScenarioClassification (ta)
1671
+ config: ta
1672
+ split: test
1673
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1674
+ metrics:
1675
+ - type: accuracy
1676
+ value: 46.99731002017485
1677
+ - type: f1
1678
+ value: 45.859147049984834
1679
+ - task:
1680
+ type: Classification
1681
+ dataset:
1682
+ type: mteb/amazon_massive_scenario
1683
+ name: MTEB MassiveScenarioClassification (te)
1684
+ config: te
1685
+ split: test
1686
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1687
+ metrics:
1688
+ - type: accuracy
1689
+ value: 51.01882985877605
1690
+ - type: f1
1691
+ value: 49.01040173136056
1692
+ - task:
1693
+ type: Classification
1694
+ dataset:
1695
+ type: mteb/amazon_massive_scenario
1696
+ name: MTEB MassiveScenarioClassification (th)
1697
+ config: th
1698
+ split: test
1699
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1700
+ metrics:
1701
+ - type: accuracy
1702
+ value: 63.234700739744454
1703
+ - type: f1
1704
+ value: 62.732294595214746
1705
+ - task:
1706
+ type: Classification
1707
+ dataset:
1708
+ type: mteb/amazon_massive_scenario
1709
+ name: MTEB MassiveScenarioClassification (tl)
1710
+ config: tl
1711
+ split: test
1712
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1713
+ metrics:
1714
+ - type: accuracy
1715
+ value: 38.72225958305312
1716
+ - type: f1
1717
+ value: 36.603231928120906
1718
+ - task:
1719
+ type: Classification
1720
+ dataset:
1721
+ type: mteb/amazon_massive_scenario
1722
+ name: MTEB MassiveScenarioClassification (tr)
1723
+ config: tr
1724
+ split: test
1725
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1726
+ metrics:
1727
+ - type: accuracy
1728
+ value: 64.48554135843982
1729
+ - type: f1
1730
+ value: 63.97380562022752
1731
+ - task:
1732
+ type: Classification
1733
+ dataset:
1734
+ type: mteb/amazon_massive_scenario
1735
+ name: MTEB MassiveScenarioClassification (ur)
1736
+ config: ur
1737
+ split: test
1738
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1739
+ metrics:
1740
+ - type: accuracy
1741
+ value: 56.7955615332885
1742
+ - type: f1
1743
+ value: 55.95308241204802
1744
+ - task:
1745
+ type: Classification
1746
+ dataset:
1747
+ type: mteb/amazon_massive_scenario
1748
+ name: MTEB MassiveScenarioClassification (vi)
1749
+ config: vi
1750
+ split: test
1751
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1752
+ metrics:
1753
+ - type: accuracy
1754
+ value: 57.06455951580362
1755
+ - type: f1
1756
+ value: 56.95570494066693
1757
+ - task:
1758
+ type: Classification
1759
+ dataset:
1760
+ type: mteb/amazon_massive_scenario
1761
+ name: MTEB MassiveScenarioClassification (zh-CN)
1762
+ config: zh-CN
1763
+ split: test
1764
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1765
+ metrics:
1766
+ - type: accuracy
1767
+ value: 65.8338937457969
1768
+ - type: f1
1769
+ value: 65.6778746906008
1770
+ - task:
1771
+ type: Classification
1772
+ dataset:
1773
+ type: mteb/amazon_massive_scenario
1774
+ name: MTEB MassiveScenarioClassification (zh-TW)
1775
+ config: zh-TW
1776
+ split: test
1777
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1778
+ metrics:
1779
+ - type: accuracy
1780
+ value: 63.369199731002034
1781
+ - type: f1
1782
+ value: 63.527650116059945
1783
+ - task:
1784
+ type: Clustering
1785
+ dataset:
1786
+ type: mteb/medrxiv-clustering-p2p
1787
+ name: MTEB MedrxivClusteringP2P
1788
+ config: default
1789
+ split: test
1790
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1791
+ metrics:
1792
+ - type: v_measure
1793
+ value: 29.442504112215538
1794
+ - task:
1795
+ type: Clustering
1796
+ dataset:
1797
+ type: mteb/medrxiv-clustering-s2s
1798
+ name: MTEB MedrxivClusteringS2S
1799
+ config: default
1800
+ split: test
1801
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1802
+ metrics:
1803
+ - type: v_measure
1804
+ value: 26.16062814161053
1805
+ - task:
1806
+ type: Retrieval
1807
+ dataset:
1808
+ type: quora
1809
+ name: MTEB QuoraRetrieval
1810
+ config: default
1811
+ split: test
1812
+ revision: None
1813
+ metrics:
1814
+ - type: map_at_1
1815
+ value: 65.319
1816
+ - type: map_at_10
1817
+ value: 78.72
1818
+ - type: map_at_100
1819
+ value: 79.44600000000001
1820
+ - type: map_at_1000
1821
+ value: 79.469
1822
+ - type: map_at_3
1823
+ value: 75.693
1824
+ - type: map_at_5
1825
+ value: 77.537
1826
+ - type: mrr_at_1
1827
+ value: 75.24
1828
+ - type: mrr_at_10
1829
+ value: 82.304
1830
+ - type: mrr_at_100
1831
+ value: 82.485
1832
+ - type: mrr_at_1000
1833
+ value: 82.489
1834
+ - type: mrr_at_3
1835
+ value: 81.002
1836
+ - type: mrr_at_5
1837
+ value: 81.817
1838
+ - type: ndcg_at_1
1839
+ value: 75.26
1840
+ - type: ndcg_at_10
1841
+ value: 83.07
1842
+ - type: ndcg_at_100
1843
+ value: 84.829
1844
+ - type: ndcg_at_1000
1845
+ value: 85.087
1846
+ - type: ndcg_at_3
1847
+ value: 79.67699999999999
1848
+ - type: ndcg_at_5
1849
+ value: 81.42
1850
+ - type: precision_at_1
1851
+ value: 75.26
1852
+ - type: precision_at_10
1853
+ value: 12.697
1854
+ - type: precision_at_100
1855
+ value: 1.4829999999999999
1856
+ - type: precision_at_1000
1857
+ value: 0.154
1858
+ - type: precision_at_3
1859
+ value: 34.849999999999994
1860
+ - type: precision_at_5
1861
+ value: 23.054
1862
+ - type: recall_at_1
1863
+ value: 65.319
1864
+ - type: recall_at_10
1865
+ value: 91.551
1866
+ - type: recall_at_100
1867
+ value: 98.053
1868
+ - type: recall_at_1000
1869
+ value: 99.516
1870
+ - type: recall_at_3
1871
+ value: 81.819
1872
+ - type: recall_at_5
1873
+ value: 86.66199999999999
1874
+ - task:
1875
+ type: Clustering
1876
+ dataset:
1877
+ type: mteb/reddit-clustering
1878
+ name: MTEB RedditClustering
1879
+ config: default
1880
+ split: test
1881
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
1882
+ metrics:
1883
+ - type: v_measure
1884
+ value: 31.249791587189996
1885
+ - task:
1886
+ type: Clustering
1887
+ dataset:
1888
+ type: mteb/reddit-clustering-p2p
1889
+ name: MTEB RedditClusteringP2P
1890
+ config: default
1891
+ split: test
1892
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
1893
+ metrics:
1894
+ - type: v_measure
1895
+ value: 43.302922383029816
1896
+ - task:
1897
+ type: STS
1898
+ dataset:
1899
+ type: mteb/sickr-sts
1900
+ name: MTEB SICK-R
1901
+ config: default
1902
+ split: test
1903
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
1904
+ metrics:
1905
+ - type: cos_sim_pearson
1906
+ value: 84.80670811345861
1907
+ - type: cos_sim_spearman
1908
+ value: 79.97373018384307
1909
+ - type: euclidean_pearson
1910
+ value: 83.40205934125837
1911
+ - type: euclidean_spearman
1912
+ value: 79.73331008251854
1913
+ - type: manhattan_pearson
1914
+ value: 83.3320983393412
1915
+ - type: manhattan_spearman
1916
+ value: 79.677919746045
1917
+ - task:
1918
+ type: STS
1919
+ dataset:
1920
+ type: mteb/sts12-sts
1921
+ name: MTEB STS12
1922
+ config: default
1923
+ split: test
1924
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
1925
+ metrics:
1926
+ - type: cos_sim_pearson
1927
+ value: 86.3816087627948
1928
+ - type: cos_sim_spearman
1929
+ value: 80.91314664846955
1930
+ - type: euclidean_pearson
1931
+ value: 85.10603071031096
1932
+ - type: euclidean_spearman
1933
+ value: 79.42663939501841
1934
+ - type: manhattan_pearson
1935
+ value: 85.16096376014066
1936
+ - type: manhattan_spearman
1937
+ value: 79.51936545543191
1938
+ - task:
1939
+ type: STS
1940
+ dataset:
1941
+ type: mteb/sts13-sts
1942
+ name: MTEB STS13
1943
+ config: default
1944
+ split: test
1945
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
1946
+ metrics:
1947
+ - type: cos_sim_pearson
1948
+ value: 80.44665329940209
1949
+ - type: cos_sim_spearman
1950
+ value: 82.86479010707745
1951
+ - type: euclidean_pearson
1952
+ value: 84.06719627734672
1953
+ - type: euclidean_spearman
1954
+ value: 84.9356099976297
1955
+ - type: manhattan_pearson
1956
+ value: 84.10370009572624
1957
+ - type: manhattan_spearman
1958
+ value: 84.96828040546536
1959
+ - task:
1960
+ type: STS
1961
+ dataset:
1962
+ type: mteb/sts14-sts
1963
+ name: MTEB STS14
1964
+ config: default
1965
+ split: test
1966
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
1967
+ metrics:
1968
+ - type: cos_sim_pearson
1969
+ value: 86.05704260568437
1970
+ - type: cos_sim_spearman
1971
+ value: 87.36399473803172
1972
+ - type: euclidean_pearson
1973
+ value: 86.8895170159388
1974
+ - type: euclidean_spearman
1975
+ value: 87.16246440866921
1976
+ - type: manhattan_pearson
1977
+ value: 86.80814774538997
1978
+ - type: manhattan_spearman
1979
+ value: 87.09320142699522
1980
+ - task:
1981
+ type: STS
1982
+ dataset:
1983
+ type: mteb/sts15-sts
1984
+ name: MTEB STS15
1985
+ config: default
1986
+ split: test
1987
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
1988
+ metrics:
1989
+ - type: cos_sim_pearson
1990
+ value: 85.97825118945852
1991
+ - type: cos_sim_spearman
1992
+ value: 88.31438033558268
1993
+ - type: euclidean_pearson
1994
+ value: 87.05174694758092
1995
+ - type: euclidean_spearman
1996
+ value: 87.80659468392355
1997
+ - type: manhattan_pearson
1998
+ value: 86.98831322198717
1999
+ - type: manhattan_spearman
2000
+ value: 87.72820615049285
2001
+ - task:
2002
+ type: STS
2003
+ dataset:
2004
+ type: mteb/sts16-sts
2005
+ name: MTEB STS16
2006
+ config: default
2007
+ split: test
2008
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2009
+ metrics:
2010
+ - type: cos_sim_pearson
2011
+ value: 78.68745420126719
2012
+ - type: cos_sim_spearman
2013
+ value: 81.6058424699445
2014
+ - type: euclidean_pearson
2015
+ value: 81.16540133861879
2016
+ - type: euclidean_spearman
2017
+ value: 81.86377535458067
2018
+ - type: manhattan_pearson
2019
+ value: 81.13813317937021
2020
+ - type: manhattan_spearman
2021
+ value: 81.87079962857256
2022
+ - task:
2023
+ type: STS
2024
+ dataset:
2025
+ type: mteb/sts17-crosslingual-sts
2026
+ name: MTEB STS17 (ko-ko)
2027
+ config: ko-ko
2028
+ split: test
2029
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2030
+ metrics:
2031
+ - type: cos_sim_pearson
2032
+ value: 68.06192660936868
2033
+ - type: cos_sim_spearman
2034
+ value: 68.2376353514075
2035
+ - type: euclidean_pearson
2036
+ value: 60.68326946956215
2037
+ - type: euclidean_spearman
2038
+ value: 59.19352349785952
2039
+ - type: manhattan_pearson
2040
+ value: 60.6592944683418
2041
+ - type: manhattan_spearman
2042
+ value: 59.167534419270865
2043
+ - task:
2044
+ type: STS
2045
+ dataset:
2046
+ type: mteb/sts17-crosslingual-sts
2047
+ name: MTEB STS17 (ar-ar)
2048
+ config: ar-ar
2049
+ split: test
2050
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2051
+ metrics:
2052
+ - type: cos_sim_pearson
2053
+ value: 76.78098264855684
2054
+ - type: cos_sim_spearman
2055
+ value: 78.02670452969812
2056
+ - type: euclidean_pearson
2057
+ value: 77.26694463661255
2058
+ - type: euclidean_spearman
2059
+ value: 77.47007626009587
2060
+ - type: manhattan_pearson
2061
+ value: 77.25070088632027
2062
+ - type: manhattan_spearman
2063
+ value: 77.36368265830724
2064
+ - task:
2065
+ type: STS
2066
+ dataset:
2067
+ type: mteb/sts17-crosslingual-sts
2068
+ name: MTEB STS17 (en-ar)
2069
+ config: en-ar
2070
+ split: test
2071
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2072
+ metrics:
2073
+ - type: cos_sim_pearson
2074
+ value: 78.45418506379532
2075
+ - type: cos_sim_spearman
2076
+ value: 78.60412019902428
2077
+ - type: euclidean_pearson
2078
+ value: 79.90303710850512
2079
+ - type: euclidean_spearman
2080
+ value: 78.67123625004957
2081
+ - type: manhattan_pearson
2082
+ value: 80.09189580897753
2083
+ - type: manhattan_spearman
2084
+ value: 79.02484481441483
2085
+ - task:
2086
+ type: STS
2087
+ dataset:
2088
+ type: mteb/sts17-crosslingual-sts
2089
+ name: MTEB STS17 (en-de)
2090
+ config: en-de
2091
+ split: test
2092
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2093
+ metrics:
2094
+ - type: cos_sim_pearson
2095
+ value: 82.35556731232779
2096
+ - type: cos_sim_spearman
2097
+ value: 81.48249735354844
2098
+ - type: euclidean_pearson
2099
+ value: 81.66748026636621
2100
+ - type: euclidean_spearman
2101
+ value: 80.35571574338547
2102
+ - type: manhattan_pearson
2103
+ value: 81.38214732806365
2104
+ - type: manhattan_spearman
2105
+ value: 79.9018202958774
2106
+ - task:
2107
+ type: STS
2108
+ dataset:
2109
+ type: mteb/sts17-crosslingual-sts
2110
+ name: MTEB STS17 (en-en)
2111
+ config: en-en
2112
+ split: test
2113
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2114
+ metrics:
2115
+ - type: cos_sim_pearson
2116
+ value: 86.4527703176897
2117
+ - type: cos_sim_spearman
2118
+ value: 85.81084095829584
2119
+ - type: euclidean_pearson
2120
+ value: 86.43489162324457
2121
+ - type: euclidean_spearman
2122
+ value: 85.27110976093296
2123
+ - type: manhattan_pearson
2124
+ value: 86.43674259444512
2125
+ - type: manhattan_spearman
2126
+ value: 85.05719308026032
2127
+ - task:
2128
+ type: STS
2129
+ dataset:
2130
+ type: mteb/sts17-crosslingual-sts
2131
+ name: MTEB STS17 (en-tr)
2132
+ config: en-tr
2133
+ split: test
2134
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2135
+ metrics:
2136
+ - type: cos_sim_pearson
2137
+ value: 76.00411240034492
2138
+ - type: cos_sim_spearman
2139
+ value: 76.33887356560854
2140
+ - type: euclidean_pearson
2141
+ value: 76.81730660019446
2142
+ - type: euclidean_spearman
2143
+ value: 75.04432185451306
2144
+ - type: manhattan_pearson
2145
+ value: 77.22298813168995
2146
+ - type: manhattan_spearman
2147
+ value: 75.56420330256725
2148
+ - task:
2149
+ type: STS
2150
+ dataset:
2151
+ type: mteb/sts17-crosslingual-sts
2152
+ name: MTEB STS17 (es-en)
2153
+ config: es-en
2154
+ split: test
2155
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2156
+ metrics:
2157
+ - type: cos_sim_pearson
2158
+ value: 79.1447136836213
2159
+ - type: cos_sim_spearman
2160
+ value: 81.80823850788917
2161
+ - type: euclidean_pearson
2162
+ value: 80.84505734814422
2163
+ - type: euclidean_spearman
2164
+ value: 81.714168092736
2165
+ - type: manhattan_pearson
2166
+ value: 80.84713816174187
2167
+ - type: manhattan_spearman
2168
+ value: 81.61267814749516
2169
+ - task:
2170
+ type: STS
2171
+ dataset:
2172
+ type: mteb/sts17-crosslingual-sts
2173
+ name: MTEB STS17 (es-es)
2174
+ config: es-es
2175
+ split: test
2176
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2177
+ metrics:
2178
+ - type: cos_sim_pearson
2179
+ value: 87.01257457052873
2180
+ - type: cos_sim_spearman
2181
+ value: 87.91146458004216
2182
+ - type: euclidean_pearson
2183
+ value: 88.36771859717994
2184
+ - type: euclidean_spearman
2185
+ value: 87.73182474597515
2186
+ - type: manhattan_pearson
2187
+ value: 88.26551451003671
2188
+ - type: manhattan_spearman
2189
+ value: 87.71675151388992
2190
+ - task:
2191
+ type: STS
2192
+ dataset:
2193
+ type: mteb/sts17-crosslingual-sts
2194
+ name: MTEB STS17 (fr-en)
2195
+ config: fr-en
2196
+ split: test
2197
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2198
+ metrics:
2199
+ - type: cos_sim_pearson
2200
+ value: 79.20121618382373
2201
+ - type: cos_sim_spearman
2202
+ value: 78.05794691968603
2203
+ - type: euclidean_pearson
2204
+ value: 79.93819925682054
2205
+ - type: euclidean_spearman
2206
+ value: 78.00586118701553
2207
+ - type: manhattan_pearson
2208
+ value: 80.05598625820885
2209
+ - type: manhattan_spearman
2210
+ value: 78.04802948866832
2211
+ - task:
2212
+ type: STS
2213
+ dataset:
2214
+ type: mteb/sts17-crosslingual-sts
2215
+ name: MTEB STS17 (it-en)
2216
+ config: it-en
2217
+ split: test
2218
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2219
+ metrics:
2220
+ - type: cos_sim_pearson
2221
+ value: 81.51743373871778
2222
+ - type: cos_sim_spearman
2223
+ value: 80.98266651818703
2224
+ - type: euclidean_pearson
2225
+ value: 81.11875722505269
2226
+ - type: euclidean_spearman
2227
+ value: 79.45188413284538
2228
+ - type: manhattan_pearson
2229
+ value: 80.7988457619225
2230
+ - type: manhattan_spearman
2231
+ value: 79.49643569311485
2232
+ - task:
2233
+ type: STS
2234
+ dataset:
2235
+ type: mteb/sts17-crosslingual-sts
2236
+ name: MTEB STS17 (nl-en)
2237
+ config: nl-en
2238
+ split: test
2239
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2240
+ metrics:
2241
+ - type: cos_sim_pearson
2242
+ value: 81.78679924046351
2243
+ - type: cos_sim_spearman
2244
+ value: 80.9986574147117
2245
+ - type: euclidean_pearson
2246
+ value: 82.09130079135713
2247
+ - type: euclidean_spearman
2248
+ value: 80.66215667390159
2249
+ - type: manhattan_pearson
2250
+ value: 82.0328610549654
2251
+ - type: manhattan_spearman
2252
+ value: 80.31047226932408
2253
+ - task:
2254
+ type: STS
2255
+ dataset:
2256
+ type: mteb/sts22-crosslingual-sts
2257
+ name: MTEB STS22 (en)
2258
+ config: en
2259
+ split: test
2260
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2261
+ metrics:
2262
+ - type: cos_sim_pearson
2263
+ value: 58.08082172994642
2264
+ - type: cos_sim_spearman
2265
+ value: 62.9940530222459
2266
+ - type: euclidean_pearson
2267
+ value: 58.47927303460365
2268
+ - type: euclidean_spearman
2269
+ value: 60.8440317609258
2270
+ - type: manhattan_pearson
2271
+ value: 58.32438211697841
2272
+ - type: manhattan_spearman
2273
+ value: 60.69642636776064
2274
+ - task:
2275
+ type: STS
2276
+ dataset:
2277
+ type: mteb/sts22-crosslingual-sts
2278
+ name: MTEB STS22 (de)
2279
+ config: de
2280
+ split: test
2281
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2282
+ metrics:
2283
+ - type: cos_sim_pearson
2284
+ value: 33.83985707464123
2285
+ - type: cos_sim_spearman
2286
+ value: 46.89093209603036
2287
+ - type: euclidean_pearson
2288
+ value: 34.63602187576556
2289
+ - type: euclidean_spearman
2290
+ value: 46.31087228200712
2291
+ - type: manhattan_pearson
2292
+ value: 34.66899391543166
2293
+ - type: manhattan_spearman
2294
+ value: 46.33049538425276
2295
+ - task:
2296
+ type: STS
2297
+ dataset:
2298
+ type: mteb/sts22-crosslingual-sts
2299
+ name: MTEB STS22 (es)
2300
+ config: es
2301
+ split: test
2302
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2303
+ metrics:
2304
+ - type: cos_sim_pearson
2305
+ value: 51.61315965767736
2306
+ - type: cos_sim_spearman
2307
+ value: 58.9434266730386
2308
+ - type: euclidean_pearson
2309
+ value: 50.35885602217862
2310
+ - type: euclidean_spearman
2311
+ value: 58.238679883286025
2312
+ - type: manhattan_pearson
2313
+ value: 53.01732044381151
2314
+ - type: manhattan_spearman
2315
+ value: 58.10482351761412
2316
+ - task:
2317
+ type: STS
2318
+ dataset:
2319
+ type: mteb/sts22-crosslingual-sts
2320
+ name: MTEB STS22 (pl)
2321
+ config: pl
2322
+ split: test
2323
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2324
+ metrics:
2325
+ - type: cos_sim_pearson
2326
+ value: 26.771738440430177
2327
+ - type: cos_sim_spearman
2328
+ value: 34.807259227816054
2329
+ - type: euclidean_pearson
2330
+ value: 17.82657835823811
2331
+ - type: euclidean_spearman
2332
+ value: 34.27912898498941
2333
+ - type: manhattan_pearson
2334
+ value: 19.121527758886312
2335
+ - type: manhattan_spearman
2336
+ value: 34.4940050226265
2337
+ - task:
2338
+ type: STS
2339
+ dataset:
2340
+ type: mteb/sts22-crosslingual-sts
2341
+ name: MTEB STS22 (tr)
2342
+ config: tr
2343
+ split: test
2344
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2345
+ metrics:
2346
+ - type: cos_sim_pearson
2347
+ value: 52.8354704676683
2348
+ - type: cos_sim_spearman
2349
+ value: 57.28629534815841
2350
+ - type: euclidean_pearson
2351
+ value: 54.10329332004385
2352
+ - type: euclidean_spearman
2353
+ value: 58.15030615859976
2354
+ - type: manhattan_pearson
2355
+ value: 55.42372087433115
2356
+ - type: manhattan_spearman
2357
+ value: 57.52270736584036
2358
+ - task:
2359
+ type: STS
2360
+ dataset:
2361
+ type: mteb/sts22-crosslingual-sts
2362
+ name: MTEB STS22 (ar)
2363
+ config: ar
2364
+ split: test
2365
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2366
+ metrics:
2367
+ - type: cos_sim_pearson
2368
+ value: 31.01976557986924
2369
+ - type: cos_sim_spearman
2370
+ value: 54.506959483927616
2371
+ - type: euclidean_pearson
2372
+ value: 36.917863022119086
2373
+ - type: euclidean_spearman
2374
+ value: 53.750194241538566
2375
+ - type: manhattan_pearson
2376
+ value: 37.200177833241085
2377
+ - type: manhattan_spearman
2378
+ value: 53.507659188082535
2379
+ - task:
2380
+ type: STS
2381
+ dataset:
2382
+ type: mteb/sts22-crosslingual-sts
2383
+ name: MTEB STS22 (ru)
2384
+ config: ru
2385
+ split: test
2386
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2387
+ metrics:
2388
+ - type: cos_sim_pearson
2389
+ value: 46.38635647225934
2390
+ - type: cos_sim_spearman
2391
+ value: 54.50892732637536
2392
+ - type: euclidean_pearson
2393
+ value: 40.8331015184763
2394
+ - type: euclidean_spearman
2395
+ value: 53.142903182230924
2396
+ - type: manhattan_pearson
2397
+ value: 43.07655692906317
2398
+ - type: manhattan_spearman
2399
+ value: 53.5833474125901
2400
+ - task:
2401
+ type: STS
2402
+ dataset:
2403
+ type: mteb/sts22-crosslingual-sts
2404
+ name: MTEB STS22 (zh)
2405
+ config: zh
2406
+ split: test
2407
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2408
+ metrics:
2409
+ - type: cos_sim_pearson
2410
+ value: 60.52525456662916
2411
+ - type: cos_sim_spearman
2412
+ value: 63.23975489531082
2413
+ - type: euclidean_pearson
2414
+ value: 58.989191722317514
2415
+ - type: euclidean_spearman
2416
+ value: 62.536326639863894
2417
+ - type: manhattan_pearson
2418
+ value: 61.32982866201855
2419
+ - type: manhattan_spearman
2420
+ value: 63.068262822520516
2421
+ - task:
2422
+ type: STS
2423
+ dataset:
2424
+ type: mteb/sts22-crosslingual-sts
2425
+ name: MTEB STS22 (fr)
2426
+ config: fr
2427
+ split: test
2428
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2429
+ metrics:
2430
+ - type: cos_sim_pearson
2431
+ value: 59.63798684577696
2432
+ - type: cos_sim_spearman
2433
+ value: 74.09937723367189
2434
+ - type: euclidean_pearson
2435
+ value: 63.77494904383906
2436
+ - type: euclidean_spearman
2437
+ value: 71.15932571292481
2438
+ - type: manhattan_pearson
2439
+ value: 63.69646122775205
2440
+ - type: manhattan_spearman
2441
+ value: 70.54960698541632
2442
+ - task:
2443
+ type: STS
2444
+ dataset:
2445
+ type: mteb/sts22-crosslingual-sts
2446
+ name: MTEB STS22 (de-en)
2447
+ config: de-en
2448
+ split: test
2449
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2450
+ metrics:
2451
+ - type: cos_sim_pearson
2452
+ value: 36.50262468726711
2453
+ - type: cos_sim_spearman
2454
+ value: 45.00322499674274
2455
+ - type: euclidean_pearson
2456
+ value: 32.58759216581778
2457
+ - type: euclidean_spearman
2458
+ value: 40.13720951315429
2459
+ - type: manhattan_pearson
2460
+ value: 34.88422299605277
2461
+ - type: manhattan_spearman
2462
+ value: 40.63516862200963
2463
+ - task:
2464
+ type: STS
2465
+ dataset:
2466
+ type: mteb/sts22-crosslingual-sts
2467
+ name: MTEB STS22 (es-en)
2468
+ config: es-en
2469
+ split: test
2470
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2471
+ metrics:
2472
+ - type: cos_sim_pearson
2473
+ value: 56.498552617040275
2474
+ - type: cos_sim_spearman
2475
+ value: 67.71358426124443
2476
+ - type: euclidean_pearson
2477
+ value: 57.16474781778287
2478
+ - type: euclidean_spearman
2479
+ value: 65.721515493531
2480
+ - type: manhattan_pearson
2481
+ value: 59.25227610738926
2482
+ - type: manhattan_spearman
2483
+ value: 65.89743680340739
2484
+ - task:
2485
+ type: STS
2486
+ dataset:
2487
+ type: mteb/sts22-crosslingual-sts
2488
+ name: MTEB STS22 (it)
2489
+ config: it
2490
+ split: test
2491
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2492
+ metrics:
2493
+ - type: cos_sim_pearson
2494
+ value: 55.97978814727984
2495
+ - type: cos_sim_spearman
2496
+ value: 65.85821395092104
2497
+ - type: euclidean_pearson
2498
+ value: 59.11117270978519
2499
+ - type: euclidean_spearman
2500
+ value: 64.50062069934965
2501
+ - type: manhattan_pearson
2502
+ value: 59.4436213778161
2503
+ - type: manhattan_spearman
2504
+ value: 64.4003273074382
2505
+ - task:
2506
+ type: STS
2507
+ dataset:
2508
+ type: mteb/sts22-crosslingual-sts
2509
+ name: MTEB STS22 (pl-en)
2510
+ config: pl-en
2511
+ split: test
2512
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2513
+ metrics:
2514
+ - type: cos_sim_pearson
2515
+ value: 58.00873192515712
2516
+ - type: cos_sim_spearman
2517
+ value: 60.167708809138745
2518
+ - type: euclidean_pearson
2519
+ value: 56.91950637760252
2520
+ - type: euclidean_spearman
2521
+ value: 58.50593399441014
2522
+ - type: manhattan_pearson
2523
+ value: 58.683747352584994
2524
+ - type: manhattan_spearman
2525
+ value: 59.38110066799761
2526
+ - task:
2527
+ type: STS
2528
+ dataset:
2529
+ type: mteb/sts22-crosslingual-sts
2530
+ name: MTEB STS22 (zh-en)
2531
+ config: zh-en
2532
+ split: test
2533
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2534
+ metrics:
2535
+ - type: cos_sim_pearson
2536
+ value: 54.26020658151187
2537
+ - type: cos_sim_spearman
2538
+ value: 61.29236187204147
2539
+ - type: euclidean_pearson
2540
+ value: 55.993896804147056
2541
+ - type: euclidean_spearman
2542
+ value: 58.654928232615354
2543
+ - type: manhattan_pearson
2544
+ value: 56.612492816099426
2545
+ - type: manhattan_spearman
2546
+ value: 58.65144067094258
2547
+ - task:
2548
+ type: STS
2549
+ dataset:
2550
+ type: mteb/sts22-crosslingual-sts
2551
+ name: MTEB STS22 (es-it)
2552
+ config: es-it
2553
+ split: test
2554
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2555
+ metrics:
2556
+ - type: cos_sim_pearson
2557
+ value: 49.13817835368122
2558
+ - type: cos_sim_spearman
2559
+ value: 50.78524216975442
2560
+ - type: euclidean_pearson
2561
+ value: 46.56046454501862
2562
+ - type: euclidean_spearman
2563
+ value: 50.3935060082369
2564
+ - type: manhattan_pearson
2565
+ value: 48.0232348418531
2566
+ - type: manhattan_spearman
2567
+ value: 50.79528358464199
2568
+ - task:
2569
+ type: STS
2570
+ dataset:
2571
+ type: mteb/sts22-crosslingual-sts
2572
+ name: MTEB STS22 (de-fr)
2573
+ config: de-fr
2574
+ split: test
2575
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2576
+ metrics:
2577
+ - type: cos_sim_pearson
2578
+ value: 44.274388638585286
2579
+ - type: cos_sim_spearman
2580
+ value: 49.43124017389838
2581
+ - type: euclidean_pearson
2582
+ value: 42.45909582681174
2583
+ - type: euclidean_spearman
2584
+ value: 49.661383797129055
2585
+ - type: manhattan_pearson
2586
+ value: 42.5771970142383
2587
+ - type: manhattan_spearman
2588
+ value: 50.14423414390715
2589
+ - task:
2590
+ type: STS
2591
+ dataset:
2592
+ type: mteb/sts22-crosslingual-sts
2593
+ name: MTEB STS22 (de-pl)
2594
+ config: de-pl
2595
+ split: test
2596
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2597
+ metrics:
2598
+ - type: cos_sim_pearson
2599
+ value: 26.119500839749776
2600
+ - type: cos_sim_spearman
2601
+ value: 39.324070169024424
2602
+ - type: euclidean_pearson
2603
+ value: 35.83247077201831
2604
+ - type: euclidean_spearman
2605
+ value: 42.61903924348457
2606
+ - type: manhattan_pearson
2607
+ value: 35.50415034487894
2608
+ - type: manhattan_spearman
2609
+ value: 41.87998075949351
2610
+ - task:
2611
+ type: STS
2612
+ dataset:
2613
+ type: mteb/sts22-crosslingual-sts
2614
+ name: MTEB STS22 (fr-pl)
2615
+ config: fr-pl
2616
+ split: test
2617
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2618
+ metrics:
2619
+ - type: cos_sim_pearson
2620
+ value: 72.62575835691209
2621
+ - type: cos_sim_spearman
2622
+ value: 73.24670207647144
2623
+ - type: euclidean_pearson
2624
+ value: 78.07793323914657
2625
+ - type: euclidean_spearman
2626
+ value: 73.24670207647144
2627
+ - type: manhattan_pearson
2628
+ value: 77.51429306378206
2629
+ - type: manhattan_spearman
2630
+ value: 73.24670207647144
2631
+ - task:
2632
+ type: STS
2633
+ dataset:
2634
+ type: mteb/stsbenchmark-sts
2635
+ name: MTEB STSBenchmark
2636
+ config: default
2637
+ split: test
2638
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2639
+ metrics:
2640
+ - type: cos_sim_pearson
2641
+ value: 84.09375596849891
2642
+ - type: cos_sim_spearman
2643
+ value: 86.44881302053585
2644
+ - type: euclidean_pearson
2645
+ value: 84.71259163967213
2646
+ - type: euclidean_spearman
2647
+ value: 85.63661992344069
2648
+ - type: manhattan_pearson
2649
+ value: 84.64466537502614
2650
+ - type: manhattan_spearman
2651
+ value: 85.53769949940238
2652
+ - task:
2653
+ type: Reranking
2654
+ dataset:
2655
+ type: mteb/scidocs-reranking
2656
+ name: MTEB SciDocsRR
2657
+ config: default
2658
+ split: test
2659
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2660
+ metrics:
2661
+ - type: map
2662
+ value: 70.2056154684549
2663
+ - type: mrr
2664
+ value: 89.52703161036494
2665
+ - task:
2666
+ type: PairClassification
2667
+ dataset:
2668
+ type: mteb/sprintduplicatequestions-pairclassification
2669
+ name: MTEB SprintDuplicateQuestions
2670
+ config: default
2671
+ split: test
2672
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2673
+ metrics:
2674
+ - type: cos_sim_accuracy
2675
+ value: 99.57623762376238
2676
+ - type: cos_sim_ap
2677
+ value: 83.53051588811371
2678
+ - type: cos_sim_f1
2679
+ value: 77.72704211060375
2680
+ - type: cos_sim_precision
2681
+ value: 78.88774459320288
2682
+ - type: cos_sim_recall
2683
+ value: 76.6
2684
+ - type: dot_accuracy
2685
+ value: 99.06435643564356
2686
+ - type: dot_ap
2687
+ value: 27.003124923857463
2688
+ - type: dot_f1
2689
+ value: 34.125269978401725
2690
+ - type: dot_precision
2691
+ value: 37.08920187793427
2692
+ - type: dot_recall
2693
+ value: 31.6
2694
+ - type: euclidean_accuracy
2695
+ value: 99.61485148514852
2696
+ - type: euclidean_ap
2697
+ value: 85.47332647001774
2698
+ - type: euclidean_f1
2699
+ value: 80.0808897876643
2700
+ - type: euclidean_precision
2701
+ value: 80.98159509202453
2702
+ - type: euclidean_recall
2703
+ value: 79.2
2704
+ - type: manhattan_accuracy
2705
+ value: 99.61683168316831
2706
+ - type: manhattan_ap
2707
+ value: 85.41969859598552
2708
+ - type: manhattan_f1
2709
+ value: 79.77755308392315
2710
+ - type: manhattan_precision
2711
+ value: 80.67484662576688
2712
+ - type: manhattan_recall
2713
+ value: 78.9
2714
+ - type: max_accuracy
2715
+ value: 99.61683168316831
2716
+ - type: max_ap
2717
+ value: 85.47332647001774
2718
+ - type: max_f1
2719
+ value: 80.0808897876643
2720
+ - task:
2721
+ type: Clustering
2722
+ dataset:
2723
+ type: mteb/stackexchange-clustering
2724
+ name: MTEB StackExchangeClustering
2725
+ config: default
2726
+ split: test
2727
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2728
+ metrics:
2729
+ - type: v_measure
2730
+ value: 34.35688940053467
2731
+ - task:
2732
+ type: Clustering
2733
+ dataset:
2734
+ type: mteb/stackexchange-clustering-p2p
2735
+ name: MTEB StackExchangeClusteringP2P
2736
+ config: default
2737
+ split: test
2738
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2739
+ metrics:
2740
+ - type: v_measure
2741
+ value: 30.64427069276576
2742
+ - task:
2743
+ type: Reranking
2744
+ dataset:
2745
+ type: mteb/stackoverflowdupquestions-reranking
2746
+ name: MTEB StackOverflowDupQuestions
2747
+ config: default
2748
+ split: test
2749
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2750
+ metrics:
2751
+ - type: map
2752
+ value: 44.89500754900078
2753
+ - type: mrr
2754
+ value: 45.33215558950853
2755
+ - task:
2756
+ type: Summarization
2757
+ dataset:
2758
+ type: mteb/summeval
2759
+ name: MTEB SummEval
2760
+ config: default
2761
+ split: test
2762
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2763
+ metrics:
2764
+ - type: cos_sim_pearson
2765
+ value: 30.653069624224084
2766
+ - type: cos_sim_spearman
2767
+ value: 30.10187112430319
2768
+ - type: dot_pearson
2769
+ value: 28.966278202103666
2770
+ - type: dot_spearman
2771
+ value: 28.342234095507767
2772
+ - task:
2773
+ type: Classification
2774
+ dataset:
2775
+ type: mteb/toxic_conversations_50k
2776
+ name: MTEB ToxicConversationsClassification
2777
+ config: default
2778
+ split: test
2779
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2780
+ metrics:
2781
+ - type: accuracy
2782
+ value: 65.96839999999999
2783
+ - type: ap
2784
+ value: 11.846327590186444
2785
+ - type: f1
2786
+ value: 50.518102944693574
2787
+ - task:
2788
+ type: Classification
2789
+ dataset:
2790
+ type: mteb/tweet_sentiment_extraction
2791
+ name: MTEB TweetSentimentExtractionClassification
2792
+ config: default
2793
+ split: test
2794
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2795
+ metrics:
2796
+ - type: accuracy
2797
+ value: 55.220713073005086
2798
+ - type: f1
2799
+ value: 55.47856175692088
2800
+ - task:
2801
+ type: Clustering
2802
+ dataset:
2803
+ type: mteb/twentynewsgroups-clustering
2804
+ name: MTEB TwentyNewsgroupsClustering
2805
+ config: default
2806
+ split: test
2807
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2808
+ metrics:
2809
+ - type: v_measure
2810
+ value: 31.581473892235877
2811
+ - task:
2812
+ type: PairClassification
2813
+ dataset:
2814
+ type: mteb/twittersemeval2015-pairclassification
2815
+ name: MTEB TwitterSemEval2015
2816
+ config: default
2817
+ split: test
2818
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2819
+ metrics:
2820
+ - type: cos_sim_accuracy
2821
+ value: 82.94093103653812
2822
+ - type: cos_sim_ap
2823
+ value: 62.48963249213361
2824
+ - type: cos_sim_f1
2825
+ value: 58.9541137429912
2826
+ - type: cos_sim_precision
2827
+ value: 52.05091937765205
2828
+ - type: cos_sim_recall
2829
+ value: 67.96833773087072
2830
+ - type: dot_accuracy
2831
+ value: 78.24998509864696
2832
+ - type: dot_ap
2833
+ value: 40.82371294480071
2834
+ - type: dot_f1
2835
+ value: 44.711163153786096
2836
+ - type: dot_precision
2837
+ value: 35.475379374419326
2838
+ - type: dot_recall
2839
+ value: 60.4485488126649
2840
+ - type: euclidean_accuracy
2841
+ value: 83.13166835548668
2842
+ - type: euclidean_ap
2843
+ value: 63.459878609769774
2844
+ - type: euclidean_f1
2845
+ value: 60.337199569532466
2846
+ - type: euclidean_precision
2847
+ value: 55.171659741963694
2848
+ - type: euclidean_recall
2849
+ value: 66.56992084432719
2850
+ - type: manhattan_accuracy
2851
+ value: 83.00649698992669
2852
+ - type: manhattan_ap
2853
+ value: 63.263161177904905
2854
+ - type: manhattan_f1
2855
+ value: 60.17122874713614
2856
+ - type: manhattan_precision
2857
+ value: 55.40750610703975
2858
+ - type: manhattan_recall
2859
+ value: 65.8311345646438
2860
+ - type: max_accuracy
2861
+ value: 83.13166835548668
2862
+ - type: max_ap
2863
+ value: 63.459878609769774
2864
+ - type: max_f1
2865
+ value: 60.337199569532466
2866
+ - task:
2867
+ type: PairClassification
2868
+ dataset:
2869
+ type: mteb/twitterurlcorpus-pairclassification
2870
+ name: MTEB TwitterURLCorpus
2871
+ config: default
2872
+ split: test
2873
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2874
+ metrics:
2875
+ - type: cos_sim_accuracy
2876
+ value: 87.80416812201653
2877
+ - type: cos_sim_ap
2878
+ value: 83.45540469219863
2879
+ - type: cos_sim_f1
2880
+ value: 75.58836427422892
2881
+ - type: cos_sim_precision
2882
+ value: 71.93934335002783
2883
+ - type: cos_sim_recall
2884
+ value: 79.62734832152756
2885
+ - type: dot_accuracy
2886
+ value: 83.04226336011176
2887
+ - type: dot_ap
2888
+ value: 70.63007268018524
2889
+ - type: dot_f1
2890
+ value: 65.35980325765405
2891
+ - type: dot_precision
2892
+ value: 60.84677151768532
2893
+ - type: dot_recall
2894
+ value: 70.59593470896212
2895
+ - type: euclidean_accuracy
2896
+ value: 87.60430007373773
2897
+ - type: euclidean_ap
2898
+ value: 83.10068502536592
2899
+ - type: euclidean_f1
2900
+ value: 75.02510506936439
2901
+ - type: euclidean_precision
2902
+ value: 72.56637168141593
2903
+ - type: euclidean_recall
2904
+ value: 77.65629812134279
2905
+ - type: manhattan_accuracy
2906
+ value: 87.60041914076145
2907
+ - type: manhattan_ap
2908
+ value: 83.05480769911229
2909
+ - type: manhattan_f1
2910
+ value: 74.98522895125554
2911
+ - type: manhattan_precision
2912
+ value: 72.04797047970479
2913
+ - type: manhattan_recall
2914
+ value: 78.17215891592238
2915
+ - type: max_accuracy
2916
+ value: 87.80416812201653
2917
+ - type: max_ap
2918
+ value: 83.45540469219863
2919
+ - type: max_f1
2920
+ value: 75.58836427422892
2921
+ ---
2922
+ # shibing624/text2vec-base-multilingual
2923
+ This is a CoSENT (Cosine Sentence) model: shibing624/text2vec-base-multilingual.
2924
+
2925
+ It maps sentences to a 384-dimensional dense vector space and can be used for tasks
2926
+ like sentence embeddings, text matching or semantic search.
2927
+
2928
+
2929
+
2930
+ - training dataset: https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/text2vec-base-multilingual-dataset
2931
+ - base model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
2932
+ - max_seq_length: 256
2933
+ - best epoch: 4
2934
+ - sentence embedding dim: 384
2935
+
2936
+ ## Evaluation
2937
+ For an automated evaluation of this model, see the *Evaluation Benchmark*: [text2vec](https://github.com/shibing624/text2vec)
2938
+ ## Languages
2939
+ Available languages are: de, en, es, fr, it, nl, pl, pt, ru, zh
2940
+
2941
+ ### Release Models
2942
+
2943
+ | Arch | BaseModel | Model | ATEC | BQ | LCQMC | PAWSX | STS-B | SOHU-dd | SOHU-dc | Avg | QPS |
2944
+ |:-----------|:-------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------|:-----:|:-----:|:-----:|:-----:|:-----:|:-------:|:-------:|:---------:|:-----:|
2945
+ | Word2Vec | word2vec | [w2v-light-tencent-chinese](https://ai.tencent.com/ailab/nlp/en/download.html) | 20.00 | 31.49 | 59.46 | 2.57 | 55.78 | 55.04 | 20.70 | 35.03 | 23769 |
2946
+ | SBERT | xlm-roberta-base | [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) | 18.42 | 38.52 | 63.96 | 10.14 | 78.90 | 63.01 | 52.28 | 46.46 | 3138 |
2947
+ | Instructor | hfl/chinese-roberta-wwm-ext | [moka-ai/m3e-base](https://huggingface.co/moka-ai/m3e-base) | 41.27 | 63.81 | 74.87 | 12.20 | 76.96 | 75.83 | 60.55 | 57.93 | 2980 |
2948
+ | CoSENT | hfl/chinese-macbert-base | [shibing624/text2vec-base-chinese](https://huggingface.co/shibing624/text2vec-base-chinese) | 31.93 | 42.67 | 70.16 | 17.21 | 79.30 | 70.27 | 50.42 | 51.61 | 3008 |
2949
+ | CoSENT | hfl/chinese-lert-large | [GanymedeNil/text2vec-large-chinese](https://huggingface.co/GanymedeNil/text2vec-large-chinese) | 32.61 | 44.59 | 69.30 | 14.51 | 79.44 | 73.01 | 59.04 | 53.12 | 2092 |
2950
+ | CoSENT | nghuyong/ernie-3.0-base-zh | [shibing624/text2vec-base-chinese-sentence](https://huggingface.co/shibing624/text2vec-base-chinese-sentence) | 43.37 | 61.43 | 73.48 | 38.90 | 78.25 | 70.60 | 53.08 | 59.87 | 3089 |
2951
+ | CoSENT | nghuyong/ernie-3.0-base-zh | [shibing624/text2vec-base-chinese-paraphrase](https://huggingface.co/shibing624/text2vec-base-chinese-paraphrase) | 44.89 | 63.58 | 74.24 | 40.90 | 78.93 | 76.70 | 63.30 | **63.08** | 3066 |
2952
+ | CoSENT | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | [shibing624/text2vec-base-multilingual](https://huggingface.co/shibing624/text2vec-base-multilingual) | 32.39 | 50.33 | 65.64 | 32.56 | 74.45 | 68.88 | 51.17 | 53.67 | 4004 |
2953
+
2954
+
2955
+ Notes:
2956
+ - Evaluation metric: Spearman correlation coefficient.
2957
+ - The `shibing624/text2vec-base-chinese` model is trained with the CoSENT method on Chinese STS-B data, starting from `hfl/chinese-macbert-base`, and achieves good results on the Chinese STS-B test set. Run [examples/training_sup_text_matching_model.py](https://github.com/shibing624/text2vec/blob/master/examples/training_sup_text_matching_model.py) to train the model. The model file has been uploaded to the HF model hub and is recommended for general-purpose Chinese semantic matching tasks.
2958
+ - The `shibing624/text2vec-base-chinese-sentence` model is trained with the CoSENT method, starting from `nghuyong/ernie-3.0-base-zh`, on the manually selected Chinese STS dataset [shibing624/nli-zh-all/text2vec-base-chinese-sentence-dataset](https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/text2vec-base-chinese-sentence-dataset), and achieves good results on various Chinese NLI test sets. Run [examples/training_sup_text_matching_model_jsonl_data.py](https://github.com/shibing624/text2vec/blob/master/examples/training_sup_text_matching_model_jsonl_data.py) to train the model. The model file has been uploaded to the HF model hub and is recommended for Chinese s2s (sentence vs sentence) semantic matching tasks.
2959
+ - The `shibing624/text2vec-base-chinese-paraphrase` model is trained with the CoSENT method, starting from `nghuyong/ernie-3.0-base-zh`, on the manually selected Chinese STS dataset [shibing624/nli-zh-all/text2vec-base-chinese-paraphrase-dataset](https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/text2vec-base-chinese-paraphrase-dataset). Compared with [shibing624/nli-zh-all/text2vec-base-chinese-sentence-dataset](https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/text2vec-base-chinese-sentence-dataset), this dataset adds s2p (sentence to paraphrase) data to strengthen long-text representation, and the evaluation on each Chinese NLI test set reached SOTA. Run [examples/training_sup_text_matching_model_jsonl_data.py](https://github.com/shibing624/text2vec/blob/master/examples/training_sup_text_matching_model_jsonl_data.py) to train the model. The model file has been uploaded to the HF model hub and is recommended for Chinese s2p (sentence vs paragraph) semantic matching tasks.
2960
+ - The `shibing624/text2vec-base-multilingual` model is trained with the CoSENT method, starting from `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`, on the manually selected multilingual STS dataset [shibing624/nli-zh-all/text2vec-base-multilingual-dataset](https://huggingface.co/datasets/shibing624/nli-zh-all/tree/main/text2vec-base-multilingual-dataset). Evaluated on Chinese and English test sets, it improves on the original model. Run [examples/training_sup_text_matching_model_jsonl_data.py](https://github.com/shibing624/text2vec/blob/master/examples/training_sup_text_matching_model_jsonl_data.py) to train the model. The model file has been uploaded to the HF model hub and is recommended for multilingual semantic matching tasks.
2961
+ - `w2v-light-tencent-chinese` is a Word2Vec model built from Tencent word vectors that can be loaded and run on CPU. It is suitable for Chinese text matching tasks and for cold-start situations where data is scarce.
2962
+ - QPS was measured on a Tesla V100 GPU with 32 GB of memory.
2963
+
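The scores in the table above are Spearman coefficients between predicted similarities and gold labels. As a rough illustration of that metric, here is a minimal pure-Python sketch. It is not the text2vec evaluation code: the helper names (`average_ranks`, `pearson`, `spearman`) and the sample numbers are made up for this example, and the sketch assumes neither input is constant.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical model similarities and gold labels for four sentence pairs.
predicted = [0.91, 0.15, 0.55, 0.30]
gold = [5.0, 0.0, 3.0, 2.0]
score = spearman(predicted, gold)
print(score)
```

Because only the ordering matters, a model can score well on this metric even when its raw similarity values are on a different scale from the gold labels.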
2964
+ Model training experiment report: [Experiment report](https://github.com/shibing624/text2vec/blob/master/docs/model_report.md)
2965
+
2966
+ ## Usage (text2vec)
2967
+ Using this model becomes easy when you have [text2vec](https://github.com/shibing624/text2vec) installed:
2968
+
2969
+ ```
2970
+ pip install -U text2vec
2971
+ ```
2972
+
2973
+ Then you can use the model like this:
2974
+
2975
+ ```python
2976
+ from text2vec import SentenceModel
2977
+ sentences = ['如何更换花呗绑定银行卡', 'How to replace the Huabei bundled bank card']
2978
+
2979
+ model = SentenceModel('shibing624/text2vec-base-multilingual')
2980
+ embeddings = model.encode(sentences)
2981
+ print(embeddings)
2982
+ ```
2983
+
2984
+ ## Usage (HuggingFace Transformers)
2985
+ Without [text2vec](https://github.com/shibing624/text2vec), you can use the model like this:
2986
+
2987
+ First, pass your input through the transformer model. Then apply the appropriate pooling operation (mean pooling for this model) on top of the contextualized word embeddings.
2988
+
2989
+ Install transformers:
2990
+ ```
2991
+ pip install transformers
2992
+ ```
2993
+
2994
+ Then load the model and compute embeddings:
2995
+ ```python
2996
+ from transformers import AutoTokenizer, AutoModel
2997
+ import torch
2998
+
2999
+ # Mean Pooling - Take attention mask into account for correct averaging
3000
+ def mean_pooling(model_output, attention_mask):
3001
+     token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
3002
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
3003
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
3004
+
3005
+ # Load model from HuggingFace Hub
3006
+ tokenizer = AutoTokenizer.from_pretrained('shibing624/text2vec-base-multilingual')
3007
+ model = AutoModel.from_pretrained('shibing624/text2vec-base-multilingual')
3008
+ sentences = ['如何更换花呗绑定银行卡', 'How to replace the Huabei bundled bank card']
3009
+ # Tokenize sentences
3010
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
3011
+
3012
+ # Compute token embeddings
3013
+ with torch.no_grad():
3014
+     model_output = model(**encoded_input)
3015
+ # Perform pooling. In this case, mean pooling.
3016
+ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
3017
+ print("Sentence embeddings:")
3018
+ print(sentence_embeddings)
3019
+ ```
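To make the pooling step concrete, here is a minimal pure-Python sketch of masked mean pooling on a toy input. It mirrors the arithmetic of `mean_pooling` above but works on plain lists instead of tensors. The helper name `masked_mean` and the toy values are illustrative assumptions, not part of text2vec or transformers.

```python
def masked_mean(token_embeddings, attention_mask):
    """Average token vectors, ignoring positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            for k in range(dim):
                totals[k] += vector[k]
    # Guard against an all-padding sequence, like the clamp in mean_pooling.
    count = max(sum(attention_mask), 1e-9)
    return [t / count for t in totals]


# Toy sequence: two real tokens followed by one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
pooled = masked_mean(tokens, mask)
print(pooled)
```

The padding vector `[9.0, 9.0]` does not affect the result, which is why the attention mask has to be passed to the pooling function.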
3020
+
3021
+ ## Usage (sentence-transformers)
3022
+ [sentence-transformers](https://github.com/UKPLab/sentence-transformers) is a popular library to compute dense vector representations for sentences.
3023
+
3024
+ Install sentence-transformers:
3025
+ ```
3026
+ pip install -U sentence-transformers
3027
+ ```
3028
+
3029
+ Then load the model and compute embeddings:
3030
+
3031
+ ```python
3032
+ from sentence_transformers import SentenceTransformer
3033
+
3034
+ m = SentenceTransformer("shibing624/text2vec-base-multilingual")
3035
+ sentences = ['如何更换花呗绑定银行卡', 'How to replace the Huabei bundled bank card']
3036
+
3037
+ sentence_embeddings = m.encode(sentences)
3038
+ print("Sentence embeddings:")
3039
+ print(sentence_embeddings)
3040
+ ```
3041
+
3042
+
3043
+ ## Full Model Architecture
3044
+ ```
3045
+ CoSENT(
3046
+   (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
3047
+   (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_mean_tokens': True})
3048
+ )
3049
+ ```
3050
+
3051
+
3052
+ ## Intended uses
3053
+
3054
+ Our model is intended to be used as a sentence and short paragraph encoder. Given an input text, it outputs a vector that captures
3055
+ the semantic information. The sentence vector may be used for information retrieval, clustering or sentence similarity tasks.
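As a rough illustration of the retrieval and similarity use cases, the sketch below ranks candidate vectors by cosine similarity to a query vector. It uses only the standard library and toy 3-dimensional vectors in place of the model's 384-dimensional embeddings. The function names `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not text2vec API.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, 1e-12)


def rank_by_similarity(query, candidates):
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real sentence embeddings.
query = [1.0, 0.0, 0.0]
candidates = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
ranking = rank_by_similarity(query, candidates)
print(ranking)
```

In practice the vectors would come from `model.encode(...)`, and a vector index would likely replace the linear scan for large collections.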
3056
+
3057
+ By default, input text longer than 256 word pieces is truncated.
3058
+
3059
+
3060
+ ## Training procedure
3061
+
3062
+ ### Pre-training
3063
+
3064
+ We use the pretrained [`sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) model.
3065
+ Please refer to the model card for more detailed information about the pre-training procedure.
3066
+
3067
+ ### Fine-tuning
3068
+
3069
+ We fine-tune the model using a contrastive objective. Formally, we compute the cosine similarity for each
3070
+ possible sentence pair in the batch.
3071
+ We then apply a rank loss that compares the similarities of true pairs with those of false pairs.
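The rank loss can be sketched in plain Python as follows. This is a minimal sketch, assuming the commonly described CoSENT formulation, in which every pair labeled as more similar should score a higher cosine similarity than every pair labeled as less similar. The function names and the `scale` value of 20 are assumptions for illustration; this README does not confirm the exact training code or hyperparameters.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, 1e-12)


def cosent_style_loss(pairs, labels, scale=20.0):
    """log(1 + sum of exp(scale * (sim_j - sim_i))) over all i, j with label_i > label_j."""
    sims = [cosine(u, v) for u, v in pairs]
    terms = [0.0]  # exp(0) supplies the "1" inside the logarithm
    for i, label_i in enumerate(labels):
        for j, label_j in enumerate(labels):
            if label_i > label_j:  # pair i should be more similar than pair j
                terms.append(scale * (sims[j] - sims[i]))
    # Numerically stable log-sum-exp.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# One identical pair and one orthogonal pair.
pairs = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
loss_correct = cosent_style_loss(pairs, labels=[1, 0])  # labels agree with the similarities
loss_wrong = cosent_style_loss(pairs, labels=[0, 1])    # labels contradict the similarities
print(loss_correct, loss_wrong)
```

The loss stays near zero when the similarity ordering matches the labels and grows when a false pair scores higher than a true pair, which is the behavior the fine-tuning objective relies on.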
3072
+
3073
+
3074
+ ## Citing & Authors
3075
+ This model was trained by [text2vec](https://github.com/shibing624/text2vec).
3076
+
3077
+ If you find this model helpful, feel free to cite:
3078
+ ```bibtex
3079
+ @software{text2vec,
3080
+ author = {Ming Xu},
3081
+ title = {text2vec: A Tool for Text to Vector},
3082
+ year = {2023},
3083
+ url = {https://github.com/shibing624/text2vec},
3084
+ }
3085
+ ```
all.jsonl ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5f84f8d815484ad61b099db424bcb751cb8b5027deff809f0b55fa2a17682363
3
+ size 31730006
config.json ADDED
@@ -0,0 +1,26 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_name_or_path": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
3
+ "architectures": [
4
+ "BertModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "classifier_dropout": null,
8
+ "gradient_checkpointing": false,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 384,
12
+ "initializer_range": 0.02,
13
+ "intermediate_size": 1536,
14
+ "layer_norm_eps": 1e-12,
15
+ "max_position_embeddings": 512,
16
+ "model_type": "bert",
17
+ "num_attention_heads": 12,
18
+ "num_hidden_layers": 12,
19
+ "pad_token_id": 0,
20
+ "position_embedding_type": "absolute",
21
+ "torch_dtype": "float32",
22
+ "transformers_version": "4.30.1",
23
+ "type_vocab_size": 2,
24
+ "use_cache": true,
25
+ "vocab_size": 250037
26
+ }
eval_results.txt ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ eval_pearson = 0.7896593722697193
2
+ eval_spearman = 0.8097651989584397
modules.json ADDED
@@ -0,0 +1,14 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0ed62ef4c21beacf8f38536f4b7822bb945151ab8dcae0138aec42074790606d
3
+ size 470686253
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ {
2
+ "max_seq_length": 256,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "bos_token": "<s>",
3
+ "cls_token": "<s>",
4
+ "eos_token": "</s>",
5
+ "mask_token": {
6
+ "content": "<mask>",
7
+ "lstrip": true,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false
11
+ },
12
+ "pad_token": "<pad>",
13
+ "sep_token": "</s>",
14
+ "unk_token": "<unk>"
15
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b93bf61272f75c0a0b96b85fa262d2242e8a46008d76095386e98675f0bdd119
3
+ size 17082925
tokenizer_config.json ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "bos_token": "<s>",
3
+ "clean_up_tokenization_spaces": true,
4
+ "cls_token": "<s>",
5
+ "do_lower_case": true,
6
+ "eos_token": "</s>",
7
+ "mask_token": {
8
+ "__type": "AddedToken",
9
+ "content": "<mask>",
10
+ "lstrip": true,
11
+ "normalized": true,
12
+ "rstrip": false,
13
+ "single_word": false
14
+ },
15
+ "model_max_length": 512,
16
+ "pad_token": "<pad>",
17
+ "sep_token": "</s>",
18
+ "strip_accents": null,
19
+ "tokenize_chinese_chars": true,
20
+ "tokenizer_class": "BertTokenizer",
21
+ "unk_token": "<unk>"
22
+ }
unigram.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:71b44701d7efd054205115acfa6ef126c5d2f84bd3affe0c59e48163674d19a6
3
+ size 14763234