liujiarik committed on
Commit
d60bede
1 Parent(s): 268222b

Update README.md

Files changed (1)
  1. README.md +335 -1
README.md CHANGED
@@ -1,5 +1,339 @@
  ---
  license: apache-2.0
  ---
  ## Model Details
  Lim is a general text embedding model(chinese),We are continuously optimizing it.
@@ -23,7 +357,7 @@ model_name="liujiarik/lim_base_zh"
  from sentence_transformers import SentenceTransformer
  sentences = ['我换手机号了', '如果我换手机怎么办?']

- model = SentenceTransformer('{MODEL_NAME}')
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
 
  ---
  license: apache-2.0
+ tags:
+ - mteb
+ model-index:
+ - name: kim_base_zh_v0
+   results:
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_reviews_multi
+       name: MTEB AmazonReviewsClassification (zh)
+       config: zh
+       split: test
+       revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+     metrics:
+     - type: accuracy
+       value: 46.66600000000001
+     - type: f1
+       value: 43.88121213919628
+   - task:
+       type: Clustering
+     dataset:
+       type: C-MTEB/CLSClusteringP2P
+       name: MTEB CLSClusteringP2P
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: v_measure
+       value: 33.55469933811146
+   - task:
+       type: Clustering
+     dataset:
+       type: C-MTEB/CLSClusteringS2S
+       name: MTEB CLSClusteringS2S
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: v_measure
+       value: 36.17977796122646
+   - task:
+       type: Reranking
+     dataset:
+       type: C-MTEB/CMedQAv1-reranking
+       name: MTEB CMedQAv1
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map
+       value: 83.84687250720238
+     - type: mrr
+       value: 86.34579365079364
+   - task:
+       type: Reranking
+     dataset:
+       type: C-MTEB/CMedQAv2-reranking
+       name: MTEB CMedQAv2
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: map
+       value: 84.7457752094449
+     - type: mrr
+       value: 87.41591269841268
+   - task:
+       type: PairClassification
+     dataset:
+       type: C-MTEB/CMNLI
+       name: MTEB Cmnli
+       config: default
+       split: validation
+       revision: None
+     metrics:
+     - type: cos_sim_accuracy
+       value: 70.99218280216476
+     - type: cos_sim_ap
+       value: 79.5838273070596
+     - type: cos_sim_f1
+       value: 73.01215092730762
+     - type: cos_sim_precision
+       value: 67.09108716944172
+     - type: cos_sim_recall
+       value: 80.07949497311199
+     - type: dot_accuracy
+       value: 70.99218280216476
+     - type: dot_ap
+       value: 79.58744690895374
+     - type: dot_f1
+       value: 73.01215092730762
+     - type: dot_precision
+       value: 67.09108716944172
+     - type: dot_recall
+       value: 80.07949497311199
+     - type: euclidean_accuracy
+       value: 70.99218280216476
+     - type: euclidean_ap
+       value: 79.5838273070596
+     - type: euclidean_f1
+       value: 73.01215092730762
+     - type: euclidean_precision
+       value: 67.09108716944172
+     - type: euclidean_recall
+       value: 80.07949497311199
+     - type: manhattan_accuracy
+       value: 70.88394467829224
+     - type: manhattan_ap
+       value: 79.42301231718942
+     - type: manhattan_f1
+       value: 72.72536687631029
+     - type: manhattan_precision
+       value: 65.91297738932168
+     - type: manhattan_recall
+       value: 81.10825344867898
+     - type: max_accuracy
+       value: 70.99218280216476
+     - type: max_ap
+       value: 79.58744690895374
+     - type: max_f1
+       value: 73.01215092730762
+   - task:
+       type: Classification
+     dataset:
+       type: C-MTEB/IFlyTek-classification
+       name: MTEB IFlyTek
+       config: default
+       split: validation
+       revision: None
+     metrics:
+     - type: accuracy
+       value: 47.34128510965756
+     - type: f1
+       value: 35.49963469301016
+   - task:
+       type: Classification
+     dataset:
+       type: C-MTEB/JDReview-classification
+       name: MTEB JDReview
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: accuracy
+       value: 85.66604127579738
+     - type: ap
+       value: 53.038152290755555
+     - type: f1
+       value: 80.14685686902159
+   - task:
+       type: Reranking
+     dataset:
+       type: C-MTEB/Mmarco-reranking
+       name: MTEB MMarcoReranking
+       config: default
+       split: dev
+       revision: None
+     metrics:
+     - type: map
+       value: 20.56449688140155
+     - type: mrr
+       value: 19.60753968253968
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_intent
+       name: MTEB MassiveIntentClassification (zh-CN)
+       config: zh-CN
+       split: test
+       revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
+     metrics:
+     - type: accuracy
+       value: 72.38399462004035
+     - type: f1
+       value: 70.33023134666634
+   - task:
+       type: Classification
+     dataset:
+       type: mteb/amazon_massive_scenario
+       name: MTEB MassiveScenarioClassification (zh-CN)
+       config: zh-CN
+       split: test
+       revision: 7d571f92784cd94a019292a1f45445077d0ef634
+     metrics:
+     - type: accuracy
+       value: 74.87222595830531
+     - type: f1
+       value: 74.25722751562503
+   - task:
+       type: Classification
+     dataset:
+       type: C-MTEB/MultilingualSentiment-classification
+       name: MTEB MultilingualSentiment
+       config: default
+       split: validation
+       revision: None
+     metrics:
+     - type: accuracy
+       value: 76.27000000000001
+     - type: f1
+       value: 75.9660773461064
+   - task:
+       type: PairClassification
+     dataset:
+       type: C-MTEB/OCNLI
+       name: MTEB Ocnli
+       config: default
+       split: validation
+       revision: None
+     metrics:
+     - type: cos_sim_accuracy
+       value: 67.35246345425013
+     - type: cos_sim_ap
+       value: 69.69618171375657
+     - type: cos_sim_f1
+       value: 71.70665459483928
+     - type: cos_sim_precision
+       value: 62.75752773375595
+     - type: cos_sim_recall
+       value: 83.6325237592397
+     - type: dot_accuracy
+       value: 67.35246345425013
+     - type: dot_ap
+       value: 69.69618171375657
+     - type: dot_f1
+       value: 71.70665459483928
+     - type: dot_precision
+       value: 62.75752773375595
+     - type: dot_recall
+       value: 83.6325237592397
+     - type: euclidean_accuracy
+       value: 67.35246345425013
+     - type: euclidean_ap
+       value: 69.69618171375657
+     - type: euclidean_f1
+       value: 71.70665459483928
+     - type: euclidean_precision
+       value: 62.75752773375595
+     - type: euclidean_recall
+       value: 83.6325237592397
+     - type: manhattan_accuracy
+       value: 66.81104493773688
+     - type: manhattan_ap
+       value: 69.33781930832232
+     - type: manhattan_f1
+       value: 71.6342082980525
+     - type: manhattan_precision
+       value: 59.78798586572438
+     - type: manhattan_recall
+       value: 89.33474128827878
+     - type: max_accuracy
+       value: 67.35246345425013
+     - type: max_ap
+       value: 69.69618171375657
+     - type: max_f1
+       value: 71.70665459483928
+   - task:
+       type: Classification
+     dataset:
+       type: C-MTEB/OnlineShopping-classification
+       name: MTEB OnlineShopping
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: accuracy
+       value: 93.05
+     - type: ap
+       value: 91.26069801777923
+     - type: f1
+       value: 93.04149818231389
+   - task:
+       type: Reranking
+     dataset:
+       type: C-MTEB/T2Reranking
+       name: MTEB T2Reranking
+       config: default
+       split: dev
+       revision: None
+     metrics:
+     - type: map
+       value: 65.74883739850293
+     - type: mrr
+       value: 75.47326869136282
+   - task:
+       type: Classification
+     dataset:
+       type: C-MTEB/TNews-classification
+       name: MTEB TNews
+       config: default
+       split: validation
+       revision: None
+     metrics:
+     - type: accuracy
+       value: 53.269999999999996
+     - type: f1
+       value: 51.410630382886445
+   - task:
+       type: Clustering
+     dataset:
+       type: C-MTEB/ThuNewsClusteringP2P
+       name: MTEB ThuNewsClusteringP2P
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: v_measure
+       value: 63.344532225921434
+   - task:
+       type: Clustering
+     dataset:
+       type: C-MTEB/ThuNewsClusteringS2S
+       name: MTEB ThuNewsClusteringS2S
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: v_measure
+       value: 60.33437882010517
+   - task:
+       type: Classification
+     dataset:
+       type: C-MTEB/waimai-classification
+       name: MTEB Waimai
+       config: default
+       split: test
+       revision: None
+     metrics:
+     - type: accuracy
+       value: 87.96000000000002
+     - type: ap
+       value: 72.43737061465443
+     - type: f1
+       value: 86.48668399738767
  ---
  ## Model Details
  Lim is a general text embedding model(chinese),We are continuously optimizing it.
 
  from sentence_transformers import SentenceTransformer
  sentences = ['我换手机号了', '如果我换手机怎么办?']

+ model = SentenceTransformer(model_name)
  embeddings = model.encode(sentences)
  print(embeddings)
  ```
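
The README snippet in this diff stops at printing the raw embeddings. A common next step is to score how similar two embeddings are, which is also what the `cos_sim_*` metrics in the model-index measure. The sketch below is not part of the commit: it uses plain Python with made-up stand-in vectors (`emb_a`, `emb_b`) instead of real model output, and `cosine_similarity` is a hypothetical helper name chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors (assumption): in practice these would be two rows of
# the `embeddings` array returned by model.encode(sentences).
emb_a = [0.1, 0.3, -0.2]
emb_b = [0.2, 0.1, -0.4]

score = cosine_similarity(emb_a, emb_b)
print(f"cosine similarity: {score:.4f}")
```

With real output, the same helper could be called as `cosine_similarity(embeddings[0], embeddings[1])`, assuming `embeddings` holds one vector per input sentence.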