Gameselo committed on
Commit
88edc46
1 Parent(s): 1366ac4

Add new SentenceTransformer model.

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
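For orientation: the config above enables only `pooling_mode_mean_tokens`, so the sentence vector is presumably an average of the 768-dimensional token vectors. The following is a minimal pure-Python sketch of masked mean pooling under that assumption. The name `mean_pool` and its arguments are hypothetical, chosen for illustration; this is not the sentence-transformers implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is truthy.

    token_embeddings: list of equal-length lists of floats (one per token).
    attention_mask:   list of 0/1 flags; 0 marks padding to be ignored.
    Both names are illustrative assumptions, not a library API.
    """
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vector):
                total[i] += x
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [t / count for t in total]


# Two real tokens and one padding token (toy 2-dimensional vectors).
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(pooled)
```

The padding token is excluded from both the sum and the divisor, which is what distinguishes masked mean pooling from a plain average over all positions.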
README.md ADDED
@@ -0,0 +1,1861 @@
+ ---
+ language: []
+ library_name: sentence-transformers
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - dataset_size:100K<n<1M
+ - loss:AnglELoss
+ base_model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
+ metrics:
+ - pearson_cosine
+ - spearman_cosine
+ - pearson_manhattan
+ - spearman_manhattan
+ - pearson_euclidean
+ - spearman_euclidean
+ - pearson_dot
+ - spearman_dot
+ - pearson_max
+ - spearman_max
+ widget:
+ - source_sentence: 有些人在路上溜达。
+   sentences:
+   - Folk går
+   - Otururken gitar çalan adam.
+   - ארה"ב קבעה שסוריה השתמשה בנשק כימי
+ - source_sentence: 緬甸以前稱為緬甸。
+   sentences:
+   - 缅甸以前叫缅甸。
+   - This is very contradictory.
+   - 한 남자가 아기를 안고 의자에 앉아 잠들어 있다.
+ - source_sentence: אדם כותב.
+   sentences:
+   - האדם כותב.
+   - questa non è una risposta.
+   - 7 שוטרים נהרגו ו-4 שוטרים נפצעו.
+ - source_sentence: הם מפחדים.
+   sentences:
+   - liên quan đến rủi ro đáng kể;
+   - A man is playing a guitar.
+   - A man is playing a piano.
+ - source_sentence: 一个女人正在洗澡。
+   sentences:
+   - A woman is taking a bath.
+   - En jente børster håret sitt
+   - אדם מחלק תפוח אדמה.
+ pipeline_tag: sentence-similarity
+ model-index:
+ - name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2
+   results:
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: sts dev
+       type: sts-dev
+     metrics:
+     - type: pearson_cosine
+       value: 0.9551466915019567
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9592676437617756
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.9270103565661432
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.9382925369644322
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.9278315400036575
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.9393641949848517
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8760113280718741
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8864509380027734
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9551466915019567
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9592676437617756
+       name: Spearman Max
+   - task:
+       type: semantic-similarity
+       name: Semantic Similarity
+     dataset:
+       name: sts test
+       type: sts-test
+     metrics:
+     - type: pearson_cosine
+       value: 0.9479585032380113
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9514910354916427
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.925192141913064
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.9351648026362221
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.9258239806908134
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.9363652577900217
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8442947652156254
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8435104766124126
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9479585032380113
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9514910354916427
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.9725274765440489
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9766335692570665
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.9382317294386867
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.948654920505423
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.9392057529290415
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.9500099103637895
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8531236460319379
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8611492409185547
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9725274765440489
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9766335692570665
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.8026922386812214
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.8124393788492182
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.7839394479918361
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.7899571854314883
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.7835912695413444
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.7920219916708612
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.7698701769634279
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.781996122357711
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.8026922386812214
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.8124393788492182
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.7795928581740468
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.7703365842088069
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.7903764226370217
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.7829879213871844
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.7911863454505806
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.7841695636601043
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.7077312955932407
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.6914225616023565
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.7911863454505806
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.7841695636601043
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.9112700251605085
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9109414091487618
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8969826303560867
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.8934356058163047
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.8986106629139636
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8954517657266873
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.884386067267308
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8922685778872441
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9112700251605085
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9109414091487618
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.9361870787330656
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9378741534997558
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.9230051982649123
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.9244721677465636
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.9230904520135751
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.9251248730902872
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.9069963151228692
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.9185797530151516
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9361870787330656
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9378741534997558
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.8048757108412675
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.7987027653005363
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8017660413612523
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.7828168153285264
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.8006665075585622
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.7824761741785664
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.7894710045147775
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.7819409907917216
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.8048757108412675
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.7987027653005363
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.8520160385093393
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.8553203530552356
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8464006282913296
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.8409514527398295
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.8467543977447098
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8458591066828018
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8093136598158064
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8153571493902085
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.8520160385093393
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.8553203530552356
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.8751983236341568
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.872701191632785
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8744834146908832
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.8661385734785878
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.874802989814616
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8668384026485944
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8603441420083793
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8519571499551175
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.8751983236341568
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.872701191632785
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.9082404991830442
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9067607122592818
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8908378724095692
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.885184918244054
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.8907567800603056
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8850799779856109
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8888621290344544
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8965880419316619
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9082404991830442
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9067607122592818
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.9249796814520836
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9246785886944904
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.9083667986520362
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.90288714821411
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.9115880396459031
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.9083794061358542
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.9000889923763985
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.9070443969139744
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9249796814520836
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9246785886944904
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.9133091498737149
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.9114826394926738
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8977113793113364
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.8933433506440468
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.8979058595014344
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8937323599537337
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.891219202934611
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8987764114969254
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.9133091498737149
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.9114826394926738
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.8984578585216539
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.8451542547285167
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.8714879175346363
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.8451542547285167
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.8809190484217423
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.8451542547285167
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.8537957222589418
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.8451542547285167
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.8984578585216539
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.8451542547285167
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6494815112978085
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6385354535483773
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.6429493098908716
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6473666993823523
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.6442945700268683
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.6444758519763731
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.6128358976757747
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.6108258021881942
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6494815112978085
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6473666993823523
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.7441341150359049
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.7518021273920814
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.7339108684091178
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.7367402927783612
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.7336764576613932
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.734241088471987
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.6886320720189693
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.698561864698337
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.7441341150359049
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.7518021273920814
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6278594754203957
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6319430830291571
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.543548091135791
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6002053211770223
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.5399866615749636
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.5955360076924765
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.5657998544710718
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.6068611192160528
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6278594754203957
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6319430830291571
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.7778538763931996
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.7875616631597785
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.7425757616272681
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.7789392103102715
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.7437054735775576
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.780583955651507
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.7214423493083364
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.7489073787091952
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.7778538763931996
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.7875616631597785
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.526790729806662
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.5774252131250034
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.41713442172065224
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.5599676717727231
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.42192411421528214
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.5665444422359257
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.49809047501575476
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5367148143234142
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.526790729806662
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.5774252131250034
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6306061651851392
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6383757017928495
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.603366556372183
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6167955278711116
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.6081018686388112
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.6219639110001453
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.5767081284665276
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5831358067917275
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6306061651851392
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6383757017928495
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.5568482062575557
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.5866853707548388
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.49244450938868833
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.5737511662255662
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.49058760093828624
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.5762095703672849
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.4306984514506903
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5470683854030187
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.5568482062575557
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.5866853707548388
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.5776222742798018
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.5749790581441845
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.571787148920759
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.5500811027014174
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.5695499775959532
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.5532223379017994
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.53146407233978
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5190797374963447
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.5776222742798018
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.5749790581441845
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.3571900232473057
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.4335552432730643
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.20808854264339055
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.4354537154533896
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.208616390027902
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.440246452767669
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.22336496195751424
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.3706905558756734
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.3571900232473057
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.440246452767669
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6863427356006826
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6620948502618977
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.6428578762643233
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6483663123081533
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.6424050032110411
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.6485902628925195
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.6352371374824808
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.6159110999161411
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6863427356006826
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6620948502618977
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.7570295008280781
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.7510805416538202
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.7191097960855934
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.7140422377894933
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.7204228437397647
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.7257632200250398
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.7144336778935939
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.7284199759984302
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.7570295008280781
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.7510805416538202
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6502825737911098
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6624635951676386
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.647419285100459
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6589805549915764
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.6516956762905051
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.6667221229271868
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.5646710115576599
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.570198719868156
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6516956762905051
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6667221229271868
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6774230420538705
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6537294853166558
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.6824702119604247
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6324707043840341
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.6905615468119815
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.640725065351179
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.5834798827905125
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5962447037764929
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6905615468119815
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6537294853166558
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.6709478850576526
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6847049462613332
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.6612883666796053
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6906896123993531
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.66070522554664
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.6880796473119815
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.609762034287328
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.6194587632000961
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.6709478850576526
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6906896123993531
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.5977420246846783
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.5798716781400349
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.5974348978243684
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.5952597125560467
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.5949256850264925
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.5935900431326085
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.5042542872226021
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.4968394689744579
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.5977420246846783
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.5952597125560467
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.45623521030042163
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.44220332625465214
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.4154787596532877
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.3836945296053597
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.4111357738180186
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.3821548244303783
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.48625234725541483
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5302744622635869
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.48625234725541483
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.5302744622635869
+       name: Spearman Max
+     - type: pearson_cosine
+       value: 0.5929570742517215
+       name: Pearson Cosine
+     - type: spearman_cosine
+       value: 0.6266361518449931
+       name: Spearman Cosine
+     - type: pearson_manhattan
+       value: 0.5608268850302591
+       name: Pearson Manhattan
+     - type: spearman_manhattan
+       value: 0.6228972623939251
+       name: Spearman Manhattan
+     - type: pearson_euclidean
+       value: 0.5579847474929831
+       name: Pearson Euclidean
+     - type: spearman_euclidean
+       value: 0.6202030126844109
+       name: Spearman Euclidean
+     - type: pearson_dot
+       value: 0.4578333834889949
+       name: Pearson Dot
+     - type: spearman_dot
+       value: 0.5628471668594075
+       name: Spearman Dot
+     - type: pearson_max
+       value: 0.5929570742517215
+       name: Pearson Max
+     - type: spearman_max
+       value: 0.6266361518449931
+       name: Spearman Max
996
+ ---
997
+
998
+ # SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2
999
+
1000
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
1001
+
1002
+ ## Model Details
1003
+
1004
+ ### Model Description
1005
+ - **Model Type:** Sentence Transformer
1006
+ - **Base model:** [sentence-transformers/paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) <!-- at revision 79f2382ceacceacdf38563d7c5d16b9ff8d725d6 -->
1007
+ - **Maximum Sequence Length:** 128 tokens
1008
+ - **Output Dimensionality:** 768 dimensions
1009
+ - **Similarity Function:** Cosine Similarity
1010
+ <!-- - **Training Dataset:** Unknown -->
1011
+ <!-- - **Language:** Unknown -->
1012
+ <!-- - **License:** Unknown -->
1013
+
1014
+ ### Model Sources
1015
+
1016
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
1017
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
1018
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
1019
+
1020
+ ### Full Model Architecture
1021
+
1022
+ ```
1023
+ SentenceTransformer(
1024
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
1025
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
1026
+ )
1027
+ ```
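
As a rough illustration of what the `Pooling` module does when only `pooling_mode_mean_tokens` is enabled, the sketch below averages token vectors over the non-padding positions. It is a pure-Python toy, not the library's implementation; the function name `mean_pool` and the tiny 2-dimensional vectors are assumptions made for the example.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for d in range(dim):
                summed[d] += vector[d]
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Hypothetical token vectors; the last position is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

In the real model each token vector has 768 entries, so the pooled sentence embedding has 768 entries as well.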
1028
+
1029
+ ## Usage
1030
+
1031
+ ### Direct Usage (Sentence Transformers)
1032
+
1033
+ First install the Sentence Transformers library:
1034
+
1035
+ ```bash
1036
+ pip install -U sentence-transformers
1037
+ ```
1038
+
1039
+ Then you can load this model and run inference.
1040
+ ```python
1041
+ from sentence_transformers import SentenceTransformer
1042
+
1043
+ # Download from the 🤗 Hub
1044
+ model = SentenceTransformer("Gameselo/STS-multilingual-mpnet-base-v2")
1045
+ # Run inference
1046
+ sentences = [
1047
+     '一个女人正在洗澡。',
1048
+     'A woman is taking a bath.',
1049
+     'En jente børster håret sitt',
1050
+ ]
1051
+ embeddings = model.encode(sentences)
1052
+ print(embeddings.shape)
1053
+ # [3, 768]
1054
+
1055
+ # Get the similarity scores for the embeddings
1056
+ similarities = model.similarity(embeddings, embeddings)
1057
+ print(similarities.shape)
1058
+ # [3, 3]
1059
+ ```
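
Since the card lists cosine similarity as the similarity function, the pairwise scores can be pictured with the following minimal sketch. It uses only the standard library and made-up 3-dimensional vectors in place of real 768-dimensional embeddings; `cosine_similarity` and `similarity_matrix` are hypothetical helper names, not part of the Sentence Transformers API.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """All-pairs cosine similarities, as a list of rows."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy stand-ins for embeddings: the first two point the same way,
# the third is orthogonal to both.
toy_embeddings = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print(row)
```

Vector length does not affect the score, which is why the first two toy vectors are treated as identical in direction.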
1060
+
1061
+ <!--
1062
+ ### Direct Usage (Transformers)
1063
+
1064
+ <details><summary>Click to see the direct usage in Transformers</summary>
1065
+
1066
+ </details>
1067
+ -->
1068
+
1069
+ <!--
1070
+ ### Downstream Usage (Sentence Transformers)
1071
+
1072
+ You can finetune this model on your own dataset.
1073
+
1074
+ <details><summary>Click to expand</summary>
1075
+
1076
+ </details>
1077
+ -->
1078
+
1079
+ <!--
1080
+ ### Out-of-Scope Use
1081
+
1082
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
1083
+ -->
1084
+
1085
+ ## Evaluation
1086
+
1087
+ ### Metrics
1088
+
1089
+ #### Semantic Similarity
1090
+ * Dataset: `sts-dev`
1091
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1092
+
1093
+ | Metric | Value |
1094
+ |:--------------------|:-----------|
1095
+ | pearson_cosine | 0.9551 |
1096
+ | **spearman_cosine** | **0.9593** |
1097
+ | pearson_manhattan | 0.927 |
1098
+ | spearman_manhattan | 0.9383 |
1099
+ | pearson_euclidean | 0.9278 |
1100
+ | spearman_euclidean | 0.9394 |
1101
+ | pearson_dot | 0.876 |
1102
+ | spearman_dot | 0.8865 |
1103
+ | pearson_max | 0.9551 |
1104
+ | spearman_max | 0.9593 |
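
For readers unfamiliar with the two correlation measures in these tables, here is a small standard-library sketch. It assumes Spearman correlation is computed as Pearson correlation on ranks (ties get averaged ranks) and that the `*_max` rows are the maximum over the four similarity functions, which is consistent with every table in this card. The gold and predicted scores are invented; the last lines reuse the `sts-dev` Pearson values from the table above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented gold labels and model similarities for five sentence pairs.
gold = [0.0, 1.0, 2.5, 4.0, 5.0]
predicted = [0.10, 0.20, 0.40, 0.95, 0.90]
print(pearson(gold, predicted), spearman(gold, predicted))

# Pearson values for `sts-dev`, copied from the table above.
sts_dev_pearson = {"cosine": 0.9551, "manhattan": 0.927, "euclidean": 0.9278, "dot": 0.876}
pearson_max = max(sts_dev_pearson.values())
print(pearson_max)
```

Spearman only looks at the ordering of the scores, so it is less sensitive than Pearson to how the similarity values are scaled.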
1105
+
1106
+ #### Semantic Similarity
1107
+ * Dataset: `sts-test`
1108
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1109
+
1110
+ | Metric | Value |
1111
+ |:--------------------|:-----------|
1112
+ | pearson_cosine | 0.948 |
1113
+ | **spearman_cosine** | **0.9515** |
1114
+ | pearson_manhattan | 0.9252 |
1115
+ | spearman_manhattan | 0.9352 |
1116
+ | pearson_euclidean | 0.9258 |
1117
+ | spearman_euclidean | 0.9364 |
1118
+ | pearson_dot | 0.8443 |
1119
+ | spearman_dot | 0.8435 |
1120
+ | pearson_max | 0.948 |
1121
+ | spearman_max | 0.9515 |
1122
+
1123
+ #### Semantic Similarity
1124
+ * Dataset: `sts-test`
1125
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1126
+
1127
+ | Metric | Value |
1128
+ |:--------------------|:-----------|
1129
+ | pearson_cosine | 0.9725 |
1130
+ | **spearman_cosine** | **0.9766** |
1131
+ | pearson_manhattan | 0.9382 |
1132
+ | spearman_manhattan | 0.9487 |
1133
+ | pearson_euclidean | 0.9392 |
1134
+ | spearman_euclidean | 0.95 |
1135
+ | pearson_dot | 0.8531 |
1136
+ | spearman_dot | 0.8611 |
1137
+ | pearson_max | 0.9725 |
1138
+ | spearman_max | 0.9766 |
1139
+
1140
+ #### Semantic Similarity
1141
+ * Dataset: `sts-test`
1142
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1143
+
1144
+ | Metric | Value |
1145
+ |:--------------------|:-----------|
1146
+ | pearson_cosine | 0.8027 |
1147
+ | **spearman_cosine** | **0.8124** |
1148
+ | pearson_manhattan | 0.7839 |
1149
+ | spearman_manhattan | 0.79 |
1150
+ | pearson_euclidean | 0.7836 |
1151
+ | spearman_euclidean | 0.792 |
1152
+ | pearson_dot | 0.7699 |
1153
+ | spearman_dot | 0.782 |
1154
+ | pearson_max | 0.8027 |
1155
+ | spearman_max | 0.8124 |
1156
+
1157
+ #### Semantic Similarity
1158
+ * Dataset: `sts-test`
1159
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1160
+
1161
+ | Metric | Value |
1162
+ |:--------------------|:-----------|
1163
+ | pearson_cosine | 0.7796 |
1164
+ | **spearman_cosine** | **0.7703** |
1165
+ | pearson_manhattan | 0.7904 |
1166
+ | spearman_manhattan | 0.783 |
1167
+ | pearson_euclidean | 0.7912 |
1168
+ | spearman_euclidean | 0.7842 |
1169
+ | pearson_dot | 0.7077 |
1170
+ | spearman_dot | 0.6914 |
1171
+ | pearson_max | 0.7912 |
1172
+ | spearman_max | 0.7842 |
1173
+
1174
+ #### Semantic Similarity
1175
+ * Dataset: `sts-test`
1176
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1177
+
1178
+ | Metric | Value |
1179
+ |:--------------------|:-----------|
1180
+ | pearson_cosine | 0.9113 |
1181
+ | **spearman_cosine** | **0.9109** |
1182
+ | pearson_manhattan | 0.897 |
1183
+ | spearman_manhattan | 0.8934 |
1184
+ | pearson_euclidean | 0.8986 |
1185
+ | spearman_euclidean | 0.8955 |
1186
+ | pearson_dot | 0.8844 |
1187
+ | spearman_dot | 0.8923 |
1188
+ | pearson_max | 0.9113 |
1189
+ | spearman_max | 0.9109 |
1190
+
1191
+ #### Semantic Similarity
1192
+ * Dataset: `sts-test`
1193
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1194
+
1195
+ | Metric | Value |
1196
+ |:--------------------|:-----------|
1197
+ | pearson_cosine | 0.9362 |
1198
+ | **spearman_cosine** | **0.9379** |
1199
+ | pearson_manhattan | 0.923 |
1200
+ | spearman_manhattan | 0.9245 |
1201
+ | pearson_euclidean | 0.9231 |
1202
+ | spearman_euclidean | 0.9251 |
1203
+ | pearson_dot | 0.907 |
1204
+ | spearman_dot | 0.9186 |
1205
+ | pearson_max | 0.9362 |
1206
+ | spearman_max | 0.9379 |
1207
+
1208
+ #### Semantic Similarity
1209
+ * Dataset: `sts-test`
1210
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1211
+
1212
+ | Metric | Value |
1213
+ |:--------------------|:-----------|
1214
+ | pearson_cosine | 0.8049 |
1215
+ | **spearman_cosine** | **0.7987** |
1216
+ | pearson_manhattan | 0.8018 |
1217
+ | spearman_manhattan | 0.7828 |
1218
+ | pearson_euclidean | 0.8007 |
1219
+ | spearman_euclidean | 0.7825 |
1220
+ | pearson_dot | 0.7895 |
1221
+ | spearman_dot | 0.7819 |
1222
+ | pearson_max | 0.8049 |
1223
+ | spearman_max | 0.7987 |
1224
+
1225
+ #### Semantic Similarity
1226
+ * Dataset: `sts-test`
1227
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1228
+
1229
+ | Metric | Value |
1230
+ |:--------------------|:-----------|
1231
+ | pearson_cosine | 0.852 |
1232
+ | **spearman_cosine** | **0.8553** |
1233
+ | pearson_manhattan | 0.8464 |
1234
+ | spearman_manhattan | 0.841 |
1235
+ | pearson_euclidean | 0.8468 |
1236
+ | spearman_euclidean | 0.8459 |
1237
+ | pearson_dot | 0.8093 |
1238
+ | spearman_dot | 0.8154 |
1239
+ | pearson_max | 0.852 |
1240
+ | spearman_max | 0.8553 |
1241
+
1242
+ #### Semantic Similarity
1243
+ * Dataset: `sts-test`
1244
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1245
+
1246
+ | Metric | Value |
1247
+ |:--------------------|:-----------|
1248
+ | pearson_cosine | 0.8752 |
1249
+ | **spearman_cosine** | **0.8727** |
1250
+ | pearson_manhattan | 0.8745 |
1251
+ | spearman_manhattan | 0.8661 |
1252
+ | pearson_euclidean | 0.8748 |
1253
+ | spearman_euclidean | 0.8668 |
1254
+ | pearson_dot | 0.8603 |
1255
+ | spearman_dot | 0.852 |
1256
+ | pearson_max | 0.8752 |
1257
+ | spearman_max | 0.8727 |
1258
+
1259
+ #### Semantic Similarity
1260
+ * Dataset: `sts-test`
1261
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1262
+
1263
+ | Metric | Value |
1264
+ |:--------------------|:-----------|
1265
+ | pearson_cosine | 0.9082 |
1266
+ | **spearman_cosine** | **0.9068** |
1267
+ | pearson_manhattan | 0.8908 |
1268
+ | spearman_manhattan | 0.8852 |
1269
+ | pearson_euclidean | 0.8908 |
1270
+ | spearman_euclidean | 0.8851 |
1271
+ | pearson_dot | 0.8889 |
1272
+ | spearman_dot | 0.8966 |
1273
+ | pearson_max | 0.9082 |
1274
+ | spearman_max | 0.9068 |
1275
+
1276
+ #### Semantic Similarity
1277
+ * Dataset: `sts-test`
1278
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1279
+
1280
+ | Metric | Value |
1281
+ |:--------------------|:-----------|
1282
+ | pearson_cosine | 0.925 |
1283
+ | **spearman_cosine** | **0.9247** |
1284
+ | pearson_manhattan | 0.9084 |
1285
+ | spearman_manhattan | 0.9029 |
1286
+ | pearson_euclidean | 0.9116 |
1287
+ | spearman_euclidean | 0.9084 |
1288
+ | pearson_dot | 0.9001 |
1289
+ | spearman_dot | 0.907 |
1290
+ | pearson_max | 0.925 |
1291
+ | spearman_max | 0.9247 |
1292
+
1293
+ #### Semantic Similarity
1294
+ * Dataset: `sts-test`
1295
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1296
+
1297
+ | Metric | Value |
1298
+ |:--------------------|:-----------|
1299
+ | pearson_cosine | 0.9133 |
1300
+ | **spearman_cosine** | **0.9115** |
1301
+ | pearson_manhattan | 0.8977 |
1302
+ | spearman_manhattan | 0.8933 |
1303
+ | pearson_euclidean | 0.8979 |
1304
+ | spearman_euclidean | 0.8937 |
1305
+ | pearson_dot | 0.8912 |
1306
+ | spearman_dot | 0.8988 |
1307
+ | pearson_max | 0.9133 |
1308
+ | spearman_max | 0.9115 |
1309
+
1310
+ #### Semantic Similarity
1311
+ * Dataset: `sts-test`
1312
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1313
+
1314
+ | Metric | Value |
1315
+ |:--------------------|:-----------|
1316
+ | pearson_cosine | 0.8985 |
1317
+ | **spearman_cosine** | **0.8452** |
1318
+ | pearson_manhattan | 0.8715 |
1319
+ | spearman_manhattan | 0.8452 |
1320
+ | pearson_euclidean | 0.8809 |
1321
+ | spearman_euclidean | 0.8452 |
1322
+ | pearson_dot | 0.8538 |
1323
+ | spearman_dot | 0.8452 |
1324
+ | pearson_max | 0.8985 |
1325
+ | spearman_max | 0.8452 |
1326
+
1327
+ #### Semantic Similarity
1328
+ * Dataset: `sts-test`
1329
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1330
+
1331
+ | Metric | Value |
1332
+ |:--------------------|:-----------|
1333
+ | pearson_cosine | 0.6495 |
1334
+ | **spearman_cosine** | **0.6385** |
1335
+ | pearson_manhattan | 0.6429 |
1336
+ | spearman_manhattan | 0.6474 |
1337
+ | pearson_euclidean | 0.6443 |
1338
+ | spearman_euclidean | 0.6445 |
1339
+ | pearson_dot | 0.6128 |
1340
+ | spearman_dot | 0.6108 |
1341
+ | pearson_max | 0.6495 |
1342
+ | spearman_max | 0.6474 |
1343
+
1344
+ #### Semantic Similarity
1345
+ * Dataset: `sts-test`
1346
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1347
+
1348
+ | Metric | Value |
1349
+ |:--------------------|:-----------|
1350
+ | pearson_cosine | 0.7441 |
1351
+ | **spearman_cosine** | **0.7518** |
1352
+ | pearson_manhattan | 0.7339 |
1353
+ | spearman_manhattan | 0.7367 |
1354
+ | pearson_euclidean | 0.7337 |
1355
+ | spearman_euclidean | 0.7342 |
1356
+ | pearson_dot | 0.6886 |
1357
+ | spearman_dot | 0.6986 |
1358
+ | pearson_max | 0.7441 |
1359
+ | spearman_max | 0.7518 |
1360
+
1361
+ #### Semantic Similarity
1362
+ * Dataset: `sts-test`
1363
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1364
+
1365
+ | Metric | Value |
1366
+ |:--------------------|:-----------|
1367
+ | pearson_cosine | 0.6279 |
1368
+ | **spearman_cosine** | **0.6319** |
1369
+ | pearson_manhattan | 0.5435 |
1370
+ | spearman_manhattan | 0.6002 |
1371
+ | pearson_euclidean | 0.54 |
1372
+ | spearman_euclidean | 0.5955 |
1373
+ | pearson_dot | 0.5658 |
1374
+ | spearman_dot | 0.6069 |
1375
+ | pearson_max | 0.6279 |
1376
+ | spearman_max | 0.6319 |
1377
+
1378
+ #### Semantic Similarity
1379
+ * Dataset: `sts-test`
1380
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1381
+
1382
+ | Metric | Value |
1383
+ |:--------------------|:-----------|
1384
+ | pearson_cosine | 0.7779 |
1385
+ | **spearman_cosine** | **0.7876** |
1386
+ | pearson_manhattan | 0.7426 |
1387
+ | spearman_manhattan | 0.7789 |
1388
+ | pearson_euclidean | 0.7437 |
1389
+ | spearman_euclidean | 0.7806 |
1390
+ | pearson_dot | 0.7214 |
1391
+ | spearman_dot | 0.7489 |
1392
+ | pearson_max | 0.7779 |
1393
+ | spearman_max | 0.7876 |
1394
+
1395
+ #### Semantic Similarity
1396
+ * Dataset: `sts-test`
1397
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1398
+
1399
+ | Metric | Value |
1400
+ |:--------------------|:-----------|
1401
+ | pearson_cosine | 0.5268 |
1402
+ | **spearman_cosine** | **0.5774** |
1403
+ | pearson_manhattan | 0.4171 |
1404
+ | spearman_manhattan | 0.56 |
1405
+ | pearson_euclidean | 0.4219 |
1406
+ | spearman_euclidean | 0.5665 |
1407
+ | pearson_dot | 0.4981 |
1408
+ | spearman_dot | 0.5367 |
1409
+ | pearson_max | 0.5268 |
1410
+ | spearman_max | 0.5774 |
1411
+
1412
+ #### Semantic Similarity
1413
+ * Dataset: `sts-test`
1414
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1415
+
1416
+ | Metric | Value |
1417
+ |:--------------------|:-----------|
1418
+ | pearson_cosine | 0.6306 |
1419
+ | **spearman_cosine** | **0.6384** |
1420
+ | pearson_manhattan | 0.6034 |
1421
+ | spearman_manhattan | 0.6168 |
1422
+ | pearson_euclidean | 0.6081 |
1423
+ | spearman_euclidean | 0.622 |
1424
+ | pearson_dot | 0.5767 |
1425
+ | spearman_dot | 0.5831 |
1426
+ | pearson_max | 0.6306 |
1427
+ | spearman_max | 0.6384 |
1428
+
1429
+ #### Semantic Similarity
1430
+ * Dataset: `sts-test`
1431
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1432
+
1433
+ | Metric | Value |
1434
+ |:--------------------|:-----------|
1435
+ | pearson_cosine | 0.5568 |
1436
+ | **spearman_cosine** | **0.5867** |
1437
+ | pearson_manhattan | 0.4924 |
1438
+ | spearman_manhattan | 0.5738 |
1439
+ | pearson_euclidean | 0.4906 |
1440
+ | spearman_euclidean | 0.5762 |
1441
+ | pearson_dot | 0.4307 |
1442
+ | spearman_dot | 0.5471 |
1443
+ | pearson_max | 0.5568 |
1444
+ | spearman_max | 0.5867 |
1445
+
1446
+ #### Semantic Similarity
1447
+ * Dataset: `sts-test`
1448
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1449
+
1450
+ | Metric | Value |
1451
+ |:--------------------|:----------|
1452
+ | pearson_cosine | 0.5776 |
1453
+ | **spearman_cosine** | **0.575** |
1454
+ | pearson_manhattan | 0.5718 |
1455
+ | spearman_manhattan | 0.5501 |
1456
+ | pearson_euclidean | 0.5695 |
1457
+ | spearman_euclidean | 0.5532 |
1458
+ | pearson_dot | 0.5315 |
1459
+ | spearman_dot | 0.5191 |
1460
+ | pearson_max | 0.5776 |
1461
+ | spearman_max | 0.575 |
1462
+
1463
+ #### Semantic Similarity
1464
+ * Dataset: `sts-test`
1465
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1466
+
1467
+ | Metric | Value |
1468
+ |:--------------------|:-----------|
1469
+ | pearson_cosine | 0.3572 |
1470
+ | **spearman_cosine** | **0.4336** |
1471
+ | pearson_manhattan | 0.2081 |
1472
+ | spearman_manhattan | 0.4355 |
1473
+ | pearson_euclidean | 0.2086 |
1474
+ | spearman_euclidean | 0.4402 |
1475
+ | pearson_dot | 0.2234 |
1476
+ | spearman_dot | 0.3707 |
1477
+ | pearson_max | 0.3572 |
1478
+ | spearman_max | 0.4402 |
1479
+
1480
+ #### Semantic Similarity
1481
+ * Dataset: `sts-test`
1482
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1483
+
1484
+ | Metric | Value |
1485
+ |:--------------------|:-----------|
1486
+ | pearson_cosine | 0.6863 |
1487
+ | **spearman_cosine** | **0.6621** |
1488
+ | pearson_manhattan | 0.6429 |
1489
+ | spearman_manhattan | 0.6484 |
1490
+ | pearson_euclidean | 0.6424 |
1491
+ | spearman_euclidean | 0.6486 |
1492
+ | pearson_dot | 0.6352 |
1493
+ | spearman_dot | 0.6159 |
1494
+ | pearson_max | 0.6863 |
1495
+ | spearman_max | 0.6621 |
1496
+
1497
+ #### Semantic Similarity
1498
+ * Dataset: `sts-test`
1499
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1500
+
1501
+ | Metric | Value |
1502
+ |:--------------------|:-----------|
1503
+ | pearson_cosine | 0.757 |
1504
+ | **spearman_cosine** | **0.7511** |
1505
+ | pearson_manhattan | 0.7191 |
1506
+ | spearman_manhattan | 0.714 |
1507
+ | pearson_euclidean | 0.7204 |
1508
+ | spearman_euclidean | 0.7258 |
1509
+ | pearson_dot | 0.7144 |
1510
+ | spearman_dot | 0.7284 |
1511
+ | pearson_max | 0.757 |
1512
+ | spearman_max | 0.7511 |
1513
+
1514
+ #### Semantic Similarity
1515
+ * Dataset: `sts-test`
1516
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1517
+
1518
+ | Metric | Value |
1519
+ |:--------------------|:-----------|
1520
+ | pearson_cosine | 0.6503 |
1521
+ | **spearman_cosine** | **0.6625** |
1522
+ | pearson_manhattan | 0.6474 |
1523
+ | spearman_manhattan | 0.659 |
1524
+ | pearson_euclidean | 0.6517 |
1525
+ | spearman_euclidean | 0.6667 |
1526
+ | pearson_dot | 0.5647 |
1527
+ | spearman_dot | 0.5702 |
1528
+ | pearson_max | 0.6517 |
1529
+ | spearman_max | 0.6667 |
1530
+
1531
+ #### Semantic Similarity
1532
+ * Dataset: `sts-test`
1533
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1534
+
1535
+ | Metric | Value |
1536
+ |:--------------------|:-----------|
1537
+ | pearson_cosine | 0.6774 |
1538
+ | **spearman_cosine** | **0.6537** |
1539
+ | pearson_manhattan | 0.6825 |
1540
+ | spearman_manhattan | 0.6325 |
1541
+ | pearson_euclidean | 0.6906 |
1542
+ | spearman_euclidean | 0.6407 |
1543
+ | pearson_dot | 0.5835 |
1544
+ | spearman_dot | 0.5962 |
1545
+ | pearson_max | 0.6906 |
1546
+ | spearman_max | 0.6537 |
1547
+
1548
+ #### Semantic Similarity
1549
+ * Dataset: `sts-test`
1550
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1551
+
1552
+ | Metric | Value |
1553
+ |:--------------------|:-----------|
1554
+ | pearson_cosine | 0.6709 |
1555
+ | **spearman_cosine** | **0.6847** |
1556
+ | pearson_manhattan | 0.6613 |
1557
+ | spearman_manhattan | 0.6907 |
1558
+ | pearson_euclidean | 0.6607 |
1559
+ | spearman_euclidean | 0.6881 |
1560
+ | pearson_dot | 0.6098 |
1561
+ | spearman_dot | 0.6195 |
1562
+ | pearson_max | 0.6709 |
1563
+ | spearman_max | 0.6907 |
1564
+
1565
+ #### Semantic Similarity
1566
+ * Dataset: `sts-test`
1567
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1568
+
1569
+ | Metric | Value |
1570
+ |:--------------------|:-----------|
1571
+ | pearson_cosine | 0.5977 |
1572
+ | **spearman_cosine** | **0.5799** |
1573
+ | pearson_manhattan | 0.5974 |
1574
+ | spearman_manhattan | 0.5953 |
1575
+ | pearson_euclidean | 0.5949 |
1576
+ | spearman_euclidean | 0.5936 |
1577
+ | pearson_dot | 0.5043 |
1578
+ | spearman_dot | 0.4968 |
1579
+ | pearson_max | 0.5977 |
1580
+ | spearman_max | 0.5953 |
1581
+
1582
+ #### Semantic Similarity
1583
+ * Dataset: `sts-test`
1584
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1585
+
1586
+ | Metric | Value |
1587
+ |:--------------------|:-----------|
1588
+ | pearson_cosine | 0.4562 |
1589
+ | **spearman_cosine** | **0.4422** |
1590
+ | pearson_manhattan | 0.4155 |
1591
+ | spearman_manhattan | 0.3837 |
1592
+ | pearson_euclidean | 0.4111 |
1593
+ | spearman_euclidean | 0.3822 |
1594
+ | pearson_dot | 0.4863 |
1595
+ | spearman_dot | 0.5303 |
1596
+ | pearson_max | 0.4863 |
1597
+ | spearman_max | 0.5303 |
1598
+
1599
+ #### Semantic Similarity
1600
+ * Dataset: `sts-test`
1601
+ * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
1602
+
1603
+ | Metric | Value |
1604
+ |:--------------------|:-----------|
1605
+ | pearson_cosine | 0.593 |
1606
+ | **spearman_cosine** | **0.6266** |
1607
+ | pearson_manhattan | 0.5608 |
1608
+ | spearman_manhattan | 0.6229 |
1609
+ | pearson_euclidean | 0.558 |
1610
+ | spearman_euclidean | 0.6202 |
1611
+ | pearson_dot | 0.4578 |
1612
+ | spearman_dot | 0.5628 |
1613
+ | pearson_max | 0.593 |
1614
+ | spearman_max | 0.6266 |
1615
+
1616
+ <!--
1617
+ ## Bias, Risks and Limitations
1618
+
1619
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
1620
+ -->
1621
+
1622
+ <!--
1623
+ ### Recommendations
1624
+
1625
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
1626
+ -->
1627
+
1628
+ ## Training Details
1629
+
1630
+ ### Training Dataset
1631
+
1632
+ #### Unnamed Dataset
1633
+
1634
+
1635
+ * Size: 226,547 training samples
1636
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
1637
+ * Approximate statistics based on the first 1000 samples:
1638
+ | | sentence_0 | sentence_1 | label |
1639
+ |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------|
1640
+ | type | string | string | float |
1641
+ | details | <ul><li>min: 3 tokens</li><li>mean: 20.05 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 19.94 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 1.92</li><li>max: 398.6</li></ul> |
1642
+ * Samples:
1643
+ | sentence_0 | sentence_1 | label |
1644
+ |:-------------------------------------------------------------------|:----------------------------------------------------------------|:---------------------------------|
1645
+ | <code>Bir kadın makineye dikiş dikiyor.</code> | <code>Bir kadın biraz et ekiyor.</code> | <code>0.12</code> |
1646
+ | <code>Snowden 'gegeven vluchtelingendocument door Ecuador'.</code> | <code>Snowden staat op het punt om uit Moskou te vliegen</code> | <code>0.24000000953674316</code> |
1647
+ | <code>Czarny pies idzie mostem przez wodę</code> | <code>Czarny pies nie idzie mostem przez wodę</code> | <code>0.74000000954</code> |
1648
+ * Loss: [<code>AnglELoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#angleloss) with these parameters:
1649
+ ```json
1650
+ {
1651
+ "scale": 20.0,
1652
+ "similarity_fct": "pairwise_angle_sim"
1653
+ }
1654
+ ```
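
To give a feel for the role of `scale`, the sketch below implements the CoSENT-style ranking objective that, according to the Sentence Transformers documentation, `AnglELoss` shares with `CoSENTLoss`. It is a simplification under stated assumptions: the similarity scores are plain numbers supplied by the caller rather than the output of `pairwise_angle_sim`, and `ranking_loss` is a hypothetical name, not a library function.

```python
from math import exp, log


def ranking_loss(scores, labels, scale=20.0):
    """log(1 + sum exp(scale * (s_low - s_high))) over label-ordered pairs.

    For every two examples where labels[i] < labels[j], the model is
    penalized when scores[i] is not below scores[j].
    """
    terms = [0.0]  # exp(0) = 1 supplies the "1 +" inside the logarithm
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] < labels[j]:
                terms.append(scale * (scores[i] - scores[j]))
    # Numerically stable log-sum-exp.
    largest = max(terms)
    return largest + log(sum(exp(t - largest) for t in terms))


labels = [0.0, 0.5, 1.0]
well_ordered = ranking_loss([0.1, 0.5, 0.9], labels)
mis_ordered = ranking_loss([0.9, 0.5, 0.1], labels)
print(well_ordered, mis_ordered)
```

A larger `scale` sharpens the penalty for pairs ranked in the wrong order; the value 20.0 here simply mirrors the parameter listed above.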
1655
+
1656
+ ### Training Hyperparameters
1657
+ #### Non-Default Hyperparameters
1658
+
1659
+ - `per_device_train_batch_size`: 256
1660
+ - `per_device_eval_batch_size`: 256
1661
+ - `num_train_epochs`: 10
1662
+ - `multi_dataset_batch_sampler`: round_robin
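
As a back-of-the-envelope check, the number of optimizer steps implied by these settings can be estimated as follows. The sketch assumes training on a single device (the card does not state the device count) and uses `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` from the full hyperparameter list, together with the 226,547 training samples reported above.

```python
from math import ceil

train_samples = 226_547          # from the "Training Dataset" section
per_device_train_batch_size = 256
num_train_epochs = 10
num_devices = 1                  # assumption: not stated in the card
gradient_accumulation_steps = 1

effective_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
# With dataloader_drop_last=False the final partial batch still counts.
steps_per_epoch = ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, total_steps)
```

With more than one device the per-epoch step count would shrink proportionally, so treat the figure as an upper bound for multi-GPU runs.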
1663
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 256
+ - `per_device_eval_batch_size`: 256
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1
+ - `num_train_epochs`: 10
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: round_robin
+
+ </details>
+
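The schedule-related values listed above (`learning_rate`: 5e-05, `lr_scheduler_type`: linear, `warmup_steps`: 0) suggest a learning rate that decays linearly to zero over the run. The following is a minimal sketch of that behaviour, not the trainer's own code: the function name `linear_lr` is hypothetical, and the total of 8850 optimizer steps is an assumption taken from the final row of the training log.

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 8850,
              warmup_steps: int = 0) -> float:
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup phase (never entered when warmup_steps is 0, as in this run).
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale by the fraction of post-warmup steps still remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Start, midpoint, and end of the assumed 8850-step run.
    for step in (0, 4425, 8850):
        print(step, linear_lr(step))
```

Under these assumptions the rate is about 5e-05 at step 0, half of that at step 4425 (the end of epoch 5), and zero at step 8850.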
+ ### Training Logs
+ | Epoch | Step | Training Loss | sts-dev_spearman_cosine | sts-test_spearman_cosine |
+ |:------:|:----:|:-------------:|:-----------------------:|:------------------------:|
+ | 0.5650 | 500 | 10.9426 | - | - |
+ | 1.0 | 885 | - | 0.9202 | - |
+ | 1.1299 | 1000 | 9.7184 | - | - |
+ | 1.6949 | 1500 | 9.5348 | - | - |
+ | 2.0 | 1770 | - | 0.9400 | - |
+ | 2.2599 | 2000 | 9.4412 | - | - |
+ | 2.8249 | 2500 | 9.3097 | - | - |
+ | 3.0 | 2655 | - | 0.9489 | - |
+ | 3.3898 | 3000 | 9.2357 | - | - |
+ | 3.9548 | 3500 | 9.1594 | - | - |
+ | 4.0 | 3540 | - | 0.9528 | - |
+ | 4.5198 | 4000 | 9.0963 | - | - |
+ | 5.0 | 4425 | - | 0.9553 | - |
+ | 5.0847 | 4500 | 9.0382 | - | - |
+ | 5.6497 | 5000 | 8.9837 | - | - |
+ | 6.0 | 5310 | - | 0.9567 | - |
+ | 6.2147 | 5500 | 8.9403 | - | - |
+ | 6.7797 | 6000 | 8.8841 | - | - |
+ | 7.0 | 6195 | - | 0.9581 | - |
+ | 7.3446 | 6500 | 8.8513 | - | - |
+ | 7.9096 | 7000 | 8.81 | - | - |
+ | 8.0 | 7080 | - | 0.9582 | - |
+ | 8.4746 | 7500 | 8.8069 | - | - |
+ | 9.0 | 7965 | - | 0.9589 | - |
+ | 9.0395 | 8000 | 8.7616 | - | - |
+ | 9.6045 | 8500 | 8.7521 | - | - |
+ | 10.0 | 8850 | - | 0.9593 | 0.6266 |
+
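The `spearman_cosine` columns in the table above report a Spearman rank correlation between the cosine similarity of each sentence pair's embeddings and the pair's gold similarity score. The sketch below illustrates that computation in plain Python under stated assumptions: the helper names (`cosine`, `average_ranks`, `pearson`, `spearman_cosine`) are hypothetical, ties receive average ranks, and this is not the evaluator that produced the numbers above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman_cosine(emb_a, emb_b, gold_scores):
    """Spearman correlation between pairwise cosine similarities and gold scores."""
    sims = [cosine(u, v) for u, v in zip(emb_a, emb_b)]
    return pearson(average_ranks(sims), average_ranks(gold_scores))
```

Because only ranks matter, the metric rewards ordering sentence pairs correctly rather than matching the gold scores' absolute scale.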
+
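The fractional values in the Epoch column of the training log follow from the step count: the dev evaluation at epoch 1.0 is logged at step 885, so one epoch appears to span 885 optimizer steps. The arithmetic can be sketched as follows. The names `STEPS_PER_EPOCH`, `epoch_at`, and `max_examples_per_epoch` are hypothetical, and the example count is only an upper bound per device, assuming every batch is full.

```python
STEPS_PER_EPOCH = 885          # step logged at epoch 1.0 in the training log
PER_DEVICE_BATCH_SIZE = 256    # per_device_train_batch_size


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer updates."""
    return step / STEPS_PER_EPOCH


def max_examples_per_epoch() -> int:
    """Upper bound on training examples one device consumes per epoch."""
    return STEPS_PER_EPOCH * PER_DEVICE_BATCH_SIZE


if __name__ == "__main__":
    print(round(epoch_at(500), 4))    # first logged training-loss row
    print(round(epoch_at(8500), 4))   # last logged training-loss row
    print(max_examples_per_epoch())
```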
+ ### Framework Versions
+ - Python: 3.9.7
+ - Sentence Transformers: 3.0.0
+ - Transformers: 4.40.1
+ - PyTorch: 2.3.0+cu121
+ - Accelerate: 0.29.3
+ - Datasets: 2.19.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### AnglELoss
+ ```bibtex
+ @misc{li2023angleoptimized,
+     title={AnglE-optimized Text Embeddings},
+     author={Xianming Li and Jing Li},
+     year={2023},
+     eprint={2309.12871},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "_name_or_path": "sentence-transformers/paraphrase-multilingual-mpnet-base-v2",
+   "architectures": [
+     "XLMRobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "xlm-roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "output_past": true,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.40.1",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 250002
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.7.0",
+     "pytorch": "1.9.0+cu102"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:366773467a69089fa27001df7a16ff5a033e9063e78826f03c77cd102fa162ce
+ size 1112197096
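The LFS pointer above records only the size of the weights file, but combined with `"torch_dtype": "float32"` from `config.json` it allows a rough parameter-count estimate. The sketch below is back-of-the-envelope arithmetic, not a measurement: it assumes every tensor is stored as 4-byte float32 and ignores the small safetensors header, so the result is a slight overestimate. The names `FILE_SIZE_BYTES` and `approx_param_count` are hypothetical.

```python
FILE_SIZE_BYTES = 1_112_197_096   # "size" field of the model.safetensors LFS pointer
BYTES_PER_FLOAT32 = 4             # assumes all weights are stored as float32


def approx_param_count(file_size_bytes: int,
                       bytes_per_param: int = BYTES_PER_FLOAT32) -> int:
    """Upper-bound estimate of parameters in a weights file (header ignored)."""
    return file_size_bytes // bytes_per_param


if __name__ == "__main__":
    params = approx_param_count(FILE_SIZE_BYTES)
    print(f"about {params / 1e6:.0f}M parameters")
```

Under these assumptions the file holds roughly 278 million parameters.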
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
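The two modules above run in sequence: the Transformer produces one vector per token, and the Pooling module reduces them to a single sentence embedding. Since `1_Pooling/config.json` in this commit enables only `pooling_mode_mean_tokens`, the reduction is a mean over non-padding tokens. The following plain-Python sketch illustrates that idea. The function name `mean_pool` is hypothetical, and the library's own implementation operates on batched tensors rather than lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against division by zero for an all-padding input.
    count = max(count, 1)
    return [component / count for component in total]


if __name__ == "__main__":
    # Two real tokens followed by one padding token, with 2-dimensional vectors
    # standing in for the model's 768-dimensional ones.
    tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
    mask = [1, 1, 0]
    print(mean_pool(tokens, mask))
```

Masking matters because padded positions would otherwise pull the average toward whatever the model emits for `<pad>`.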
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
sentencepiece.bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
+ size 5069051
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cad551d5600a84242d0973327029452a1e3672ba6313c2a3c3d69c4310e12719
+ size 17082987
tokenizer_config.json ADDED
@@ -0,0 +1,61 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "250001": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 128,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "tokenizer_class": "XLMRobertaTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "<unk>"
+ }