orionweller committed on
Commit
aeb9d60
2 Parent(s): 0d0563c 44ed3a2

merge in main

Files changed (6)
  1. .gitignore +2 -1
  2. EXTERNAL_MODEL_RESULTS.json +0 -0
  3. app.py +0 -0
  4. config.yaml +389 -0
  5. envs.py +48 -0
  6. model_meta.yaml +1308 -0
.gitignore CHANGED
@@ -1 +1,2 @@
- *.pyc
+ *.pyc
+ model_infos.json
EXTERNAL_MODEL_RESULTS.json CHANGED
The diff for this file is too large to render. See raw diff
 
app.py CHANGED
The diff for this file is too large to render. See raw diff
 
config.yaml ADDED
@@ -0,0 +1,389 @@
+ config:
+   REPO_ID: "mteb/leaderboard"
+   RESULTS_REPO: mteb/results
+   LEADERBOARD_NAME: "MTEB Leaderboard"
+ tasks:
+   BitextMining:
+     icon: "🎌"
+     metric: f1
+     metric_description: "[F1](https://huggingface.co/spaces/evaluate-metric/f1)"
+     task_description: "Bitext mining is the task of finding parallel sentences in two languages."
+   Classification:
+     icon: "❤️"
+     metric: accuracy
+     metric_description: "[Accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy)"
+     task_description: "Classification is the task of assigning a label to a text."
+   Clustering:
+     icon: "✨"
+     metric: v_measure
+     metric_description: "Validity Measure (v_measure)"
+     task_description: "Clustering is the task of grouping similar documents together."
+   PairClassification:
+     icon: "🎭"
+     metric: cos_sim_ap
+     metric_description: "Average Precision based on Cosine Similarities (cos_sim_ap)"
+     task_description: "Pair classification is the task of determining whether two texts are similar."
+   Reranking:
+     icon: "🥈"
+     metric: map
+     metric_description: "Mean Average Precision (MAP)"
+     task_description: "Reranking is the task of reordering a list of documents to improve relevance."
+   Retrieval:
+     icon: "🔎"
+     metric: ndcg_at_10
+     metric_description: "Normalized Discounted Cumulative Gain @ k (ndcg_at_10)"
+     task_description: "Retrieval is the task of finding relevant documents for a query."
+   STS:
+     icon: "🤖"
+     metric: cos_sim_spearman
+     metric_description: "Spearman correlation based on cosine similarity"
+     task_description: "Semantic Textual Similarity is the task of determining how similar two texts are."
+   Summarization:
+     icon: "📜"
+     metric: cos_sim_spearman
+     metric_description: "Spearman correlation based on cosine similarity"
+     task_description: "Summarization is the task of generating a summary of a text."
+   InstructionRetrieval:
+     icon: "🔎📋"
+     metric: "p-MRR"
+     metric_description: "paired mean reciprocal rank"
+     task_description: "Retrieval w/Instructions is the task of finding relevant documents for a query that has detailed instructions."
+ boards:
+   en:
+     title: English
+     language_long: "English"
+     has_overall: true
+     acronym: null
+     icon: null
+     special_icons: null
+     credits: null
+     tasks:
+       Classification:
+         - AmazonCounterfactualClassification (en)
+         - AmazonPolarityClassification
+         - AmazonReviewsClassification (en)
+         - Banking77Classification
+         - EmotionClassification
+         - ImdbClassification
+         - MassiveIntentClassification (en)
+         - MassiveScenarioClassification (en)
+         - MTOPDomainClassification (en)
+         - MTOPIntentClassification (en)
+         - ToxicConversationsClassification
+         - TweetSentimentExtractionClassification
+       Clustering:
+         - ArxivClusteringP2P
+         - ArxivClusteringS2S
+         - BiorxivClusteringP2P
+         - BiorxivClusteringS2S
+         - MedrxivClusteringP2P
+         - MedrxivClusteringS2S
+         - RedditClustering
+         - RedditClusteringP2P
+         - StackExchangeClustering
+         - StackExchangeClusteringP2P
+         - TwentyNewsgroupsClustering
+       PairClassification:
+         - SprintDuplicateQuestions
+         - TwitterSemEval2015
+         - TwitterURLCorpus
+       Reranking:
+         - AskUbuntuDupQuestions
+         - MindSmallReranking
+         - SciDocsRR
+         - StackOverflowDupQuestions
+       Retrieval:
+         - ArguAna
+         - ClimateFEVER
+         - CQADupstackRetrieval
+         - DBPedia
+         - FEVER
+         - FiQA2018
+         - HotpotQA
+         - MSMARCO
+         - NFCorpus
+         - NQ
+         - QuoraRetrieval
+         - SCIDOCS
+         - SciFact
+         - Touche2020
+         - TRECCOVID
+       STS:
+         - BIOSSES
+         - SICK-R
+         - STS12
+         - STS13
+         - STS14
+         - STS15
+         - STS16
+         - STS17 (en-en)
+         - STS22 (en)
+         - STSBenchmark
+       Summarization:
+         - SummEval
+   en-x:
+     title: "English-X"
+     language_long: "117 (Pairs of: English & other language)"
+     has_overall: false
+     acronym: null
+     icon: null
+     special_icons: null
+     credits: null
+     tasks:
+       BitextMining: ['BUCC (de-en)', 'BUCC (fr-en)', 'BUCC (ru-en)', 'BUCC (zh-en)', 'Tatoeba (afr-eng)', 'Tatoeba (amh-eng)', 'Tatoeba (ang-eng)', 'Tatoeba (ara-eng)', 'Tatoeba (arq-eng)', 'Tatoeba (arz-eng)', 'Tatoeba (ast-eng)', 'Tatoeba (awa-eng)', 'Tatoeba (aze-eng)', 'Tatoeba (bel-eng)', 'Tatoeba (ben-eng)', 'Tatoeba (ber-eng)', 'Tatoeba (bos-eng)', 'Tatoeba (bre-eng)', 'Tatoeba (bul-eng)', 'Tatoeba (cat-eng)', 'Tatoeba (cbk-eng)', 'Tatoeba (ceb-eng)', 'Tatoeba (ces-eng)', 'Tatoeba (cha-eng)', 'Tatoeba (cmn-eng)', 'Tatoeba (cor-eng)', 'Tatoeba (csb-eng)', 'Tatoeba (cym-eng)', 'Tatoeba (dan-eng)', 'Tatoeba (deu-eng)', 'Tatoeba (dsb-eng)', 'Tatoeba (dtp-eng)', 'Tatoeba (ell-eng)', 'Tatoeba (epo-eng)', 'Tatoeba (est-eng)', 'Tatoeba (eus-eng)', 'Tatoeba (fao-eng)', 'Tatoeba (fin-eng)', 'Tatoeba (fra-eng)', 'Tatoeba (fry-eng)', 'Tatoeba (gla-eng)', 'Tatoeba (gle-eng)', 'Tatoeba (glg-eng)', 'Tatoeba (gsw-eng)', 'Tatoeba (heb-eng)', 'Tatoeba (hin-eng)', 'Tatoeba (hrv-eng)', 'Tatoeba (hsb-eng)', 'Tatoeba (hun-eng)', 'Tatoeba (hye-eng)', 'Tatoeba (ido-eng)', 'Tatoeba (ile-eng)', 'Tatoeba (ina-eng)', 'Tatoeba (ind-eng)', 'Tatoeba (isl-eng)', 'Tatoeba (ita-eng)', 'Tatoeba (jav-eng)', 'Tatoeba (jpn-eng)', 'Tatoeba (kab-eng)', 'Tatoeba (kat-eng)', 'Tatoeba (kaz-eng)', 'Tatoeba (khm-eng)', 'Tatoeba (kor-eng)', 'Tatoeba (kur-eng)', 'Tatoeba (kzj-eng)', 'Tatoeba (lat-eng)', 'Tatoeba (lfn-eng)', 'Tatoeba (lit-eng)', 'Tatoeba (lvs-eng)', 'Tatoeba (mal-eng)', 'Tatoeba (mar-eng)', 'Tatoeba (max-eng)', 'Tatoeba (mhr-eng)', 'Tatoeba (mkd-eng)', 'Tatoeba (mon-eng)', 'Tatoeba (nds-eng)', 'Tatoeba (nld-eng)', 'Tatoeba (nno-eng)', 'Tatoeba (nob-eng)', 'Tatoeba (nov-eng)', 'Tatoeba (oci-eng)', 'Tatoeba (orv-eng)', 'Tatoeba (pam-eng)', 'Tatoeba (pes-eng)', 'Tatoeba (pms-eng)', 'Tatoeba (pol-eng)', 'Tatoeba (por-eng)', 'Tatoeba (ron-eng)', 'Tatoeba (rus-eng)', 'Tatoeba (slk-eng)', 'Tatoeba (slv-eng)', 'Tatoeba (spa-eng)', 'Tatoeba (sqi-eng)', 'Tatoeba (srp-eng)', 'Tatoeba (swe-eng)', 'Tatoeba (swg-eng)', 'Tatoeba (swh-eng)', 'Tatoeba (tam-eng)', 'Tatoeba (tat-eng)', 'Tatoeba (tel-eng)', 'Tatoeba (tgl-eng)', 'Tatoeba (tha-eng)', 'Tatoeba (tuk-eng)', 'Tatoeba (tur-eng)', 'Tatoeba (tzl-eng)', 'Tatoeba (uig-eng)', 'Tatoeba (ukr-eng)', 'Tatoeba (urd-eng)', 'Tatoeba (uzb-eng)', 'Tatoeba (vie-eng)', 'Tatoeba (war-eng)', 'Tatoeba (wuu-eng)', 'Tatoeba (xho-eng)', 'Tatoeba (yid-eng)', 'Tatoeba (yue-eng)', 'Tatoeba (zsm-eng)']
+   zh:
+     title: Chinese
+     language_long: Chinese
+     has_overall: true
+     acronym: C-MTEB
+     icon: "🇨🇳"
+     special_icons:
+       Classification: "🧡"
+     credits: "[FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding)"
+     tasks:
+       Classification:
+         - AmazonReviewsClassification (zh)
+         - IFlyTek
+         - JDReview
+         - MassiveIntentClassification (zh-CN)
+         - MassiveScenarioClassification (zh-CN)
+         - MultilingualSentiment
+         - OnlineShopping
+         - TNews
+         - Waimai
+       Clustering:
+         - CLSClusteringP2P
+         - CLSClusteringS2S
+         - ThuNewsClusteringP2P
+         - ThuNewsClusteringS2S
+       PairClassification:
+         - Cmnli
+         - Ocnli
+       Reranking:
+         - CMedQAv1
+         - CMedQAv2
+         - MMarcoReranking
+         - T2Reranking
+       Retrieval:
+         - CmedqaRetrieval
+         - CovidRetrieval
+         - DuRetrieval
+         - EcomRetrieval
+         - MedicalRetrieval
+         - MMarcoRetrieval
+         - T2Retrieval
+         - VideoRetrieval
+       STS:
+         - AFQMC
+         - ATEC
+         - BQ
+         - LCQMC
+         - PAWSX
+         - QBQTC
+         - STS22 (zh)
+         - STSB
+   da:
+     title: Danish
+     language_long: Danish
+     has_overall: false
+     acronym: null
+     icon: "🇩🇰"
+     special_icons:
+       Classification: "🤍"
+     credits: "[Kenneth Enevoldsen](https://github.com/KennethEnevoldsen), [scandinavian-embedding-benchmark](https://kennethenevoldsen.github.io/scandinavian-embedding-benchmark/)"
+     tasks:
+       BitextMining:
+         - BornholmBitextMining
+       Classification:
+         - AngryTweetsClassification
+         - DanishPoliticalCommentsClassification
+         - DKHateClassification
+         - LccSentimentClassification
+         - MassiveIntentClassification (da)
+         - MassiveScenarioClassification (da)
+         - NordicLangClassification
+         - ScalaDaClassification
+   fr:
+     title: French
+     language_long: "French"
+     has_overall: true
+     acronym: "F-MTEB"
+     icon: "🇫🇷"
+     special_icons:
+       Classification: "💙"
+     credits: "[Lyon-NLP](https://github.com/Lyon-NLP): [Gabriel Sequeira](https://github.com/GabrielSequeira), [Imene Kerboua](https://github.com/imenelydiaker), [Wissam Siblini](https://github.com/wissam-sib), [Mathieu Ciancone](https://github.com/MathieuCiancone), [Marion Schaeffer](https://github.com/schmarion)"
+     tasks:
+       Classification:
+         - AmazonReviewsClassification (fr)
+         - MasakhaNEWSClassification (fra)
+         - MassiveIntentClassification (fr)
+         - MassiveScenarioClassification (fr)
+         - MTOPDomainClassification (fr)
+         - MTOPIntentClassification (fr)
+       Clustering:
+         - AlloProfClusteringP2P
+         - AlloProfClusteringS2S
+         - HALClusteringS2S
+         - MLSUMClusteringP2P
+         - MLSUMClusteringS2S
+         - MasakhaNEWSClusteringP2P (fra)
+         - MasakhaNEWSClusteringS2S (fra)
+       PairClassification:
+         - OpusparcusPC (fr)
+         - PawsX (fr)
+       Reranking:
+         - AlloprofReranking
+         - SyntecReranking
+       Retrieval:
+         - AlloprofRetrieval
+         - BSARDRetrieval
+         - MintakaRetrieval (fr)
+         - SyntecRetrieval
+         - XPQARetrieval (fr)
+       STS:
+         - STS22 (fr)
+         - STSBenchmarkMultilingualSTS (fr)
+         - SICKFr
+       Summarization:
+         - SummEvalFr
+   'no':
+     title: Norwegian
+     language_long: "Norwegian Bokmål"
+     has_overall: false
+     acronym: null
+     icon: "🇳🇴"
+     special_icons:
+       Classification: "💙"
+     credits: "[Kenneth Enevoldsen](https://github.com/KennethEnevoldsen), [scandinavian-embedding-benchmark](https://kennethenevoldsen.github.io/scandinavian-embedding-benchmark/)"
+     tasks:
+       Classification: &id001
+         - NoRecClassification
+         - NordicLangClassification
+         - NorwegianParliament
+         - MassiveIntentClassification (nb)
+         - MassiveScenarioClassification (nb)
+         - ScalaNbClassification
+   instructions:
+     title: English
+     language_long: "English"
+     has_overall: TASK_LIST_RETRIEVAL_INSTRUCTIONS
+     acronym: null
+     icon: null
+     credits: "[Orion Weller, FollowIR](https://arxiv.org/abs/2403.15246)"
+     tasks:
+       InstructionRetrieval:
+         - Robust04InstructionRetrieval
+         - News21InstructionRetrieval
+         - Core17InstructionRetrieval
+   law:
+     title: Law
+     language_long: "English, German, Chinese"
+     has_overall: false
+     acronym: null
+     icon: "⚖️"
+     special_icons: null
+     credits: "[Voyage AI](https://www.voyageai.com/)"
+     tasks:
+       Retrieval:
+         - AILACasedocs
+         - AILAStatutes
+         - GerDaLIRSmall
+         - LeCaRDv2
+         - LegalBenchConsumerContractsQA
+         - LegalBenchCorporateLobbying
+         - LegalQuAD
+         - LegalSummarization
+   de:
+     title: German
+     language_long: "German"
+     has_overall: false
+     acronym: null
+     icon: "🇩🇪"
+     special_icons: null
+     credits: "[Silvan](https://github.com/slvnwhrl)"
+     tasks:
+       Clustering:
+         - BlurbsClusteringP2P
+         - BlurbsClusteringS2S
+         - TenKGnadClusteringP2P
+         - TenKGnadClusteringS2S
+   pl:
+     title: Polish
+     language_long: Polish
+     has_overall: true
+     acronym: null
+     icon: "🇵🇱"
+     special_icons:
+       Classification: "🤍"
+     credits: "[Rafał Poświata](https://github.com/rafalposwiata)"
+     tasks:
+       Classification:
+         - AllegroReviews
+         - CBD
+         - MassiveIntentClassification (pl)
+         - MassiveScenarioClassification (pl)
+         - PAC
+         - PolEmo2.0-IN
+         - PolEmo2.0-OUT
+       Clustering:
+         - 8TagsClustering
+       PairClassification:
+         - CDSC-E
+         - PPC
+         - PSC
+         - SICK-E-PL
+       Retrieval:
+         - ArguAna-PL
+         - DBPedia-PL
+         - FiQA-PL
+         - HotpotQA-PL
+         - MSMARCO-PL
+         - NFCorpus-PL
+         - NQ-PL
+         - Quora-PL
+         - SCIDOCS-PL
+         - SciFact-PL
+         - TRECCOVID-PL
+       STS:
+         - CDSC-R
+         - SICK-R-PL
+         - STS22 (pl)
+   se:
+     title: Swedish
+     language_long: Swedish
+     has_overall: false
+     acronym: null
+     icon: "🇸🇪"
+     special_icons:
+       Classification: "💛"
+     credits: "[Kenneth Enevoldsen](https://github.com/KennethEnevoldsen), [scandinavian-embedding-benchmark](https://kennethenevoldsen.github.io/scandinavian-embedding-benchmark/)"
+     tasks:
+       Classification:
+         - NoRecClassification
+         - NordicLangClassification
+         - NorwegianParliament
+         - MassiveIntentClassification (nb)
+         - MassiveScenarioClassification (nb)
+         - ScalaNbClassification
+   other-cls:
+     title: "Other Languages"
+     language_long: "47 (Only languages not included in the other tabs)"
+     has_overall: false
+     acronym: null
+     icon: null
+     special_icons:
+       Classification: "💜💚💙"
+     credits: null
+     tasks:
+       Classification: ['AmazonCounterfactualClassification (de)', 'AmazonCounterfactualClassification (ja)', 'AmazonReviewsClassification (de)', 'AmazonReviewsClassification (es)', 'AmazonReviewsClassification (fr)', 'AmazonReviewsClassification (ja)', 'AmazonReviewsClassification (zh)', 'MTOPDomainClassification (de)', 'MTOPDomainClassification (es)', 'MTOPDomainClassification (fr)', 'MTOPDomainClassification (hi)', 'MTOPDomainClassification (th)', 'MTOPIntentClassification (de)', 'MTOPIntentClassification (es)', 'MTOPIntentClassification (fr)', 'MTOPIntentClassification (hi)', 'MTOPIntentClassification (th)', 'MassiveIntentClassification (af)', 'MassiveIntentClassification (am)', 'MassiveIntentClassification (ar)', 'MassiveIntentClassification (az)', 'MassiveIntentClassification (bn)', 'MassiveIntentClassification (cy)', 'MassiveIntentClassification (de)', 'MassiveIntentClassification (el)', 'MassiveIntentClassification (es)', 'MassiveIntentClassification (fa)', 'MassiveIntentClassification (fi)', 'MassiveIntentClassification (fr)', 'MassiveIntentClassification (he)', 'MassiveIntentClassification (hi)', 'MassiveIntentClassification (hu)', 'MassiveIntentClassification (hy)', 'MassiveIntentClassification (id)', 'MassiveIntentClassification (is)', 'MassiveIntentClassification (it)', 'MassiveIntentClassification (ja)', 'MassiveIntentClassification (jv)', 'MassiveIntentClassification (ka)', 'MassiveIntentClassification (km)', 'MassiveIntentClassification (kn)', 'MassiveIntentClassification (ko)', 'MassiveIntentClassification (lv)', 'MassiveIntentClassification (ml)', 'MassiveIntentClassification (mn)', 'MassiveIntentClassification (ms)', 'MassiveIntentClassification (my)', 'MassiveIntentClassification (nl)', 'MassiveIntentClassification (pt)', 'MassiveIntentClassification (ro)', 'MassiveIntentClassification (ru)', 'MassiveIntentClassification (sl)', 'MassiveIntentClassification (sq)', 'MassiveIntentClassification (sw)', 'MassiveIntentClassification (ta)', 'MassiveIntentClassification (te)', 'MassiveIntentClassification (th)', 'MassiveIntentClassification (tl)', 'MassiveIntentClassification (tr)', 'MassiveIntentClassification (ur)', 'MassiveIntentClassification (vi)', 'MassiveIntentClassification (zh-TW)', 'MassiveScenarioClassification (af)', 'MassiveScenarioClassification (am)', 'MassiveScenarioClassification (ar)', 'MassiveScenarioClassification (az)', 'MassiveScenarioClassification (bn)', 'MassiveScenarioClassification (cy)', 'MassiveScenarioClassification (de)', 'MassiveScenarioClassification (el)', 'MassiveScenarioClassification (es)', 'MassiveScenarioClassification (fa)', 'MassiveScenarioClassification (fi)', 'MassiveScenarioClassification (fr)', 'MassiveScenarioClassification (he)', 'MassiveScenarioClassification (hi)', 'MassiveScenarioClassification (hu)', 'MassiveScenarioClassification (hy)', 'MassiveScenarioClassification (id)', 'MassiveScenarioClassification (is)', 'MassiveScenarioClassification (it)', 'MassiveScenarioClassification (ja)', 'MassiveScenarioClassification (jv)', 'MassiveScenarioClassification (ka)', 'MassiveScenarioClassification (km)', 'MassiveScenarioClassification (kn)', 'MassiveScenarioClassification (ko)', 'MassiveScenarioClassification (lv)', 'MassiveScenarioClassification (ml)', 'MassiveScenarioClassification (mn)', 'MassiveScenarioClassification (ms)', 'MassiveScenarioClassification (my)', 'MassiveScenarioClassification (nl)', 'MassiveScenarioClassification (pt)', 'MassiveScenarioClassification (ro)', 'MassiveScenarioClassification (ru)', 'MassiveScenarioClassification (sl)', 'MassiveScenarioClassification (sq)', 'MassiveScenarioClassification (sw)', 'MassiveScenarioClassification (ta)', 'MassiveScenarioClassification (te)', 'MassiveScenarioClassification (th)', 'MassiveScenarioClassification (tl)', 'MassiveScenarioClassification (tr)', 'MassiveScenarioClassification (ur)', 'MassiveScenarioClassification (vi)', 'MassiveScenarioClassification (zh-TW)']
+   other-sts:
+     title: Other
+     language_long: "Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Russian, Spanish (Only language combos not included in the other tabs)"
+     has_overall: false
+     acronym: null
+     icon: null
+     special_icons:
+       STS: "👽"
+     credits: null
+     tasks:
+       STS: ["STS17 (ar-ar)", "STS17 (en-ar)", "STS17 (en-de)", "STS17 (en-tr)", "STS17 (es-en)", "STS17 (es-es)", "STS17 (fr-en)", "STS17 (it-en)", "STS17 (ko-ko)", "STS17 (nl-en)", "STS22 (ar)", "STS22 (de)", "STS22 (de-en)", "STS22 (de-fr)", "STS22 (de-pl)", "STS22 (es)", "STS22 (es-en)", "STS22 (es-it)", "STS22 (fr)", "STS22 (fr-pl)", "STS22 (it)", "STS22 (pl)", "STS22 (pl-en)", "STS22 (ru)", "STS22 (tr)", "STS22 (zh-en)", "STSBenchmark"]
envs.py ADDED
@@ -0,0 +1,48 @@
+ import os
+ from yaml import safe_load
+
+ from huggingface_hub import HfApi
+
+ LEADERBOARD_CONFIG_PATH = "config.yaml"
+ with open(LEADERBOARD_CONFIG_PATH, 'r', encoding='utf-8') as f:
+     LEADERBOARD_CONFIG = safe_load(f)
+ MODEL_META_PATH = "model_meta.yaml"
+ with open(MODEL_META_PATH, 'r', encoding='utf-8') as f:
+     MODEL_META = safe_load(f)
+
+ # Try first to get the config from the environment variables, then from the config.yaml file
+ def get_config(name, default):
+     res = None
+
+     if name in os.environ:
+         res = os.environ[name]
+     elif 'config' in LEADERBOARD_CONFIG:
+         res = LEADERBOARD_CONFIG['config'].get(name, None)
+
+     if res is None:
+         return default
+     return res
+
+ def str2bool(v):
+     return str(v).lower() in ("yes", "true", "t", "1")
+
+ # clone / pull the lmeh eval data
+ HF_TOKEN = get_config("HF_TOKEN", None)
+
+ LEADERBOARD_NAME = get_config("LEADERBOARD_NAME", "MTEB Leaderboard")
+
+ REPO_ID = get_config("REPO_ID", "mteb/leaderboard")
+ RESULTS_REPO = get_config("RESULTS_REPO", "mteb/results")
+
+ CACHE_PATH = get_config("HF_HOME", ".")
+ os.environ["HF_HOME"] = CACHE_PATH
+
+ # Check if it is using persistent storage
+ if not os.access(CACHE_PATH, os.W_OK):
+     print(f"No write access to HF_HOME: {CACHE_PATH}. Resetting to current directory.")
+     CACHE_PATH = "."
+     os.environ["HF_HOME"] = CACHE_PATH
+ else:
+     print(f"Write access confirmed for HF_HOME: {CACHE_PATH}")
+
+ API = HfApi(token=HF_TOKEN)
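The `get_config` helper above resolves each setting from the process environment first, then from the `config:` block of config.yaml, and finally from a caller-supplied default. A standalone sketch of that precedence order (the `LEADERBOARD_CONFIG` dict here is a stand-in for what envs.py loads from the YAML file):

```python
import os

# Stand-in for the dict envs.py builds by safe_load-ing config.yaml.
LEADERBOARD_CONFIG = {"config": {"REPO_ID": "mteb/leaderboard"}}

def get_config(name, default):
    # 1. An environment variable always wins.
    if name in os.environ:
        return os.environ[name]
    # 2. Otherwise fall back to the `config:` block of the YAML file.
    res = LEADERBOARD_CONFIG.get("config", {}).get(name)
    # 3. Finally, the caller-supplied default.
    return default if res is None else res

os.environ.pop("REPO_ID", None)  # start from a clean environment
repo_from_yaml = get_config("REPO_ID", "fallback")  # resolved from the YAML
os.environ["REPO_ID"] = "me/my-fork"
repo_from_env = get_config("REPO_ID", "fallback")   # env var overrides YAML
missing = get_config("NOT_SET_ANYWHERE", "fallback")  # default as last resort
```

This layering lets a Space override `REPO_ID` or `RESULTS_REPO` via its environment settings without editing the checked-in config.yaml.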
model_meta.yaml ADDED
@@ -0,0 +1,1308 @@
+ model_meta:
+   Baichuan-text-embedding:
+     link: https://platform.baichuan-ai.com/docs/text-Embedding
+     seq_len: 512
+     size: null
+     dim: 1024
+     is_external: true
+     is_proprietary: true
+     is_sentence_transformers_compatible: false
+   Cohere-embed-english-v3.0:
+     link: https://huggingface.co/Cohere/Cohere-embed-english-v3.0
+     seq_len: 512
+     size: null
+     dim: 1024
+     is_external: true
+     is_proprietary: true
+     is_sentence_transformers_compatible: false
+   Cohere-embed-multilingual-light-v3.0:
+     link: https://huggingface.co/Cohere/Cohere-embed-multilingual-light-v3.0
+     seq_len: 512
+     size: null
+     dim: 384
+     is_external: true
+     is_proprietary: true
+     is_sentence_transformers_compatible: false
+   Cohere-embed-multilingual-v3.0:
+     link: https://huggingface.co/Cohere/Cohere-embed-multilingual-v3.0
+     seq_len: 512
+     size: null
+     dim: 1024
+     is_external: true
+     is_proprietary: true
+     is_sentence_transformers_compatible: false
+   DanskBERT:
+     link: https://huggingface.co/vesteinn/DanskBERT
+     seq_len: 514
+     size: 125
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   FollowIR-7B:
+     link: https://huggingface.co/jhu-clsp/FollowIR-7B
+     seq_len: 4096
+     size: 7240
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   GritLM-7B:
+     link: https://huggingface.co/GritLM/GritLM-7B
+     seq_len: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LASER2:
+     link: https://github.com/facebookresearch/LASER
+     seq_len: N/A
+     size: 43
+     dim: 1024
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Llama-2-supervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-supervised
+     seq_len: 4096
+     size: 6607
+     dim: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Llama-2-unsupervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse
+     seq_len: 4096
+     size: 6607
+     dim: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Meta-Llama-3-supervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised
+     seq_len: 8192
+     size: 7505
+     dim: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Meta-Llama-3-unsupervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse
+     seq_len: 8192
+     size: 7505
+     dim: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Mistral-supervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised
+     seq_len: 32768
+     size: 7111
+     dim: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Mistral-unsupervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse
+     seq_len: 32768
+     size: 7111
+     dim: 4096
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Sheared-Llama-supervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Sheared-LLaMA-mntp-supervised
+     seq_len: 4096
+     size: 1280
+     dim: 2048
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LLM2Vec-Sheared-Llama-unsupervised:
+     link: https://huggingface.co/McGill-NLP/LLM2Vec-Sheared-LLaMA-mntp-unsup-simcse
+     seq_len: 4096
+     size: 1280
+     dim: 2048
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   LaBSE:
+     link: https://huggingface.co/sentence-transformers/LaBSE
+     seq_len: 512
+     size: 471
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   OpenSearch-text-hybrid:
+     link: https://help.aliyun.com/zh/open-search/vector-search-edition/hybrid-retrieval
+     seq_len: 512
+     size: null
+     dim: 1792
+     is_external: true
+     is_proprietary: true
+     is_sentence_transformers_compatible: false
+   all-MiniLM-L12-v2:
+     link: https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2
+     seq_len: 512
+     size: 33
+     dim: 384
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   all-MiniLM-L6-v2:
+     link: https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2
+     seq_len: 512
+     size: 23
+     dim: 384
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   all-mpnet-base-v2:
+     link: https://huggingface.co/sentence-transformers/all-mpnet-base-v2
+     seq_len: 514
+     size: 110
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   allenai-specter:
+     link: https://huggingface.co/sentence-transformers/allenai-specter
+     seq_len: 512
+     size: 110
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-10lang-cased:
+     link: https://huggingface.co/Geotrend/bert-base-10lang-cased
+     seq_len: 512
+     size: 138
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-15lang-cased:
+     link: https://huggingface.co/Geotrend/bert-base-15lang-cased
+     seq_len: 512
+     size: 138
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-25lang-cased:
+     link: https://huggingface.co/Geotrend/bert-base-25lang-cased
+     seq_len: 512
+     size: 138
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-multilingual-cased:
+     link: https://huggingface.co/google-bert/bert-base-multilingual-cased
+     seq_len: 512
+     size: 179
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-multilingual-uncased:
+     link: https://huggingface.co/google-bert/bert-base-multilingual-uncased
+     seq_len: 512
+     size: 168
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-swedish-cased:
+     link: https://huggingface.co/KB/bert-base-swedish-cased
+     seq_len: 512
+     size: 125
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bert-base-uncased:
+     link: https://huggingface.co/bert-base-uncased
+     seq_len: 512
+     size: 110
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bge-base-zh-v1.5:
+     link: https://huggingface.co/BAAI/bge-base-zh-v1.5
+     seq_len: 512
+     size: 102
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bge-large-en-v1.5:
+     link: https://huggingface.co/BAAI/bge-large-en-v1.5
+     seq_len: 512
+     size: null
+     dim: 1024
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   bge-large-zh-noinstruct:
+     link: https://huggingface.co/BAAI/bge-large-zh-noinstruct
+     seq_len: 512
+     size: 326
+     dim: 1024
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bge-large-zh-v1.5:
+     link: https://huggingface.co/BAAI/bge-large-zh-v1.5
+     seq_len: 512
+     size: 326
+     dim: 1024
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bge-small-zh-v1.5:
+     link: https://huggingface.co/BAAI/bge-small-zh-v1.5
+     seq_len: 512
+     size: 24
+     dim: 512
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   bm25:
+     link: https://en.wikipedia.org/wiki/Okapi_BM25
+     size: 0
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: false
+   camembert-base:
+     link: https://huggingface.co/almanach/camembert-base
+     seq_len: 512
+     size: 111
+     dim: 512
+     is_external: false
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   camembert-large:
+     link: https://huggingface.co/almanach/camembert-large
+     seq_len: 512
+     size: 338
+     dim: 768
+     is_external: false
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   contriever-base-msmarco:
+     link: https://huggingface.co/nthakur/contriever-base-msmarco
+     seq_len: 512
+     size: 110
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   cross-en-de-roberta-sentence-transformer:
+     link: https://huggingface.co/T-Systems-onsite/cross-en-de-roberta-sentence-transformer
+     seq_len: 514
+     size: 278
+     dim: 768
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   dfm-encoder-large-v1:
+     link: https://huggingface.co/chcaa/dfm-encoder-large-v1
+     seq_len: 512
+     size: 355
+     dim: 1024
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   dfm-sentence-encoder-large-1:
+     link: https://huggingface.co/chcaa/dfm-encoder-large-v1
+     seq_len: 512
+     size: 355
+     dim: 1024
+     is_external: true
+     is_proprietary: false
+     is_sentence_transformers_compatible: true
+   distilbert-base-25lang-cased:
+     link: https://huggingface.co/Geotrend/distilbert-base-25lang-cased
+     seq_len: 512
328
+ size: 110
329
+ dim: 768
330
+ is_external: false
331
+ is_proprietary: false
332
+ is_sentence_transformers_compatible: true
333
+ distilbert-base-en-fr-cased:
334
+ link: https://huggingface.co/Geotrend/distilbert-base-en-fr-cased
335
+ seq_len: 512
336
+ size: 110
337
+ dim: 768
338
+ is_external: false
339
+ is_proprietary: false
340
+ is_sentence_transformers_compatible: true
341
+ distilbert-base-en-fr-es-pt-it-cased:
342
+ link: https://huggingface.co/Geotrend/distilbert-base-en-fr-es-pt-it-cased
343
+ seq_len: 512
344
+ size: 110
345
+ dim: 768
346
+ is_external: false
347
+ is_proprietary: false
348
+ is_sentence_transformers_compatible: true
349
+ distilbert-base-fr-cased:
350
+ link: https://huggingface.co/Geotrend/distilbert-base-fr-cased
351
+ seq_len: 512
352
+ size: 110
353
+ dim: 768
354
+ is_external: false
355
+ is_proprietary: false
356
+ is_sentence_transformers_compatible: true
357
+ distilbert-base-uncased:
358
+ link: https://huggingface.co/distilbert-base-uncased
359
+ seq_len: 512
360
+ size: 110
361
+ dim: 768
362
+ is_external: false
363
+ is_proprietary: false
364
+ is_sentence_transformers_compatible: true
365
+ distiluse-base-multilingual-cased-v2:
366
+ link: https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2
367
+ seq_len: 512
368
+ size: 135
369
+ dim: 512
370
+ is_external: true
371
+ is_proprietary: false
372
+ is_sentence_transformers_compatible: true
373
+ e5-base-v2:
374
+ link: https://huggingface.co/intfloat/e5-base-v2
375
+ seq_len: 512
376
+ size: 110
377
+ dim: 768
378
+ is_external: true
379
+ is_proprietary: false
380
+ is_sentence_transformers_compatible: true
381
+ e5-base:
382
+ link: https://huggingface.co/intfloat/e5-base
383
+ seq_len: 512
384
+ size: 110
385
+ dim: 768
386
+ is_external: true
387
+ is_proprietary: false
388
+ is_sentence_transformers_compatible: true
389
+ e5-large-v2:
390
+ link: https://huggingface.co/intfloat/e5-large-v2
391
+ seq_len: 512
392
+ size: 335
393
+ dim: 1024
394
+ is_external: true
395
+ is_proprietary: false
396
+ is_sentence_transformers_compatible: true
397
+ e5-large:
398
+ link: https://huggingface.co/intfloat/e5-large
399
+ seq_len: 512
400
+ size: 335
401
+ dim: 1024
402
+ is_external: true
403
+ is_proprietary: false
404
+ is_sentence_transformers_compatible: true
405
+ e5-mistral-7b-instruct:
406
+ link: https://huggingface.co/intfloat/e5-mistral-7b-instruct
407
+ seq_len: 32768
408
+ size: 7111
409
+ dim: 4096
410
+ is_external: true
411
+ is_proprietary: false
412
+ is_sentence_transformers_compatible: true
413
+ e5-small:
414
+ link: https://huggingface.co/intfloat/e5-small
415
+ seq_len: 512
416
+ size: 33
417
+ dim: 384
418
+ is_external: true
419
+ is_proprietary: false
420
+ is_sentence_transformers_compatible: true
421
+ electra-small-nordic:
422
+ link: https://huggingface.co/jonfd/electra-small-nordic
423
+ seq_len: 512
424
+ size: 23
425
+ dim: 256
426
+ is_external: true
427
+ is_proprietary: false
428
+ is_sentence_transformers_compatible: true
429
+ electra-small-swedish-cased-discriminator:
430
+ link: https://huggingface.co/KBLab/electra-small-swedish-cased-discriminator
431
+ seq_len: 512
432
+ size: 16
433
+ dim: 256
434
+ is_external: true
435
+ is_proprietary: false
436
+ is_sentence_transformers_compatible: true
437
+ flan-t5-base:
438
+ link: https://huggingface.co/google/flan-t5-base
439
+ seq_len: 512
440
+ size: 220
441
+ dim: -1
442
+ is_external: true
443
+ is_proprietary: false
444
+ is_sentence_transformers_compatible: true
445
+ flan-t5-large:
446
+ link: https://huggingface.co/google/flan-t5-large
447
+ seq_len: 512
448
+ size: 770
449
+ dim: -1
450
+ is_external: true
451
+ is_proprietary: false
452
+ is_sentence_transformers_compatible: true
453
+ flaubert_base_cased:
454
+ link: https://huggingface.co/flaubert/flaubert_base_cased
455
+ seq_len: 512
456
+ size: 138
457
+ dim: 768
458
+ is_external: true
459
+ is_proprietary: false
460
+ is_sentence_transformers_compatible: true
461
+ flaubert_base_uncased:
462
+ link: https://huggingface.co/flaubert/flaubert_base_uncased
463
+ seq_len: 512
464
+ size: 138
465
+ dim: 768
466
+ is_external: true
467
+ is_proprietary: false
468
+ is_sentence_transformers_compatible: true
469
+ flaubert_large_cased:
470
+ link: https://huggingface.co/flaubert/flaubert_large_cased
471
+ seq_len: 512
472
+ size: 372
473
+ dim: 1024
474
+ is_external: true
475
+ is_proprietary: false
476
+ is_sentence_transformers_compatible: true
477
+ gbert-base:
478
+ link: https://huggingface.co/deepset/gbert-base
479
+ seq_len: 512
480
+ size: 110
481
+ dim: 768
482
+ is_external: true
483
+ is_proprietary: false
484
+ is_sentence_transformers_compatible: true
485
+ gbert-large:
486
+ link: https://huggingface.co/deepset/gbert-large
487
+ seq_len: 512
488
+ size: 337
489
+ dim: 1024
490
+ is_external: true
491
+ is_proprietary: false
492
+ is_sentence_transformers_compatible: true
493
+ gelectra-base:
494
+ link: https://huggingface.co/deepset/gelectra-base
495
+ seq_len: 512
496
+ size: 110
497
+ dim: 768
498
+ is_external: true
499
+ is_proprietary: false
500
+ is_sentence_transformers_compatible: true
501
+ gelectra-large:
502
+ link: https://huggingface.co/deepset/gelectra-large
503
+ seq_len: 512
504
+ size: 335
505
+ dim: 1024
506
+ is_external: true
507
+ is_proprietary: false
508
+ is_sentence_transformers_compatible: true
509
+ glove.6B.300d:
510
+ link: https://huggingface.co/sentence-transformers/average_word_embeddings_glove.6B.300d
511
+ seq_len: N/A
512
+ size: 120
513
+ dim: 300
514
+ is_external: true
515
+ is_proprietary: false
516
+ is_sentence_transformers_compatible: true
517
+ google-gecko-256.text-embedding-preview-0409:
518
+ link: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#latest_models
519
+ seq_len: 2048
520
+ size: 1200
521
+ dim: 256
522
+ is_external: true
523
+ is_proprietary: true
524
+ is_sentence_transformers_compatible: false
525
+ google-gecko.text-embedding-preview-0409:
526
+ link: https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#latest_models
527
+ seq_len: 2048
528
+ size: 1200
529
+ dim: 768
530
+ is_external: true
531
+ is_proprietary: true
532
+ is_sentence_transformers_compatible: false
533
+ gottbert-base:
534
+ link: https://huggingface.co/uklfr/gottbert-base
535
+ seq_len: 512
536
+ size: 127
537
+ dim: 768
538
+ is_external: true
539
+ is_proprietary: false
540
+ is_sentence_transformers_compatible: true
541
+ gtr-t5-base:
542
+ link: https://huggingface.co/sentence-transformers/gtr-t5-base
543
+ seq_len: 512
544
+ size: 110
545
+ dim: 768
546
+ is_external: true
547
+ is_proprietary: false
548
+ is_sentence_transformers_compatible: true
549
+ gtr-t5-large:
550
+ link: https://huggingface.co/sentence-transformers/gtr-t5-large
551
+ seq_len: 512
552
+ size: 168
553
+ dim: 768
554
+ is_external: true
555
+ is_proprietary: false
556
+ is_sentence_transformers_compatible: true
557
+ gtr-t5-xl:
558
+ link: https://huggingface.co/sentence-transformers/gtr-t5-xl
559
+ seq_len: 512
560
+ size: 1240
561
+ dim: 768
562
+ is_external: true
563
+ is_proprietary: false
564
+ is_sentence_transformers_compatible: true
565
+ gtr-t5-xxl:
566
+ link: https://huggingface.co/sentence-transformers/gtr-t5-xxl
567
+ seq_len: 512
568
+ size: 4865
569
+ dim: 768
570
+ is_external: true
571
+ is_proprietary: false
572
+ is_sentence_transformers_compatible: true
573
+ herbert-base-retrieval-v2:
574
+ link: https://huggingface.co/ipipan/herbert-base-retrieval-v2
575
+ seq_len: 514
576
+ size: 125
577
+ dim: 768
578
+ is_external: true
579
+ is_proprietary: false
580
+ is_sentence_transformers_compatible: true
581
+ instructor-base:
582
+ link: https://huggingface.co/hkunlp/instructor-base
583
+ seq_len: N/A
584
+ size: 110
585
+ dim: 768
586
+ is_external: true
587
+ is_proprietary: false
588
+ is_sentence_transformers_compatible: true
589
+ instructor-xl:
590
+ link: https://huggingface.co/hkunlp/instructor-xl
591
+ seq_len: N/A
592
+ size: 1241
593
+ dim: 768
594
+ is_external: true
595
+ is_proprietary: false
596
+ is_sentence_transformers_compatible: true
597
+ komninos:
598
+ link: https://huggingface.co/sentence-transformers/average_word_embeddings_komninos
599
+ seq_len: N/A
600
+ size: 134
601
+ dim: 300
602
+ is_external: true
603
+ is_proprietary: false
604
+ is_sentence_transformers_compatible: true
605
+ llama-2-7b-chat:
606
+ link: https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
607
+ seq_len: 4096
608
+ size: 7000
609
+ dim: -1
610
+ is_external: true
611
+ is_proprietary: false
612
+ is_sentence_transformers_compatible: false
613
+ luotuo-bert-medium:
614
+ link: https://huggingface.co/silk-road/luotuo-bert-medium
615
+ seq_len: 512
616
+ size: 328
617
+ dim: 768
618
+ is_external: true
619
+ is_proprietary: false
620
+ is_sentence_transformers_compatible: true
621
+ m3e-base:
622
+ link: https://huggingface.co/moka-ai/m3e-base
623
+ seq_len: 512
624
+ size: 102
625
+ dim: 768
626
+ is_external: true
627
+ is_proprietary: false
628
+ is_sentence_transformers_compatible: true
629
+ m3e-large:
630
+ link: https://huggingface.co/moka-ai/m3e-large
631
+ seq_len: 512
632
+ size: 102
633
+ dim: 768
634
+ is_external: true
635
+ is_proprietary: false
636
+ is_sentence_transformers_compatible: true
637
+ mistral-7b-instruct-v0.2:
638
+ link: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
639
+ seq_len: 4096
640
+ size: 7000
641
+ dim: -1
642
+ is_external: true
643
+ is_proprietary: false
644
+ is_sentence_transformers_compatible: false
645
+ mistral-embed:
646
+ link: https://docs.mistral.ai/guides/embeddings
647
+ seq_len: null
648
+ size: null
649
+ dim: 1024
650
+ is_external: true
651
+ is_proprietary: true
652
+ is_sentence_transformers_compatible: false
653
+ monobert-large-msmarco:
654
+ link: https://huggingface.co/castorini/monobert-large-msmarco
655
+ seq_len: 512
656
+ size: 770
657
+ dim: -1
658
+ is_external: true
659
+ is_proprietary: false
660
+ is_sentence_transformers_compatible: false
661
+ monot5-3b-msmarco-10k:
662
+ link: https://huggingface.co/castorini/monot5-3b-msmarco-10k
663
+ seq_len: 512
664
+ size: 2480
665
+ dim: -1
666
+ is_external: true
667
+ is_proprietary: false
668
+ is_sentence_transformers_compatible: false
669
+ monot5-base-msmarco-10k:
670
+ link: https://huggingface.co/castorini/monot5-base-msmarco-10k
671
+ seq_len: 512
672
+ size: 220
673
+ dim: -1
674
+ is_external: true
675
+ is_proprietary: false
676
+ is_sentence_transformers_compatible: false
677
+ msmarco-bert-co-condensor:
678
+ link: https://huggingface.co/sentence-transformers/msmarco-bert-co-condensor
679
+ seq_len: 512
680
+ size: 110
681
+ dim: 768
682
+ is_external: true
683
+ is_proprietary: false
684
+ is_sentence_transformers_compatible: true
685
+ multi-qa-MiniLM-L6-cos-v1:
686
+ link: https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1
687
+ seq_len: 512
688
+ size: 23
689
+ dim: 384
690
+ is_external: true
691
+ is_proprietary: false
692
+ is_sentence_transformers_compatible: true
693
+ multilingual-e5-base:
694
+ link: https://huggingface.co/intfloat/multilingual-e5-base
695
+ seq_len: 514
696
+ size: 278
697
+ dim: 768
698
+ is_external: true
699
+ is_proprietary: false
700
+ is_sentence_transformers_compatible: true
701
+ multilingual-e5-large:
702
+ link: https://huggingface.co/intfloat/multilingual-e5-large
703
+ seq_len: 514
704
+ size: 560
705
+ dim: 1024
706
+ is_external: true
707
+ is_proprietary: false
708
+ is_sentence_transformers_compatible: true
709
+ multilingual-e5-small:
710
+ link: https://huggingface.co/intfloat/multilingual-e5-small
711
+ seq_len: 512
712
+ size: 118
713
+ dim: 384
714
+ is_external: true
715
+ is_proprietary: false
716
+ is_sentence_transformers_compatible: true
717
+ nb-bert-base:
718
+ link: https://huggingface.co/NbAiLab/nb-bert-base
719
+ seq_len: 512
720
+ size: 179
721
+ dim: 768
722
+ is_external: true
723
+ is_proprietary: false
724
+ is_sentence_transformers_compatible: true
725
+ nb-bert-large:
726
+ link: https://huggingface.co/NbAiLab/nb-bert-large
727
+ seq_len: 512
728
+ size: 355
729
+ dim: 1024
730
+ is_external: true
731
+ is_proprietary: false
732
+ is_sentence_transformers_compatible: true
733
+ nomic-embed-text-v1.5-128:
734
+ link: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5
735
+ seq_len: 8192
736
+ size: 138
737
+ dim: 128
738
+ is_external: true
739
+ is_proprietary: false
740
+ is_sentence_transformers_compatible: true
741
+ nomic-embed-text-v1.5-256:
742
+ link: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5
743
+ seq_len: 8192
744
+ size: 138
745
+ dim: 256
746
+ is_external: true
747
+ is_proprietary: false
748
+ is_sentence_transformers_compatible: true
749
+ nomic-embed-text-v1.5-512:
750
+ link: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5
751
+ seq_len: 8192
752
+ size: 138
753
+ dim: 512
754
+ is_external: true
755
+ is_proprietary: false
756
+ is_sentence_transformers_compatible: true
757
+ nomic-embed-text-v1.5-64:
758
+ link: https://huggingface.co/nomic-ai/nomic-embed-text-v1.5
759
+ seq_len: 8192
760
+ size: 138
761
+ dim: 64
762
+ is_external: true
763
+ is_proprietary: false
764
+ is_sentence_transformers_compatible: true
765
+ norbert3-base:
766
+ link: https://huggingface.co/ltg/norbert3-base
767
+ seq_len: 512
768
+ size: 131
769
+ dim: 768
770
+ is_external: true
771
+ is_proprietary: false
772
+ is_sentence_transformers_compatible: true
773
+ norbert3-large:
774
+ link: https://huggingface.co/ltg/norbert3-large
775
+ seq_len: 512
776
+ size: 368
777
+ dim: 1024
778
+ is_external: true
779
+ is_proprietary: false
780
+ is_sentence_transformers_compatible: true
781
+ paraphrase-multilingual-MiniLM-L12-v2:
782
+ link: https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
783
+ seq_len: 512
784
+ size: 118
785
+ dim: 384
786
+ is_external: true
787
+ is_proprietary: false
788
+ is_sentence_transformers_compatible: true
789
+ paraphrase-multilingual-mpnet-base-v2:
790
+ link: https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2
791
+ seq_len: 514
792
+ size: 278
793
+ dim: 768
794
+ is_external: true
795
+ is_proprietary: false
796
+ is_sentence_transformers_compatible: true
797
+ sentence-bert-swedish-cased:
798
+ link: https://huggingface.co/KBLab/sentence-bert-swedish-cased
799
+ seq_len: 512
800
+ size: 125
801
+ dim: 768
802
+ is_external: true
803
+ is_proprietary: false
804
+ is_sentence_transformers_compatible: true
805
+ sentence-camembert-base:
806
+ link: https://huggingface.co/dangvantuan/sentence-camembert-base
807
+ seq_len: 512
808
+ size: 110
809
+ dim: 768
810
+ is_external: true
811
+ is_proprietary: false
812
+ is_sentence_transformers_compatible: true
813
+ sentence-camembert-large:
814
+ link: https://huggingface.co/dangvantuan/sentence-camembert-large
815
+ seq_len: 512
816
+ size: 337
817
+ dim: 1024
818
+ is_external: true
819
+ is_proprietary: false
820
+ is_sentence_transformers_compatible: true
821
+ sentence-croissant-llm-base:
822
+ link: https://huggingface.co/Wissam42/sentence-croissant-llm-base
823
+ seq_len: 2048
824
+ size: 1280
825
+ dim: 2048
826
+ is_external: true
827
+ is_proprietary: false
828
+ is_sentence_transformers_compatible: true
829
+ sentence-t5-base:
830
+ link: https://huggingface.co/sentence-transformers/sentence-t5-base
831
+ seq_len: 512
832
+ size: 110
833
+ dim: 768
834
+ is_external: true
835
+ is_proprietary: false
836
+ is_sentence_transformers_compatible: true
837
+ sentence-t5-large:
838
+ link: https://huggingface.co/sentence-transformers/sentence-t5-large
839
+ seq_len: 512
840
+ size: 168
841
+ dim: 768
842
+ is_external: true
843
+ is_proprietary: false
844
+ is_sentence_transformers_compatible: true
845
+ sentence-t5-xl:
846
+ link: https://huggingface.co/sentence-transformers/sentence-t5-xl
847
+ seq_len: 512
848
+ size: 1240
849
+ dim: 768
850
+ is_external: true
851
+ is_proprietary: false
852
+ is_sentence_transformers_compatible: true
853
+ sentence-t5-xxl:
854
+ link: https://huggingface.co/sentence-transformers/sentence-t5-xxl
855
+ seq_len: 512
856
+ size: 4865
857
+ dim: 768
858
+ is_external: true
859
+ is_proprietary: false
860
+ is_sentence_transformers_compatible: true
861
+ silver-retriever-base-v1:
862
+ link: https://huggingface.co/ipipan/silver-retriever-base-v1
863
+ seq_len: 514
864
+ size: 125
865
+ dim: 768
866
+ is_external: true
867
+ is_proprietary: false
868
+ is_sentence_transformers_compatible: true
869
+ st-polish-paraphrase-from-distilroberta:
870
+ link: https://huggingface.co/sdadas/st-polish-paraphrase-from-distilroberta
871
+ seq_len: 514
872
+ size: 125
873
+ dim: 768
874
+ is_external: true
875
+ is_proprietary: false
876
+ is_sentence_transformers_compatible: true
877
+ st-polish-paraphrase-from-mpnet:
878
+ link: https://huggingface.co/sdadas/st-polish-paraphrase-from-mpnet
879
+ seq_len: 514
880
+ size: 125
881
+ dim: 768
882
+ is_external: true
883
+ is_proprietary: false
884
+ is_sentence_transformers_compatible: true
885
+ sup-simcse-bert-base-uncased:
886
+ link: https://huggingface.co/princeton-nlp/sup-simcse-bert-base-uncased
887
+ seq_len: 512
888
+ size: 110
889
+ dim: 768
890
+ is_external: true
891
+ is_proprietary: false
892
+ is_sentence_transformers_compatible: true
893
+ text-embedding-3-large:
894
+ link: https://openai.com/blog/new-embedding-models-and-api-updates
895
+ seq_len: 8191
896
+ size: null
897
+ dim: 3072
898
+ is_external: true
899
+ is_proprietary: true
900
+ is_sentence_transformers_compatible: false
901
+ text-embedding-3-large-256:
902
+ link: https://openai.com/blog/new-embedding-models-and-api-updates
903
+ seq_len: 8191
904
+ size: null
905
+ dim: 256
906
+ is_external: true
907
+ is_proprietary: true
908
+ is_sentence_transformers_compatible: false
909
+ text-embedding-3-small:
910
+ link: https://openai.com/blog/new-embedding-models-and-api-updates
911
+ seq_len: 8191
912
+ size: null
913
+ dim: 1536
914
+ is_external: true
915
+ is_proprietary: true
916
+ is_sentence_transformers_compatible: false
917
+ text-embedding-ada-002:
918
+ link: https://openai.com/blog/new-and-improved-embedding-model
919
+ seq_len: 8191
920
+ size: null
921
+ dim: 1536
922
+ is_external: true
923
+ is_proprietary: true
924
+ is_sentence_transformers_compatible: false
925
+ text-search-ada-001:
926
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
927
+ seq_len: 2046
928
+ size: null
929
+ dim: 1024
930
+ is_external: true
931
+ is_proprietary: true
932
+ is_sentence_transformers_compatible: false
933
+ text-search-ada-doc-001:
934
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
935
+ seq_len: 2046
936
+ size: null
937
+ dim: 1024
938
+ is_external: true
939
+ is_proprietary: true
940
+ is_sentence_transformers_compatible: false
941
+ text-search-ada-query-001:
942
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
943
+ seq_len: 2046
944
+ size: null
945
+ dim: 1024
946
+ is_external: false
947
+ is_proprietary: true
948
+ is_sentence_transformers_compatible: false
949
+ text-search-babbage-001:
950
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
951
+ seq_len: 2046
952
+ size: null
953
+ dim: 2048
954
+ is_external: true
955
+ is_proprietary: true
956
+ is_sentence_transformers_compatible: false
957
+ text-search-curie-001:
958
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
959
+ seq_len: 2046
960
+ size: null
961
+ dim: 4096
962
+ is_external: true
963
+ is_proprietary: true
964
+ is_sentence_transformers_compatible: false
965
+ text-search-davinci-001:
966
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
967
+ seq_len: 2046
968
+ size: null
969
+ dim: 12288
970
+ is_external: true
971
+ is_proprietary: true
972
+ is_sentence_transformers_compatible: false
973
+ text-similarity-ada-001:
974
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
975
+ seq_len: 2046
976
+ size: null
977
+ dim: 1024
978
+ is_external: true
979
+ is_proprietary: true
980
+ is_sentence_transformers_compatible: false
981
+ text-similarity-babbage-001:
982
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
983
+ seq_len: 2046
984
+ size: null
985
+ dim: 2048
986
+ is_external: true
987
+ is_proprietary: true
988
+ is_sentence_transformers_compatible: false
989
+ text-similarity-curie-001:
990
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
991
+ seq_len: 2046
992
+ size: null
993
+ dim: 4096
994
+ is_external: true
995
+ is_proprietary: true
996
+ is_sentence_transformers_compatible: false
997
+ text-similarity-davinci-001:
998
+ link: https://openai.com/blog/introducing-text-and-code-embeddings
999
+ seq_len: 2046
1000
+ size: null
1001
+ dim: 12288
1002
+ is_external: true
1003
+ is_proprietary: true
1004
+ is_sentence_transformers_compatible: false
1005
+ tart-dual-contriever-msmarco:
1006
+ link: https://huggingface.co/orionweller/tart-dual-contriever-msmarco
1007
+ seq_len: 512
1008
+ size: 110
1009
+ dim: 768
1010
+ is_external: true
1011
+ is_proprietary: false
1012
+ is_sentence_transformers_compatible: false
1013
+ tart-full-flan-t5-xl:
1014
+ link: https://huggingface.co/facebook/tart-full-flan-t5-xl
1015
+ seq_len: 512
1016
+ size: 2480
1017
+ dim: -1
1018
+ is_external: true
1019
+ is_proprietary: false
1020
+ is_sentence_transformers_compatible: false
1021
+ text2vec-base-chinese:
1022
+ link: https://huggingface.co/shibing624/text2vec-base-chinese
1023
+ seq_len: 512
1024
+ size: 102
1025
+ dim: 768
1026
+ is_external: true
1027
+ is_proprietary: false
1028
+ is_sentence_transformers_compatible: true
1029
+ text2vec-base-multilingual:
1030
+ link: null
1031
+ seq_len: null
1032
+ size: null
1033
+ dim: null
1034
+ is_external: true
1035
+ is_proprietary: false
1036
+ is_sentence_transformers_compatible: false
1037
+ text2vec-large-chinese:
1038
+ link: https://huggingface.co/GanymedeNil/text2vec-large-chinese
1039
+ seq_len: 512
1040
+ size: 326
1041
+ dim: 1024
1042
+ is_external: true
1043
+ is_proprietary: false
1044
+ is_sentence_transformers_compatible: true
1045
+ titan-embed-text-v1:
1046
+ link: https://docs.aws.amazon.com/bedrock/latest/userguide/embeddings.html
1047
+ seq_len: 8000
1048
+ size: null
1049
+ dim: 1536
1050
+ is_external: true
1051
+ is_proprietary: true
1052
+ is_sentence_transformers_compatible: false
1053
+ udever-bloom-1b1:
1054
+ link: https://huggingface.co/izhx/udever-bloom-1b1
1055
+ seq_len: 2048
1056
+ size: null
1057
+ dim: 1536
1058
+ is_external: true
1059
+ is_proprietary: false
1060
+ is_sentence_transformers_compatible: true
1061
+ udever-bloom-560m:
1062
+ link: https://huggingface.co/izhx/udever-bloom-560m
1063
+ seq_len: 2048
1064
+ size: null
1065
+ dim: 1024
1066
+ is_external: true
1067
+ is_proprietary: false
1068
+ is_sentence_transformers_compatible: true
1069
+ universal-sentence-encoder-multilingual-3:
1070
+ link: https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-3
1071
+ seq_len: 512
1072
+ size: null
1073
+ dim: 512
1074
+ is_external: true
1075
+ is_proprietary: false
1076
+ is_sentence_transformers_compatible: true
1077
+ universal-sentence-encoder-multilingual-large-3:
1078
+ link: https://huggingface.co/vprelovac/universal-sentence-encoder-multilingual-large-3
1079
+ seq_len: 512
1080
+ size: null
1081
+ dim: 512
1082
+ is_external: true
1083
+ is_proprietary: false
1084
+ is_sentence_transformers_compatible: true
1085
+ unsup-simcse-bert-base-uncased:
1086
+ link: https://huggingface.co/princeton-nlp/unsup-simcse-bert-base-uncased
1087
+ seq_len: 512
1088
+ size: 110
1089
+ dim: 768
1090
+ is_external: true
1091
+ is_proprietary: false
1092
+ is_sentence_transformers_compatible: true
1093
+ use-cmlm-multilingual:
1094
+ link: https://huggingface.co/sentence-transformers/use-cmlm-multilingual
1095
+ seq_len: 512
1096
+ size: 472
1097
+ dim: 768
1098
+ is_external: true
1099
+ is_proprietary: false
1100
+ is_sentence_transformers_compatible: true
1101
+ voyage-2:
1102
+ link: https://docs.voyageai.com/embeddings/
1103
+ seq_len: 1024
1104
+ size: null
1105
+ dim: 1024
1106
+ is_external: true
1107
+ is_proprietary: true
1108
+ is_sentence_transformers_compatible: false
1109
+ voyage-code-2:
1110
+ link: https://docs.voyageai.com/embeddings/
1111
+ seq_len: 16000
1112
+ size: null
1113
+ dim: 1536
1114
+ is_external: true
1115
+ is_proprietary: true
1116
+ is_sentence_transformers_compatible: false
1117
+ voyage-large-2-instruct:
1118
+ link: https://docs.voyageai.com/embeddings/
1119
+ seq_len: 16000
1120
+ size: null
1121
+ dim: 1024
1122
+ is_external: true
1123
+ is_proprietary: false
1124
+ is_sentence_transformers_compatible: false
1125
+ voyage-law-2:
1126
+ link: https://docs.voyageai.com/embeddings/
1127
+ seq_len: 4000
1128
+ size: null
1129
+ dim: 1024
1130
+ is_external: true
1131
+ is_proprietary: true
1132
+ is_sentence_transformers_compatible: false
1133
+ voyage-lite-01-instruct:
1134
+ link: https://docs.voyageai.com/embeddings/
1135
+ seq_len: 4000
1136
+ size: null
1137
+ dim: 1024
1138
+ is_external: true
1139
+ is_proprietary: true
1140
+ is_sentence_transformers_compatible: false
1141
+ voyage-lite-02-instruct:
1142
+ link: https://docs.voyageai.com/embeddings/
1143
+ seq_len: 4000
1144
+ size: 1220
1145
+ dim: 1024
1146
+ is_external: true
1147
+ is_proprietary: true
1148
+ is_sentence_transformers_compatible: false
1149
+ xlm-roberta-base:
1150
+ link: https://huggingface.co/xlm-roberta-base
1151
+ seq_len: 514
1152
+ size: 279
1153
+ dim: 768
1154
+ is_external: true
1155
+ is_proprietary: false
1156
+ is_sentence_transformers_compatible: true
1157
+ xlm-roberta-large:
1158
+ link: https://huggingface.co/xlm-roberta-large
1159
+ seq_len: 514
1160
+ size: 560
1161
+ dim: 1024
1162
+ is_external: true
1163
+ is_proprietary: false
1164
+ is_sentence_transformers_compatible: true
1165
+ models_to_skip:
1166
+ - michaelfeil/ct2fast-e5-large-v2
1167
+ - McGill-NLP/LLM2Vec-Sheared-LLaMA-mntp-unsup-simcse
1168
+ - newsrx/instructor-xl
1169
+ - sionic-ai/sionic-ai-v1
1170
+ - lsf1000/bge-evaluation
1171
+ - Intel/bge-small-en-v1.5-sst2
1172
+ - newsrx/instructor-xl-newsrx
1173
+ - McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse
1174
+ - McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-unsup-simcse
1175
+ - davidpeer/gte-small
1176
+ - goldenrooster/multilingual-e5-large
1177
+ - kozistr/fused-large-en
1178
+ - mixamrepijey/instructor-small
1179
+ - McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-supervised
1180
+ - DecisionOptimizationSystem/DeepFeatEmbeddingLargeContext
1181
+ - Intel/bge-base-en-v1.5-sst2-int8-dynamic
1182
+ - morgendigital/multilingual-e5-large-quantized
1183
+ - BAAI/bge-small-en
1184
+ - ggrn/e5-small-v2
1185
+ - vectoriseai/gte-small
1186
+ - giulio98/placeholder
1187
+ - odunola/UAE-Large-VI
1188
+ - vectoriseai/e5-large-v2
1189
+ - gruber/e5-small-v2-ggml
1190
+ - Severian/nomic
1191
+ - arcdev/e5-mistral-7b-instruct
1192
+ - mlx-community/multilingual-e5-base-mlx
1193
+ - michaelfeil/ct2fast-bge-base-en-v1.5
1194
+ - Intel/bge-small-en-v1.5-sst2-int8-static
1195
+ - jncraton/stella-base-en-v2-ct2-int8
1196
+ - vectoriseai/multilingual-e5-large
1197
+ - rlsChapters/Chapters-SFR-Embedding-Mistral
1198
+ - arcdev/SFR-Embedding-Mistral
1199
+ - McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-supervised
1200
+ - McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp-supervised
1201
+ - vectoriseai/gte-base
1202
+ - mixamrepijey/instructor-models
1203
+ - GovCompete/e5-large-v2
1204
+ - ef-zulla/e5-multi-sml-torch
1205
+ - khoa-klaytn/bge-small-en-v1.5-angle
1206
+ - krilecy/e5-mistral-7b-instruct
1207
+ - vectoriseai/bge-base-en-v1.5
1208
+ - vectoriseai/instructor-base
1209
+ - jingyeom/korean_embedding_model
1210
+ - rizki/bgr-tf
1211
+ - barisaydin/bge-base-en
1212
+ - jamesgpt1/zzz
1213
+ - Malmuk1/e5-large-v2_Sharded
1214
+ - vectoriseai/ember-v1
1215
+ - Consensus/instructor-base
1216
+ - barisaydin/bge-small-en
1217
+ - barisaydin/gte-base
1218
+ - woody72/multilingual-e5-base
1219
+ - Einas/einas_ashkar
1220
+ - michaelfeil/ct2fast-bge-large-en-v1.5
1221
+ - vectoriseai/bge-small-en-v1.5
1222
+ - iampanda/Test
1223
+ - cherubhao/yogamodel
1224
+ - ieasybooks/multilingual-e5-large-onnx
1225
+ - jncraton/e5-small-v2-ct2-int8
1226
+ - radames/e5-large
1227
+ - khoa-klaytn/bge-base-en-v1.5-angle
1228
+ - Intel/bge-base-en-v1.5-sst2-int8-static
1229
+ - vectoriseai/e5-large
1230
+ - TitanML/jina-v2-base-en-embed
1231
+ - Koat/gte-tiny
1232
+ - binqiangliu/EmbeddingModlebgelargeENv1.5
1233
+ - beademiguelperez/sentence-transformers-multilingual-e5-small
1234
+ - sionic-ai/sionic-ai-v2
1235
+ - jamesdborin/jina-v2-base-en-embed
1236
+ - maiyad/multilingual-e5-small
1237
+ - dmlls/all-mpnet-base-v2
1238
+ - odunola/e5-base-v2
1239
+ - vectoriseai/bge-large-en-v1.5
1240
+ - vectoriseai/bge-small-en
1241
+ - karrar-alwaili/UAE-Large-V1
1242
+ - t12e/instructor-base
1243
+ - Frazic/udever-bloom-3b-sentence
1244
+ - Geolumina/instructor-xl
1245
+ - hsikchi/dump
1246
+ - recipe/embeddings
1247
+ - michaelfeil/ct2fast-bge-small-en-v1.5
1248
+ - ildodeltaRule/multilingual-e5-large
1249
+ - shubham-bgi/UAE-Large
1250
+ - BAAI/bge-large-en
1251
+ - michaelfeil/ct2fast-e5-small-v2
1252
+ - cgldo/semanticClone
1253
+ - barisaydin/gte-small
1254
+ - aident-ai/bge-base-en-onnx
1255
+ - jamesgpt1/english-large-v1
1256
+ - michaelfeil/ct2fast-e5-small
1257
+ - baseplate/instructor-large-1
1258
+ - newsrx/instructor-large
1259
+ - Narsil/bge-base-en
1260
+ - michaelfeil/ct2fast-e5-large
1261
+ - mlx-community/multilingual-e5-small-mlx
1262
+ - lightbird-ai/nomic
1263
+ - MaziyarPanahi/GritLM-8x7B-GGUF
1264
+ - newsrx/instructor-large-newsrx
1265
+ - dhairya0907/thenlper-get-large
1266
+ - barisaydin/bge-large-en
1267
+ - jncraton/bge-small-en-ct2-int8
1268
+ - retrainai/instructor-xl
1269
+ - BAAI/bge-base-en
1270
+ - gentlebowl/instructor-large-safetensors
1271
+ - d0rj/e5-large-en-ru
1272
+ - atian-chapters/Chapters-SFR-Embedding-Mistral
1273
+ - Intel/bge-base-en-v1.5-sts-int8-static
1274
+ - Intel/bge-base-en-v1.5-sts-int8-dynamic
1275
+ - jncraton/GIST-small-Embedding-v0-ct2-int8
1276
+ - jncraton/gte-tiny-ct2-int8
1277
+ - d0rj/e5-small-en-ru
1278
+ - vectoriseai/e5-small-v2
1279
+ - SmartComponents/bge-micro-v2
1280
+ - michaelfeil/ct2fast-gte-base
1281
+ - vectoriseai/e5-base-v2
1282
+ - Intel/bge-base-en-v1.5-sst2
1283
+ - McGill-NLP/LLM2Vec-Sheared-LLaMA-mntp-supervised
1284
+ - Research2NLP/electrical_stella
1285
+ - weakit-v/bge-base-en-v1.5-onnx
1286
+ - GovCompete/instructor-xl
1287
+ - barisaydin/text2vec-base-multilingual
1288
+ - Intel/bge-small-en-v1.5-sst2-int8-dynamic
1289
+ - jncraton/gte-small-ct2-int8
1290
+ - d0rj/e5-base-en-ru
1291
+ - barisaydin/gte-large
1292
+ - fresha/e5-large-v2-endpoint
1293
+ - vectoriseai/instructor-large
1294
+ - Severian/embed
1295
+ - vectoriseai/e5-base
1296
+ - mlx-community/multilingual-e5-large-mlx
1297
+ - vectoriseai/gte-large
1298
+ - anttip/ct2fast-e5-small-v2-hfie
1299
+ - michaelfeil/ct2fast-gte-large
1300
+ - gizmo-ai/Cohere-embed-multilingual-v3.0
1301
+ - McGill-NLP/LLM2Vec-Llama-2-7b-chat-hf-mntp-unsup-simcse
1302
+ cross_encoders:
1303
+ - FollowIR-7B
1304
+ - flan-t5-base
1305
+ - flan-t5-large
1306
+ - monobert-large-msmarco
1307
+ - monot5-3b-msmarco-10k
1308
+ - monot5-base-msmarco-10