---
license: eupl-1.1
---

👷‍♂️ Work in progress 


# EUBERT Embedding v1

Built on the masked language model EUBERT, this sentence transformer computes embeddings for various EU documents in 24 languages.

- Number of dimensions: 768
- Pre-trained model: EUBERT
- Fine-tuning dataset: AllNLI

```python
from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (downloaded on first use).
model = SentenceTransformer('EuropeanParliament/eubert_embedding_v1')

# Encode one sentence into a single 768-dimensional vector.
vector = model.encode("Based on the masked language model EUBERT this sentence transformer will allow to compute embeddings on various EU documents in 24 languages.")
```
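A common next step with sentence embeddings is to compare texts by cosine similarity. The sketch below is illustrative and not part of the model itself: `cosine_similarity`, `rank_by_similarity` and `embed` are hypothetical helper names, and `embed` assumes that `sentence_transformers` is installed and that the model can be downloaded.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: Sequence[float],
                       doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def embed(texts: List[str]) -> List[List[float]]:
    """Hypothetical helper: encode texts with the model from this card."""
    # Imported lazily so the helpers above work without the dependency.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer('EuropeanParliament/eubert_embedding_v1')
    return model.encode(texts).tolist()
```

With the model available, `rank_by_similarity(embed([query])[0], embed(documents))` would order `documents` by similarity to `query`.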

Evaluation and benchmarking contributions are welcome.



| task | dataset | name | config | split | revision | accuracy | ap | f1 | v_measure |
|-|-|-|-|-|-|-|-|-|-|
| Classification | mteb/amazon_counterfactual | MTEB AmazonCounterfactualClassification (en) | en | test | e8379541af4e31359cca9fbcf4b00f2671dba205 | 65.46268656716417 | 28.448646125211685 | 59.381505835828655 |  |
| Classification | mteb/amazon_polarity | MTEB AmazonPolarityClassification | default | test | e2d317d38cd51312af73b3d32a06d1a08b442046 | 66.46035 | 61.29404861567824 | 66.33660156778977 |  |
| Classification | mteb/amazon_reviews_multi | MTEB AmazonReviewsClassification (en) | en | test | 1399c76144fd37290681b995c656ef9b2e06e26d | 33.002 |  | 32.703439998458286 |  |
| Clustering | mteb/arxiv-clustering-p2p | MTEB ArxivClusteringP2P | default | test | a122ad7f3f0291bf49cc6f4d32aa80929df69d5d |  |  |  | 26.726296122407874 |
| Classification | mteb/banking77 | MTEB Banking77Classification | default | test | 0fd18e25b25c072e09e0d92ab615fda904d66300 | 72.07792207792207 |  | 72.00698905672714 |  |
| Classification | mteb/emotion | MTEB EmotionClassification | default | test | 4f58c6b202a23cf9a4da393831edf4f9183cad37 | 25.45 |  | 22.489051015009604 |  |  
| Classification | mteb/imdb | MTEB ImdbClassification | default | test | 3d86128a09e091d6018b6d26cad27f2739fc2db7 | 61.0288 | 56.84210754735158 | 60.72244426285243 |  |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (en) | en | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 78.63657090743274 |  | 77.33756273016937 |  |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (de) | de | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 67.63313609467455 |  | 65.31424834681424 |  |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (es) | es | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 72.03468979319545 |  | 70.33858350063844 |  |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (fr) | fr | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 69.33604760413404 |  | 67.2763398514464 |  |  
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (hi) | hi | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 19.336679813553243 |  | 17.640206592911305 |  |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (th) | th | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 14.958408679927668 |  | 12.200892995648038 |  |
| Classification | mteb/mtop_intent | MTEB MTOPIntentClassification (en) | en | test | ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba | 53.504331965344285 |  | 37.650916452762054 |  |  
| Classification | mteb/mtop_intent | MTEB MTOPIntentClassification (de) | de | test | ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba | 52.8007889546351 |  | 35.18483837593346 |  |
| Classification | mteb/mtop_intent | MTEB MTOPIntentClassification (es) | es | test | ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba | 53.268845897264846 |  | 37.54041476398511 |  |
| Classification | mteb/mtop_intent | MTEB MTOPIntentClassification (fr) | fr | test | ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba | 47.59160663952396 |  | 33.779636915265606 |  |
| Classification | mteb/mtop_intent | MTEB MTOPIntentClassification (hi) | hi | test | ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba | 4.180709931875224 |  | 2.240473672484894 |  |  
| Classification | mteb/mtop_intent | MTEB MTOPIntentClassification (th) | th | test | ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba | 4.1482820976491865 |  | 2.2953415174353546 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (af) | af | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 43.843308675184936 |  | 42.83274171307546 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (am) | am | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 8.459986550100874 |  | 8.56499841559428 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (ar) | ar | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 24.37457969065232 |  | 23.648464353469087 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (az) | az | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 43.61129791526564 |  | 43.02872726206446 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (bn) | bn | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 3.127101546738399 |  | 1.7632874555194573 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (cy) | cy | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 39.882313382649635 |  | 39.09054995553107 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (da) | da | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 49.05514458641561 |  | 47.97657474719148 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (de) | de | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 47.723604572965705 |  | 46.266605736862424 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (el) | el | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 49.2871553463349 |  | 49.110660419740945 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (en) | en | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 54.80833893745797 |  | 53.43307984316261 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (es) | es | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 48.73234700739745 |  | 48.290537885757345 |  | 
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (fa) | fa | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 34.60322797579018 |  | 33.21866171174647 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (fi) | fi | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 47.09818426361803 |  | 46.24034140543536 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (fr) | fr | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 47.92871553463349 |  | 47.2879827826325 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (he) | he | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 23.429724277067923 |  | 22.973698726459283 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (hi) | hi | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 3.1909885675857437 |  | 2.343483452751791 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (hu) | hu | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 46.529926025554815 |  | 45.585210075220026 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (hy) | hy | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 33.00605245460659 |  | 32.53906554922222 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (id) | id | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 44.70073974445191 |  | 44.63772874280639 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (is) | is | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 42.56556825823806 |  | 42.09519069412614 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (it) | it | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 49.45191661062542 |  | 49.73648735452711 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (ja) | ja | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 36.03227975790181 |  | 34.81337003018146 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (jv) | jv | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 39.85205110961668 |  | 39.16645932365053 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (ka) | ka | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 29.84532616005381 |  | 30.048107009813975 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (km) | km | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 5.4942837928715536 |  | 3.9402294020821236 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (kn) | kn | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 3.5541358439811694 |  | 2.3408708229868385 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (ko) | ko | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 31.055817081371888 |  | 30.54791134524761 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (lv) | lv | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 48.44989912575656 |  | 47.46077758238515 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (ml) | ml | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 2.89172831203766 |  | 1.1484871860887453 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (mn) | mn | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 38.924008069939475 |  | 38.953938082398274 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (ms) | ms | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 43.25151311365165 |  | 42.31124560201582 |  |  
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (my) | my | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 3.5137861466039007 |  | 1.7087643302156377 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (nb) | nb | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 45.34633490248823 |  | 44.7188441016561 |  |
| Classification | mteb/amazon_massive_intent | MTEB MassiveIntentClassification (nl) | nl | test | 31efe3c427b0bae9c22cbb560b8f15491cc6bed7 | 47.25285810356422 |  | 45.442034061197944 |  |  
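To summarise a table like the one above, the per-language rows can be averaged per dataset. This is a minimal sketch in plain Python, assuming the table is available as markdown text. `parse_markdown_table` and `mean_accuracy_by_dataset` are hypothetical helper names, and `SAMPLE` holds three rows copied from the table.

```python
from collections import defaultdict
from statistics import mean

SAMPLE = """
| task | dataset | name | config | split | revision | accuracy | ap | f1 | v_measure |
|-|-|-|-|-|-|-|-|-|-|
| Clustering | mteb/arxiv-clustering-p2p | MTEB ArxivClusteringP2P | default | test | a122ad7f3f0291bf49cc6f4d32aa80929df69d5d |  |  |  | 26.726296122407874 |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (en) | en | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 78.63657090743274 |  | 77.33756273016937 |  |
| Classification | mteb/mtop_domain | MTEB MTOPDomainClassification (de) | de | test | d80d48c1eb48d3562165c59d59d0034df9fff0bf | 67.63313609467455 |  | 65.31424834681424 |  |
"""


def split_row(line):
    """Split one markdown table row into stripped cell strings."""
    line = line.strip()
    # Drop exactly one leading and one trailing pipe so empty cells survive.
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def parse_markdown_table(text):
    """Return the table rows as dicts keyed by the header names."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = split_row(lines[0])
    # lines[1] is the |-|-| separator row and is skipped.
    return [dict(zip(header, split_row(ln))) for ln in lines[2:]]


def mean_accuracy_by_dataset(rows):
    """Average the accuracy column per dataset, ignoring empty cells."""
    groups = defaultdict(list)
    for row in rows:
        value = row.get("accuracy", "")
        if value:
            groups[row["dataset"]].append(float(value))
    return {dataset: mean(values) for dataset, values in groups.items()}


summary = mean_accuracy_by_dataset(parse_markdown_table(SAMPLE))
for dataset, accuracy in summary.items():
    print(f"{dataset}: {accuracy:.2f}")
```

Rows without an accuracy value, such as the clustering row, are left out of the summary rather than counted as zero.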


Author: sebastien.campion@europarl.europa.eu

Contributor(s): 
- Dominik Skotarczak (benchmark)