ImranzamanML committed
Commit 1447e34 · verified · 1 Parent(s): 04114d1

Upload 13 files

1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+     "word_embedding_dimension": 768,
+     "pooling_mode_cls_token": false,
+     "pooling_mode_mean_tokens": true,
+     "pooling_mode_max_tokens": false,
+     "pooling_mode_mean_sqrt_len_tokens": false
+ }
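The pooling config above enables mean-token pooling only over 768-dimensional token embeddings (CLS, max, and sqrt-length pooling are all disabled). As a rough sketch of what that setting computes — not code from this commit — masked mean pooling can be written in plain PyTorch as below; the function name and the toy tensors are illustrative placeholders.

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) tokens.

    token_embeddings: (batch, seq_len, 768) hidden states from the encoder
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).type_as(token_embeddings)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)                  # sum embeddings of real tokens
    counts = mask.sum(dim=1).clamp(min=1e-9)                       # number of real tokens per sentence
    return summed / counts                                         # (batch, 768) sentence embeddings

# Toy usage: two sequences of length 5, the second with two padding positions
emb = torch.randn(2, 5, 768)
mask = torch.tensor([[1, 1, 1, 1, 1], [1, 1, 1, 0, 0]])
sentence_vectors = mean_pool(emb, mask)  # shape (2, 768)
```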
README.md CHANGED
@@ -1,3 +1,3271 @@
- ---
- license: apache-2.0
- ---
+ ---
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - mteb
+ - transformers
+ - transformers.js
+ language:
+ - de
+ - en
+ inference: false
+ license: apache-2.0
+ model-index:
+ - name: jina-embeddings-v2-base-de
+ results:
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_counterfactual
+ name: MTEB AmazonCounterfactualClassification (en)
+ config: en
+ split: test
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+ metrics:
+ - type: accuracy
+ value: 73.76119402985076
+ - type: ap
+ value: 35.99577188521176
+ - type: f1
+ value: 67.50397431543269
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_counterfactual
+ name: MTEB AmazonCounterfactualClassification (de)
+ config: de
+ split: test
+ revision: e8379541af4e31359cca9fbcf4b00f2671dba205
+ metrics:
+ - type: accuracy
+ value: 68.9186295503212
+ - type: ap
+ value: 79.73307115840507
+ - type: f1
+ value: 66.66245744831339
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_polarity
+ name: MTEB AmazonPolarityClassification
+ config: default
+ split: test
+ revision: e2d317d38cd51312af73b3d32a06d1a08b442046
+ metrics:
+ - type: accuracy
+ value: 77.52215
+ - type: ap
+ value: 71.85051037177416
+ - type: f1
+ value: 77.4171096157774
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_reviews_multi
+ name: MTEB AmazonReviewsClassification (en)
+ config: en
+ split: test
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+ metrics:
+ - type: accuracy
+ value: 38.498
+ - type: f1
+ value: 38.058193386555956
+ - task:
+ type: Classification
+ dataset:
+ type: mteb/amazon_reviews_multi
+ name: MTEB AmazonReviewsClassification (de)
+ config: de
+ split: test
+ revision: 1399c76144fd37290681b995c656ef9b2e06e26d
+ metrics:
+ - type: accuracy
+ value: 37.717999999999996
+ - type: f1
+ value: 37.22674371574757
+ - task:
89
+ type: Retrieval
90
+ dataset:
91
+ type: arguana
92
+ name: MTEB ArguAna
93
+ config: default
94
+ split: test
95
+ revision: None
96
+ metrics:
97
+ - type: map_at_1
98
+ value: 25.319999999999997
99
+ - type: map_at_10
100
+ value: 40.351
101
+ - type: map_at_100
102
+ value: 41.435
103
+ - type: map_at_1000
104
+ value: 41.443000000000005
105
+ - type: map_at_3
106
+ value: 35.266
107
+ - type: map_at_5
108
+ value: 37.99
109
+ - type: mrr_at_1
110
+ value: 25.746999999999996
111
+ - type: mrr_at_10
112
+ value: 40.515
113
+ - type: mrr_at_100
114
+ value: 41.606
115
+ - type: mrr_at_1000
116
+ value: 41.614000000000004
117
+ - type: mrr_at_3
118
+ value: 35.42
119
+ - type: mrr_at_5
120
+ value: 38.112
121
+ - type: ndcg_at_1
122
+ value: 25.319999999999997
123
+ - type: ndcg_at_10
124
+ value: 49.332
125
+ - type: ndcg_at_100
126
+ value: 53.909
127
+ - type: ndcg_at_1000
128
+ value: 54.089
129
+ - type: ndcg_at_3
130
+ value: 38.705
131
+ - type: ndcg_at_5
132
+ value: 43.606
133
+ - type: precision_at_1
134
+ value: 25.319999999999997
135
+ - type: precision_at_10
136
+ value: 7.831
137
+ - type: precision_at_100
138
+ value: 0.9820000000000001
139
+ - type: precision_at_1000
140
+ value: 0.1
141
+ - type: precision_at_3
142
+ value: 16.24
143
+ - type: precision_at_5
144
+ value: 12.119
145
+ - type: recall_at_1
146
+ value: 25.319999999999997
147
+ - type: recall_at_10
148
+ value: 78.307
149
+ - type: recall_at_100
150
+ value: 98.222
151
+ - type: recall_at_1000
152
+ value: 99.57300000000001
153
+ - type: recall_at_3
154
+ value: 48.72
155
+ - type: recall_at_5
156
+ value: 60.597
157
+ - task:
158
+ type: Clustering
159
+ dataset:
160
+ type: mteb/arxiv-clustering-p2p
161
+ name: MTEB ArxivClusteringP2P
162
+ config: default
163
+ split: test
164
+ revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
165
+ metrics:
166
+ - type: v_measure
167
+ value: 41.43100588255654
168
+ - task:
169
+ type: Clustering
170
+ dataset:
171
+ type: mteb/arxiv-clustering-s2s
172
+ name: MTEB ArxivClusteringS2S
173
+ config: default
174
+ split: test
175
+ revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
176
+ metrics:
177
+ - type: v_measure
178
+ value: 32.08988904593667
179
+ - task:
180
+ type: Reranking
181
+ dataset:
182
+ type: mteb/askubuntudupquestions-reranking
183
+ name: MTEB AskUbuntuDupQuestions
184
+ config: default
185
+ split: test
186
+ revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
187
+ metrics:
188
+ - type: map
189
+ value: 60.55514765595906
190
+ - type: mrr
191
+ value: 73.51393835465858
192
+ - task:
193
+ type: STS
194
+ dataset:
195
+ type: mteb/biosses-sts
196
+ name: MTEB BIOSSES
197
+ config: default
198
+ split: test
199
+ revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
200
+ metrics:
201
+ - type: cos_sim_pearson
202
+ value: 79.6723823121172
203
+ - type: cos_sim_spearman
204
+ value: 76.90596922214986
205
+ - type: euclidean_pearson
206
+ value: 77.87910737957918
207
+ - type: euclidean_spearman
208
+ value: 76.66319260598262
209
+ - type: manhattan_pearson
210
+ value: 77.37039493457965
211
+ - type: manhattan_spearman
212
+ value: 76.09872191280964
213
+ - task:
214
+ type: BitextMining
215
+ dataset:
216
+ type: mteb/bucc-bitext-mining
217
+ name: MTEB BUCC (de-en)
218
+ config: de-en
219
+ split: test
220
+ revision: d51519689f32196a32af33b075a01d0e7c51e252
221
+ metrics:
222
+ - type: accuracy
223
+ value: 98.97703549060543
224
+ - type: f1
225
+ value: 98.86569241475296
226
+ - type: precision
227
+ value: 98.81002087682673
228
+ - type: recall
229
+ value: 98.97703549060543
230
+ - task:
231
+ type: Classification
232
+ dataset:
233
+ type: mteb/banking77
234
+ name: MTEB Banking77Classification
235
+ config: default
236
+ split: test
237
+ revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
238
+ metrics:
239
+ - type: accuracy
240
+ value: 83.93506493506493
241
+ - type: f1
242
+ value: 83.91014949949302
243
+ - task:
244
+ type: Clustering
245
+ dataset:
246
+ type: mteb/biorxiv-clustering-p2p
247
+ name: MTEB BiorxivClusteringP2P
248
+ config: default
249
+ split: test
250
+ revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
251
+ metrics:
252
+ - type: v_measure
253
+ value: 34.970675877585144
254
+ - task:
255
+ type: Clustering
256
+ dataset:
257
+ type: mteb/biorxiv-clustering-s2s
258
+ name: MTEB BiorxivClusteringS2S
259
+ config: default
260
+ split: test
261
+ revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
262
+ metrics:
263
+ - type: v_measure
264
+ value: 28.779230269190954
265
+ - task:
266
+ type: Clustering
267
+ dataset:
268
+ type: slvnwhrl/blurbs-clustering-p2p
269
+ name: MTEB BlurbsClusteringP2P
270
+ config: default
271
+ split: test
272
+ revision: a2dd5b02a77de3466a3eaa98ae586b5610314496
273
+ metrics:
274
+ - type: v_measure
275
+ value: 35.490175601567216
276
+ - task:
277
+ type: Clustering
278
+ dataset:
279
+ type: slvnwhrl/blurbs-clustering-s2s
280
+ name: MTEB BlurbsClusteringS2S
281
+ config: default
282
+ split: test
283
+ revision: 9bfff9a7f8f6dc6ffc9da71c48dd48b68696471d
284
+ metrics:
285
+ - type: v_measure
286
+ value: 16.16638280560168
287
+ - task:
288
+ type: Retrieval
289
+ dataset:
290
+ type: BeIR/cqadupstack
291
+ name: MTEB CQADupstackAndroidRetrieval
292
+ config: default
293
+ split: test
294
+ revision: None
295
+ metrics:
296
+ - type: map_at_1
297
+ value: 30.830999999999996
298
+ - type: map_at_10
299
+ value: 41.355
300
+ - type: map_at_100
301
+ value: 42.791000000000004
302
+ - type: map_at_1000
303
+ value: 42.918
304
+ - type: map_at_3
305
+ value: 38.237
306
+ - type: map_at_5
307
+ value: 40.066
308
+ - type: mrr_at_1
309
+ value: 38.484
310
+ - type: mrr_at_10
311
+ value: 47.593
312
+ - type: mrr_at_100
313
+ value: 48.388
314
+ - type: mrr_at_1000
315
+ value: 48.439
316
+ - type: mrr_at_3
317
+ value: 45.279
318
+ - type: mrr_at_5
319
+ value: 46.724
320
+ - type: ndcg_at_1
321
+ value: 38.484
322
+ - type: ndcg_at_10
323
+ value: 47.27
324
+ - type: ndcg_at_100
325
+ value: 52.568000000000005
326
+ - type: ndcg_at_1000
327
+ value: 54.729000000000006
328
+ - type: ndcg_at_3
329
+ value: 43.061
330
+ - type: ndcg_at_5
331
+ value: 45.083
332
+ - type: precision_at_1
333
+ value: 38.484
334
+ - type: precision_at_10
335
+ value: 8.927
336
+ - type: precision_at_100
337
+ value: 1.425
338
+ - type: precision_at_1000
339
+ value: 0.19
340
+ - type: precision_at_3
341
+ value: 20.791999999999998
342
+ - type: precision_at_5
343
+ value: 14.85
344
+ - type: recall_at_1
345
+ value: 30.830999999999996
346
+ - type: recall_at_10
347
+ value: 57.87799999999999
348
+ - type: recall_at_100
349
+ value: 80.124
350
+ - type: recall_at_1000
351
+ value: 94.208
352
+ - type: recall_at_3
353
+ value: 45.083
354
+ - type: recall_at_5
355
+ value: 51.154999999999994
356
+ - task:
357
+ type: Retrieval
358
+ dataset:
359
+ type: BeIR/cqadupstack
360
+ name: MTEB CQADupstackEnglishRetrieval
361
+ config: default
362
+ split: test
363
+ revision: None
364
+ metrics:
365
+ - type: map_at_1
366
+ value: 25.782
367
+ - type: map_at_10
368
+ value: 34.492
369
+ - type: map_at_100
370
+ value: 35.521
371
+ - type: map_at_1000
372
+ value: 35.638
373
+ - type: map_at_3
374
+ value: 31.735999999999997
375
+ - type: map_at_5
376
+ value: 33.339
377
+ - type: mrr_at_1
378
+ value: 32.357
379
+ - type: mrr_at_10
380
+ value: 39.965
381
+ - type: mrr_at_100
382
+ value: 40.644000000000005
383
+ - type: mrr_at_1000
384
+ value: 40.695
385
+ - type: mrr_at_3
386
+ value: 37.739
387
+ - type: mrr_at_5
388
+ value: 39.061
389
+ - type: ndcg_at_1
390
+ value: 32.357
391
+ - type: ndcg_at_10
392
+ value: 39.644
393
+ - type: ndcg_at_100
394
+ value: 43.851
395
+ - type: ndcg_at_1000
396
+ value: 46.211999999999996
397
+ - type: ndcg_at_3
398
+ value: 35.675000000000004
399
+ - type: ndcg_at_5
400
+ value: 37.564
401
+ - type: precision_at_1
402
+ value: 32.357
403
+ - type: precision_at_10
404
+ value: 7.344
405
+ - type: precision_at_100
406
+ value: 1.201
407
+ - type: precision_at_1000
408
+ value: 0.168
409
+ - type: precision_at_3
410
+ value: 17.155
411
+ - type: precision_at_5
412
+ value: 12.166
413
+ - type: recall_at_1
414
+ value: 25.782
415
+ - type: recall_at_10
416
+ value: 49.132999999999996
417
+ - type: recall_at_100
418
+ value: 67.24
419
+ - type: recall_at_1000
420
+ value: 83.045
421
+ - type: recall_at_3
422
+ value: 37.021
423
+ - type: recall_at_5
424
+ value: 42.548
425
+ - task:
426
+ type: Retrieval
427
+ dataset:
428
+ type: BeIR/cqadupstack
429
+ name: MTEB CQADupstackGamingRetrieval
430
+ config: default
431
+ split: test
432
+ revision: None
433
+ metrics:
434
+ - type: map_at_1
435
+ value: 35.778999999999996
436
+ - type: map_at_10
437
+ value: 47.038000000000004
438
+ - type: map_at_100
439
+ value: 48.064
440
+ - type: map_at_1000
441
+ value: 48.128
442
+ - type: map_at_3
443
+ value: 44.186
444
+ - type: map_at_5
445
+ value: 45.788000000000004
446
+ - type: mrr_at_1
447
+ value: 41.254000000000005
448
+ - type: mrr_at_10
449
+ value: 50.556999999999995
450
+ - type: mrr_at_100
451
+ value: 51.296
452
+ - type: mrr_at_1000
453
+ value: 51.331
454
+ - type: mrr_at_3
455
+ value: 48.318
456
+ - type: mrr_at_5
457
+ value: 49.619
458
+ - type: ndcg_at_1
459
+ value: 41.254000000000005
460
+ - type: ndcg_at_10
461
+ value: 52.454
462
+ - type: ndcg_at_100
463
+ value: 56.776
464
+ - type: ndcg_at_1000
465
+ value: 58.181000000000004
466
+ - type: ndcg_at_3
467
+ value: 47.713
468
+ - type: ndcg_at_5
469
+ value: 49.997
470
+ - type: precision_at_1
471
+ value: 41.254000000000005
472
+ - type: precision_at_10
473
+ value: 8.464
474
+ - type: precision_at_100
475
+ value: 1.157
476
+ - type: precision_at_1000
477
+ value: 0.133
478
+ - type: precision_at_3
479
+ value: 21.526
480
+ - type: precision_at_5
481
+ value: 14.696000000000002
482
+ - type: recall_at_1
483
+ value: 35.778999999999996
484
+ - type: recall_at_10
485
+ value: 64.85300000000001
486
+ - type: recall_at_100
487
+ value: 83.98400000000001
488
+ - type: recall_at_1000
489
+ value: 94.18299999999999
490
+ - type: recall_at_3
491
+ value: 51.929
492
+ - type: recall_at_5
493
+ value: 57.666
494
+ - task:
495
+ type: Retrieval
496
+ dataset:
497
+ type: BeIR/cqadupstack
498
+ name: MTEB CQADupstackGisRetrieval
499
+ config: default
500
+ split: test
501
+ revision: None
502
+ metrics:
503
+ - type: map_at_1
504
+ value: 21.719
505
+ - type: map_at_10
506
+ value: 29.326999999999998
507
+ - type: map_at_100
508
+ value: 30.314000000000004
509
+ - type: map_at_1000
510
+ value: 30.397000000000002
511
+ - type: map_at_3
512
+ value: 27.101
513
+ - type: map_at_5
514
+ value: 28.141
515
+ - type: mrr_at_1
516
+ value: 23.503
517
+ - type: mrr_at_10
518
+ value: 31.225
519
+ - type: mrr_at_100
520
+ value: 32.096000000000004
521
+ - type: mrr_at_1000
522
+ value: 32.159
523
+ - type: mrr_at_3
524
+ value: 29.076999999999998
525
+ - type: mrr_at_5
526
+ value: 30.083
527
+ - type: ndcg_at_1
528
+ value: 23.503
529
+ - type: ndcg_at_10
530
+ value: 33.842
531
+ - type: ndcg_at_100
532
+ value: 39.038000000000004
533
+ - type: ndcg_at_1000
534
+ value: 41.214
535
+ - type: ndcg_at_3
536
+ value: 29.347
537
+ - type: ndcg_at_5
538
+ value: 31.121
539
+ - type: precision_at_1
540
+ value: 23.503
541
+ - type: precision_at_10
542
+ value: 5.266
543
+ - type: precision_at_100
544
+ value: 0.831
545
+ - type: precision_at_1000
546
+ value: 0.106
547
+ - type: precision_at_3
548
+ value: 12.504999999999999
549
+ - type: precision_at_5
550
+ value: 8.565000000000001
551
+ - type: recall_at_1
552
+ value: 21.719
553
+ - type: recall_at_10
554
+ value: 46.024
555
+ - type: recall_at_100
556
+ value: 70.78999999999999
557
+ - type: recall_at_1000
558
+ value: 87.022
559
+ - type: recall_at_3
560
+ value: 33.64
561
+ - type: recall_at_5
562
+ value: 37.992
563
+ - task:
564
+ type: Retrieval
565
+ dataset:
566
+ type: BeIR/cqadupstack
567
+ name: MTEB CQADupstackMathematicaRetrieval
568
+ config: default
569
+ split: test
570
+ revision: None
571
+ metrics:
572
+ - type: map_at_1
573
+ value: 15.601
574
+ - type: map_at_10
575
+ value: 22.054000000000002
576
+ - type: map_at_100
577
+ value: 23.177
578
+ - type: map_at_1000
579
+ value: 23.308
580
+ - type: map_at_3
581
+ value: 19.772000000000002
582
+ - type: map_at_5
583
+ value: 21.055
584
+ - type: mrr_at_1
585
+ value: 19.403000000000002
586
+ - type: mrr_at_10
587
+ value: 26.409
588
+ - type: mrr_at_100
589
+ value: 27.356
590
+ - type: mrr_at_1000
591
+ value: 27.441
592
+ - type: mrr_at_3
593
+ value: 24.108999999999998
594
+ - type: mrr_at_5
595
+ value: 25.427
596
+ - type: ndcg_at_1
597
+ value: 19.403000000000002
598
+ - type: ndcg_at_10
599
+ value: 26.474999999999998
600
+ - type: ndcg_at_100
601
+ value: 32.086
602
+ - type: ndcg_at_1000
603
+ value: 35.231
604
+ - type: ndcg_at_3
605
+ value: 22.289
606
+ - type: ndcg_at_5
607
+ value: 24.271
608
+ - type: precision_at_1
609
+ value: 19.403000000000002
610
+ - type: precision_at_10
611
+ value: 4.813
612
+ - type: precision_at_100
613
+ value: 0.8869999999999999
614
+ - type: precision_at_1000
615
+ value: 0.13
616
+ - type: precision_at_3
617
+ value: 10.531
618
+ - type: precision_at_5
619
+ value: 7.710999999999999
620
+ - type: recall_at_1
621
+ value: 15.601
622
+ - type: recall_at_10
623
+ value: 35.916
624
+ - type: recall_at_100
625
+ value: 60.8
626
+ - type: recall_at_1000
627
+ value: 83.245
628
+ - type: recall_at_3
629
+ value: 24.321
630
+ - type: recall_at_5
631
+ value: 29.372999999999998
632
+ - task:
633
+ type: Retrieval
634
+ dataset:
635
+ type: BeIR/cqadupstack
636
+ name: MTEB CQADupstackPhysicsRetrieval
637
+ config: default
638
+ split: test
639
+ revision: None
640
+ metrics:
641
+ - type: map_at_1
642
+ value: 25.522
643
+ - type: map_at_10
644
+ value: 34.854
645
+ - type: map_at_100
646
+ value: 36.269
647
+ - type: map_at_1000
648
+ value: 36.387
649
+ - type: map_at_3
650
+ value: 32.187
651
+ - type: map_at_5
652
+ value: 33.692
653
+ - type: mrr_at_1
654
+ value: 31.375999999999998
655
+ - type: mrr_at_10
656
+ value: 40.471000000000004
657
+ - type: mrr_at_100
658
+ value: 41.481
659
+ - type: mrr_at_1000
660
+ value: 41.533
661
+ - type: mrr_at_3
662
+ value: 38.274
663
+ - type: mrr_at_5
664
+ value: 39.612
665
+ - type: ndcg_at_1
666
+ value: 31.375999999999998
667
+ - type: ndcg_at_10
668
+ value: 40.298
669
+ - type: ndcg_at_100
670
+ value: 46.255
671
+ - type: ndcg_at_1000
672
+ value: 48.522
673
+ - type: ndcg_at_3
674
+ value: 36.049
675
+ - type: ndcg_at_5
676
+ value: 38.095
677
+ - type: precision_at_1
678
+ value: 31.375999999999998
679
+ - type: precision_at_10
680
+ value: 7.305000000000001
681
+ - type: precision_at_100
682
+ value: 1.201
683
+ - type: precision_at_1000
684
+ value: 0.157
685
+ - type: precision_at_3
686
+ value: 17.132
687
+ - type: precision_at_5
688
+ value: 12.107999999999999
689
+ - type: recall_at_1
690
+ value: 25.522
691
+ - type: recall_at_10
692
+ value: 50.988
693
+ - type: recall_at_100
694
+ value: 76.005
695
+ - type: recall_at_1000
696
+ value: 91.11200000000001
697
+ - type: recall_at_3
698
+ value: 38.808
699
+ - type: recall_at_5
700
+ value: 44.279
701
+ - task:
702
+ type: Retrieval
703
+ dataset:
704
+ type: BeIR/cqadupstack
705
+ name: MTEB CQADupstackProgrammersRetrieval
706
+ config: default
707
+ split: test
708
+ revision: None
709
+ metrics:
710
+ - type: map_at_1
711
+ value: 24.615000000000002
712
+ - type: map_at_10
713
+ value: 32.843
714
+ - type: map_at_100
715
+ value: 34.172999999999995
716
+ - type: map_at_1000
717
+ value: 34.286
718
+ - type: map_at_3
719
+ value: 30.125
720
+ - type: map_at_5
721
+ value: 31.495
722
+ - type: mrr_at_1
723
+ value: 30.023
724
+ - type: mrr_at_10
725
+ value: 38.106
726
+ - type: mrr_at_100
727
+ value: 39.01
728
+ - type: mrr_at_1000
729
+ value: 39.071
730
+ - type: mrr_at_3
731
+ value: 35.674
732
+ - type: mrr_at_5
733
+ value: 36.924
734
+ - type: ndcg_at_1
735
+ value: 30.023
736
+ - type: ndcg_at_10
737
+ value: 38.091
738
+ - type: ndcg_at_100
739
+ value: 43.771
740
+ - type: ndcg_at_1000
741
+ value: 46.315
742
+ - type: ndcg_at_3
743
+ value: 33.507
744
+ - type: ndcg_at_5
745
+ value: 35.304
746
+ - type: precision_at_1
747
+ value: 30.023
748
+ - type: precision_at_10
749
+ value: 6.837999999999999
750
+ - type: precision_at_100
751
+ value: 1.124
752
+ - type: precision_at_1000
753
+ value: 0.152
754
+ - type: precision_at_3
755
+ value: 15.562999999999999
756
+ - type: precision_at_5
757
+ value: 10.936
758
+ - type: recall_at_1
759
+ value: 24.615000000000002
760
+ - type: recall_at_10
761
+ value: 48.691
762
+ - type: recall_at_100
763
+ value: 72.884
764
+ - type: recall_at_1000
765
+ value: 90.387
766
+ - type: recall_at_3
767
+ value: 35.659
768
+ - type: recall_at_5
769
+ value: 40.602
770
+ - task:
771
+ type: Retrieval
772
+ dataset:
773
+ type: BeIR/cqadupstack
774
+ name: MTEB CQADupstackRetrieval
775
+ config: default
776
+ split: test
777
+ revision: None
778
+ metrics:
779
+ - type: map_at_1
780
+ value: 23.223666666666666
781
+ - type: map_at_10
782
+ value: 31.338166666666673
783
+ - type: map_at_100
784
+ value: 32.47358333333333
785
+ - type: map_at_1000
786
+ value: 32.5955
787
+ - type: map_at_3
788
+ value: 28.84133333333333
789
+ - type: map_at_5
790
+ value: 30.20808333333333
791
+ - type: mrr_at_1
792
+ value: 27.62483333333333
793
+ - type: mrr_at_10
794
+ value: 35.385916666666674
795
+ - type: mrr_at_100
796
+ value: 36.23325
797
+ - type: mrr_at_1000
798
+ value: 36.29966666666667
799
+ - type: mrr_at_3
800
+ value: 33.16583333333333
801
+ - type: mrr_at_5
802
+ value: 34.41983333333334
803
+ - type: ndcg_at_1
804
+ value: 27.62483333333333
805
+ - type: ndcg_at_10
806
+ value: 36.222
807
+ - type: ndcg_at_100
808
+ value: 41.29491666666666
809
+ - type: ndcg_at_1000
810
+ value: 43.85508333333333
811
+ - type: ndcg_at_3
812
+ value: 31.95116666666667
813
+ - type: ndcg_at_5
814
+ value: 33.88541666666667
815
+ - type: precision_at_1
816
+ value: 27.62483333333333
817
+ - type: precision_at_10
818
+ value: 6.339916666666667
819
+ - type: precision_at_100
820
+ value: 1.0483333333333333
821
+ - type: precision_at_1000
822
+ value: 0.14608333333333334
823
+ - type: precision_at_3
824
+ value: 14.726500000000003
825
+ - type: precision_at_5
826
+ value: 10.395
827
+ - type: recall_at_1
828
+ value: 23.223666666666666
829
+ - type: recall_at_10
830
+ value: 46.778999999999996
831
+ - type: recall_at_100
832
+ value: 69.27141666666667
833
+ - type: recall_at_1000
834
+ value: 87.27383333333334
835
+ - type: recall_at_3
836
+ value: 34.678749999999994
837
+ - type: recall_at_5
838
+ value: 39.79900000000001
839
+ - task:
840
+ type: Retrieval
841
+ dataset:
842
+ type: BeIR/cqadupstack
843
+ name: MTEB CQADupstackStatsRetrieval
844
+ config: default
845
+ split: test
846
+ revision: None
847
+ metrics:
848
+ - type: map_at_1
849
+ value: 21.677
850
+ - type: map_at_10
851
+ value: 27.828000000000003
852
+ - type: map_at_100
853
+ value: 28.538999999999998
854
+ - type: map_at_1000
855
+ value: 28.64
856
+ - type: map_at_3
857
+ value: 26.105
858
+ - type: map_at_5
859
+ value: 27.009
860
+ - type: mrr_at_1
861
+ value: 24.387
862
+ - type: mrr_at_10
863
+ value: 30.209999999999997
864
+ - type: mrr_at_100
865
+ value: 30.953000000000003
866
+ - type: mrr_at_1000
867
+ value: 31.029
868
+ - type: mrr_at_3
869
+ value: 28.707
870
+ - type: mrr_at_5
871
+ value: 29.610999999999997
872
+ - type: ndcg_at_1
873
+ value: 24.387
874
+ - type: ndcg_at_10
875
+ value: 31.378
876
+ - type: ndcg_at_100
877
+ value: 35.249
878
+ - type: ndcg_at_1000
879
+ value: 37.923
880
+ - type: ndcg_at_3
881
+ value: 28.213
882
+ - type: ndcg_at_5
883
+ value: 29.658
884
+ - type: precision_at_1
885
+ value: 24.387
886
+ - type: precision_at_10
887
+ value: 4.8309999999999995
888
+ - type: precision_at_100
889
+ value: 0.73
890
+ - type: precision_at_1000
891
+ value: 0.104
892
+ - type: precision_at_3
893
+ value: 12.168
894
+ - type: precision_at_5
895
+ value: 8.251999999999999
896
+ - type: recall_at_1
897
+ value: 21.677
898
+ - type: recall_at_10
899
+ value: 40.069
900
+ - type: recall_at_100
901
+ value: 58.077
902
+ - type: recall_at_1000
903
+ value: 77.97
904
+ - type: recall_at_3
905
+ value: 31.03
906
+ - type: recall_at_5
907
+ value: 34.838
908
+ - task:
909
+ type: Retrieval
910
+ dataset:
911
+ type: BeIR/cqadupstack
912
+ name: MTEB CQADupstackTexRetrieval
913
+ config: default
914
+ split: test
915
+ revision: None
916
+ metrics:
917
+ - type: map_at_1
918
+ value: 14.484
919
+ - type: map_at_10
920
+ value: 20.355
921
+ - type: map_at_100
922
+ value: 21.382
923
+ - type: map_at_1000
924
+ value: 21.511
925
+ - type: map_at_3
926
+ value: 18.448
927
+ - type: map_at_5
928
+ value: 19.451999999999998
929
+ - type: mrr_at_1
930
+ value: 17.584
931
+ - type: mrr_at_10
932
+ value: 23.825
933
+ - type: mrr_at_100
934
+ value: 24.704
935
+ - type: mrr_at_1000
936
+ value: 24.793000000000003
937
+ - type: mrr_at_3
938
+ value: 21.92
939
+ - type: mrr_at_5
940
+ value: 22.97
941
+ - type: ndcg_at_1
942
+ value: 17.584
943
+ - type: ndcg_at_10
944
+ value: 24.315
945
+ - type: ndcg_at_100
946
+ value: 29.354999999999997
947
+ - type: ndcg_at_1000
948
+ value: 32.641999999999996
949
+ - type: ndcg_at_3
950
+ value: 20.802
951
+ - type: ndcg_at_5
952
+ value: 22.335
953
+ - type: precision_at_1
954
+ value: 17.584
955
+ - type: precision_at_10
956
+ value: 4.443
957
+ - type: precision_at_100
958
+ value: 0.8160000000000001
959
+ - type: precision_at_1000
960
+ value: 0.128
961
+ - type: precision_at_3
962
+ value: 9.807
963
+ - type: precision_at_5
964
+ value: 7.0889999999999995
965
+ - type: recall_at_1
966
+ value: 14.484
967
+ - type: recall_at_10
968
+ value: 32.804
969
+ - type: recall_at_100
970
+ value: 55.679
971
+ - type: recall_at_1000
972
+ value: 79.63
973
+ - type: recall_at_3
974
+ value: 22.976
975
+ - type: recall_at_5
976
+ value: 26.939
977
+ - task:
978
+ type: Retrieval
979
+ dataset:
980
+ type: BeIR/cqadupstack
981
+ name: MTEB CQADupstackUnixRetrieval
982
+ config: default
983
+ split: test
984
+ revision: None
985
+ metrics:
986
+ - type: map_at_1
987
+ value: 22.983999999999998
988
+ - type: map_at_10
989
+ value: 30.812
990
+ - type: map_at_100
991
+ value: 31.938
992
+ - type: map_at_1000
993
+ value: 32.056000000000004
994
+ - type: map_at_3
995
+ value: 28.449999999999996
996
+ - type: map_at_5
997
+ value: 29.542
998
+ - type: mrr_at_1
999
+ value: 27.145999999999997
1000
+ - type: mrr_at_10
1001
+ value: 34.782999999999994
1002
+ - type: mrr_at_100
1003
+ value: 35.699
1004
+ - type: mrr_at_1000
1005
+ value: 35.768
1006
+ - type: mrr_at_3
1007
+ value: 32.572
1008
+ - type: mrr_at_5
1009
+ value: 33.607
1010
+ - type: ndcg_at_1
1011
+ value: 27.145999999999997
1012
+ - type: ndcg_at_10
1013
+ value: 35.722
1014
+ - type: ndcg_at_100
1015
+ value: 40.964
1016
+ - type: ndcg_at_1000
1017
+ value: 43.598
1018
+ - type: ndcg_at_3
1019
+ value: 31.379
1020
+ - type: ndcg_at_5
1021
+ value: 32.924
1022
+ - type: precision_at_1
1023
+ value: 27.145999999999997
1024
+ - type: precision_at_10
1025
+ value: 6.063000000000001
1026
+ - type: precision_at_100
1027
+ value: 0.9730000000000001
1028
+ - type: precision_at_1000
1029
+ value: 0.13
1030
+ - type: precision_at_3
1031
+ value: 14.366000000000001
1032
+ - type: precision_at_5
1033
+ value: 9.776
1034
+ - type: recall_at_1
1035
+ value: 22.983999999999998
1036
+ - type: recall_at_10
1037
+ value: 46.876
1038
+ - type: recall_at_100
1039
+ value: 69.646
1040
+ - type: recall_at_1000
1041
+ value: 88.305
1042
+ - type: recall_at_3
1043
+ value: 34.471000000000004
1044
+ - type: recall_at_5
1045
+ value: 38.76
1046
+ - task:
1047
+ type: Retrieval
1048
+ dataset:
1049
+ type: BeIR/cqadupstack
1050
+ name: MTEB CQADupstackWebmastersRetrieval
1051
+ config: default
1052
+ split: test
1053
+ revision: None
1054
+ metrics:
1055
+ - type: map_at_1
1056
+ value: 23.017000000000003
1057
+ - type: map_at_10
1058
+ value: 31.049
1059
+ - type: map_at_100
1060
+ value: 32.582
1061
+ - type: map_at_1000
1062
+ value: 32.817
1063
+ - type: map_at_3
1064
+ value: 28.303
1065
+ - type: map_at_5
1066
+ value: 29.854000000000003
1067
+ - type: mrr_at_1
1068
+ value: 27.866000000000003
1069
+ - type: mrr_at_10
1070
+ value: 35.56
1071
+ - type: mrr_at_100
1072
+ value: 36.453
1073
+ - type: mrr_at_1000
1074
+ value: 36.519
1075
+ - type: mrr_at_3
1076
+ value: 32.938
1077
+ - type: mrr_at_5
1078
+ value: 34.391
1079
+ - type: ndcg_at_1
1080
+ value: 27.866000000000003
1081
+ - type: ndcg_at_10
1082
+ value: 36.506
1083
+ - type: ndcg_at_100
1084
+ value: 42.344
1085
+ - type: ndcg_at_1000
1086
+ value: 45.213
1087
+ - type: ndcg_at_3
1088
+ value: 31.805
1089
+ - type: ndcg_at_5
1090
+ value: 33.933
1091
+ - type: precision_at_1
1092
+ value: 27.866000000000003
1093
+ - type: precision_at_10
1094
+ value: 7.016
1095
+ - type: precision_at_100
1096
+ value: 1.468
1097
+ - type: precision_at_1000
1098
+ value: 0.23900000000000002
1099
+ - type: precision_at_3
1100
+ value: 14.822
1101
+ - type: precision_at_5
1102
+ value: 10.791
1103
+ - type: recall_at_1
1104
+ value: 23.017000000000003
1105
+ - type: recall_at_10
1106
+ value: 47.053
1107
+ - type: recall_at_100
1108
+ value: 73.177
1109
+ - type: recall_at_1000
1110
+ value: 91.47800000000001
1111
+ - type: recall_at_3
1112
+ value: 33.675
1113
+ - type: recall_at_5
1114
+ value: 39.36
1115
+ - task:
1116
+ type: Retrieval
1117
+ dataset:
1118
+ type: BeIR/cqadupstack
1119
+ name: MTEB CQADupstackWordpressRetrieval
1120
+ config: default
1121
+ split: test
1122
+ revision: None
1123
+ metrics:
1124
+ - type: map_at_1
1125
+ value: 16.673
1126
+ - type: map_at_10
1127
+ value: 24.051000000000002
1128
+ - type: map_at_100
1129
+ value: 24.933
1130
+ - type: map_at_1000
1131
+ value: 25.06
1132
+ - type: map_at_3
1133
+ value: 21.446
1134
+ - type: map_at_5
1135
+ value: 23.064
1136
+ - type: mrr_at_1
1137
+ value: 18.115000000000002
1138
+ - type: mrr_at_10
1139
+ value: 25.927
1140
+ - type: mrr_at_100
1141
+ value: 26.718999999999998
1142
+ - type: mrr_at_1000
1143
+ value: 26.817999999999998
1144
+ - type: mrr_at_3
1145
+ value: 23.383000000000003
1146
+ - type: mrr_at_5
1147
+ value: 25.008999999999997
1148
+ - type: ndcg_at_1
1149
+ value: 18.115000000000002
1150
+ - type: ndcg_at_10
1151
+ value: 28.669
1152
+ - type: ndcg_at_100
1153
+ value: 33.282000000000004
1154
+ - type: ndcg_at_1000
1155
+ value: 36.481
1156
+ - type: ndcg_at_3
1157
+ value: 23.574
1158
+ - type: ndcg_at_5
1159
+ value: 26.340000000000003
1160
+ - type: precision_at_1
1161
+ value: 18.115000000000002
1162
+ - type: precision_at_10
1163
+ value: 4.769
1164
+ - type: precision_at_100
1165
+ value: 0.767
1166
+ - type: precision_at_1000
1167
+ value: 0.116
1168
+ - type: precision_at_3
1169
+ value: 10.351
1170
+ - type: precision_at_5
1171
+ value: 7.8
1172
+ - type: recall_at_1
1173
+ value: 16.673
1174
+ - type: recall_at_10
1175
+ value: 41.063
1176
+ - type: recall_at_100
1177
+ value: 62.851
1178
+ - type: recall_at_1000
1179
+ value: 86.701
1180
+ - type: recall_at_3
1181
+ value: 27.532
1182
+ - type: recall_at_5
1183
+ value: 34.076
1184
+ - task:
1185
+ type: Retrieval
1186
+ dataset:
1187
+ type: climate-fever
1188
+ name: MTEB ClimateFEVER
1189
+ config: default
1190
+ split: test
1191
+ revision: None
1192
+ metrics:
1193
+ - type: map_at_1
1194
+ value: 8.752
1195
+ - type: map_at_10
1196
+ value: 15.120000000000001
1197
+ - type: map_at_100
1198
+ value: 16.678
1199
+ - type: map_at_1000
1200
+ value: 16.854
1201
+ - type: map_at_3
1202
+ value: 12.603
1203
+ - type: map_at_5
1204
+ value: 13.918
1205
+ - type: mrr_at_1
1206
+ value: 19.283
1207
+ - type: mrr_at_10
1208
+ value: 29.145
1209
+ - type: mrr_at_100
1210
+ value: 30.281000000000002
1211
+ - type: mrr_at_1000
1212
+ value: 30.339
1213
+ - type: mrr_at_3
1214
+ value: 26.069
1215
+ - type: mrr_at_5
1216
+ value: 27.864
1217
+ - type: ndcg_at_1
1218
+ value: 19.283
1219
+ - type: ndcg_at_10
1220
+ value: 21.804000000000002
1221
+ - type: ndcg_at_100
1222
+ value: 28.576
1223
+ - type: ndcg_at_1000
1224
+ value: 32.063
1225
+ - type: ndcg_at_3
1226
+ value: 17.511
1227
+ - type: ndcg_at_5
1228
+ value: 19.112000000000002
1229
+ - type: precision_at_1
1230
+ value: 19.283
1231
+ - type: precision_at_10
1232
+ value: 6.873
1233
+ - type: precision_at_100
1234
+ value: 1.405
1235
+ - type: precision_at_1000
1236
+ value: 0.20500000000000002
1237
+ - type: precision_at_3
1238
+ value: 13.16
1239
+ - type: precision_at_5
1240
+ value: 10.189
1241
+ - type: recall_at_1
1242
+ value: 8.752
1243
+ - type: recall_at_10
1244
+ value: 27.004
1245
+ - type: recall_at_100
1246
+ value: 50.648
1247
+ - type: recall_at_1000
1248
+ value: 70.458
1249
+ - type: recall_at_3
1250
+ value: 16.461000000000002
1251
+ - type: recall_at_5
1252
+ value: 20.973
1253
+ - task:
1254
+ type: Retrieval
1255
+ dataset:
1256
+ type: dbpedia-entity
1257
+ name: MTEB DBPedia
1258
+ config: default
1259
+ split: test
1260
+ revision: None
1261
+ metrics:
1262
+ - type: map_at_1
1263
+ value: 6.81
1264
+ - type: map_at_10
1265
+ value: 14.056
1266
+ - type: map_at_100
1267
+ value: 18.961
1268
+ - type: map_at_1000
1269
+ value: 20.169
1270
+ - type: map_at_3
1271
+ value: 10.496
1272
+ - type: map_at_5
1273
+ value: 11.952
1274
+ - type: mrr_at_1
1275
+ value: 53.5
1276
+ - type: mrr_at_10
1277
+ value: 63.479
1278
+ - type: mrr_at_100
1279
+ value: 63.971999999999994
1280
+ - type: mrr_at_1000
1281
+ value: 63.993
1282
+ - type: mrr_at_3
1283
+ value: 61.541999999999994
1284
+ - type: mrr_at_5
1285
+ value: 62.778999999999996
1286
+ - type: ndcg_at_1
1287
+ value: 42.25
1288
+ - type: ndcg_at_10
1289
+ value: 31.471
1290
+ - type: ndcg_at_100
1291
+ value: 35.115
1292
+ - type: ndcg_at_1000
1293
+ value: 42.408
1294
+ - type: ndcg_at_3
1295
+ value: 35.458
1296
+ - type: ndcg_at_5
1297
+ value: 32.973
1298
+ - type: precision_at_1
1299
+ value: 53.5
1300
+ - type: precision_at_10
1301
+ value: 24.85
1302
+ - type: precision_at_100
1303
+ value: 7.79
1304
+ - type: precision_at_1000
1305
+ value: 1.599
1306
+ - type: precision_at_3
1307
+ value: 38.667
1308
+ - type: precision_at_5
1309
+ value: 31.55
1310
+ - type: recall_at_1
1311
+ value: 6.81
1312
+ - type: recall_at_10
1313
+ value: 19.344
1314
+ - type: recall_at_100
1315
+ value: 40.837
1316
+ - type: recall_at_1000
1317
+ value: 64.661
1318
+ - type: recall_at_3
1319
+ value: 11.942
1320
+ - type: recall_at_5
1321
+ value: 14.646
1322
+ - task:
1323
+ type: Classification
1324
+ dataset:
1325
+ type: mteb/emotion
1326
+ name: MTEB EmotionClassification
1327
+ config: default
1328
+ split: test
1329
+ revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
1330
+ metrics:
1331
+ - type: accuracy
1332
+ value: 44.64499999999999
1333
+ - type: f1
1334
+ value: 39.39106911352714
1335
+ - task:
1336
+ type: Retrieval
1337
+ dataset:
1338
+ type: fever
1339
+ name: MTEB FEVER
1340
+ config: default
1341
+ split: test
1342
+ revision: None
1343
+ metrics:
1344
+ - type: map_at_1
1345
+ value: 48.196
1346
+ - type: map_at_10
1347
+ value: 61.404
1348
+ - type: map_at_100
1349
+ value: 61.846000000000004
1350
+ - type: map_at_1000
1351
+ value: 61.866
1352
+ - type: map_at_3
1353
+ value: 58.975
1354
+ - type: map_at_5
1355
+ value: 60.525
1356
+ - type: mrr_at_1
1357
+ value: 52.025
1358
+ - type: mrr_at_10
1359
+ value: 65.43299999999999
1360
+ - type: mrr_at_100
1361
+ value: 65.80799999999999
1362
+ - type: mrr_at_1000
1363
+ value: 65.818
1364
+ - type: mrr_at_3
1365
+ value: 63.146
1366
+ - type: mrr_at_5
1367
+ value: 64.64
1368
+ - type: ndcg_at_1
1369
+ value: 52.025
1370
+ - type: ndcg_at_10
1371
+ value: 67.889
1372
+ - type: ndcg_at_100
1373
+ value: 69.864
1374
+ - type: ndcg_at_1000
1375
+ value: 70.337
1376
+ - type: ndcg_at_3
1377
+ value: 63.315
1378
+ - type: ndcg_at_5
1379
+ value: 65.91799999999999
1380
+ - type: precision_at_1
1381
+ value: 52.025
1382
+ - type: precision_at_10
1383
+ value: 9.182
1384
+ - type: precision_at_100
1385
+ value: 1.027
1386
+ - type: precision_at_1000
1387
+ value: 0.108
1388
+ - type: precision_at_3
1389
+ value: 25.968000000000004
1390
+ - type: precision_at_5
1391
+ value: 17.006
1392
+ - type: recall_at_1
1393
+ value: 48.196
1394
+ - type: recall_at_10
1395
+ value: 83.885
1396
+ - type: recall_at_100
1397
+ value: 92.671
1398
+ - type: recall_at_1000
1399
+ value: 96.018
1400
+ - type: recall_at_3
1401
+ value: 71.59
1402
+ - type: recall_at_5
1403
+ value: 77.946
1404
+ - task:
1405
+ type: Retrieval
1406
+ dataset:
1407
+ type: fiqa
1408
+ name: MTEB FiQA2018
1409
+ config: default
1410
+ split: test
1411
+ revision: None
1412
+ metrics:
1413
+ - type: map_at_1
1414
+ value: 15.193000000000001
1415
+ - type: map_at_10
1416
+ value: 25.168000000000003
1417
+ - type: map_at_100
1418
+ value: 27.017000000000003
1419
+ - type: map_at_1000
1420
+ value: 27.205000000000002
1421
+ - type: map_at_3
1422
+ value: 21.746
1423
+ - type: map_at_5
1424
+ value: 23.579
1425
+ - type: mrr_at_1
1426
+ value: 31.635999999999996
1427
+ - type: mrr_at_10
1428
+ value: 40.077
1429
+ - type: mrr_at_100
1430
+ value: 41.112
1431
+ - type: mrr_at_1000
1432
+ value: 41.160999999999994
1433
+ - type: mrr_at_3
1434
+ value: 37.937
1435
+ - type: mrr_at_5
1436
+ value: 39.18
1437
+ - type: ndcg_at_1
1438
+ value: 31.635999999999996
1439
+ - type: ndcg_at_10
1440
+ value: 32.298
1441
+ - type: ndcg_at_100
1442
+ value: 39.546
1443
+ - type: ndcg_at_1000
1444
+ value: 42.88
1445
+ - type: ndcg_at_3
1446
+ value: 29.221999999999998
1447
+ - type: ndcg_at_5
1448
+ value: 30.069000000000003
1449
+ - type: precision_at_1
1450
+ value: 31.635999999999996
1451
+ - type: precision_at_10
1452
+ value: 9.367
1453
+ - type: precision_at_100
1454
+ value: 1.645
1455
+ - type: precision_at_1000
1456
+ value: 0.22399999999999998
1457
+ - type: precision_at_3
1458
+ value: 20.01
1459
+ - type: precision_at_5
1460
+ value: 14.753
1461
+ - type: recall_at_1
1462
+ value: 15.193000000000001
1463
+ - type: recall_at_10
1464
+ value: 38.214999999999996
1465
+ - type: recall_at_100
1466
+ value: 65.95
1467
+ - type: recall_at_1000
1468
+ value: 85.85300000000001
1469
+ - type: recall_at_3
1470
+ value: 26.357000000000003
1471
+ - type: recall_at_5
1472
+ value: 31.319999999999997
1473
+ - task:
1474
+ type: Retrieval
1475
+ dataset:
1476
+ type: jinaai/ger_da_lir
1477
+ name: MTEB GerDaLIR
1478
+ config: default
1479
+ split: test
1480
+ revision: None
1481
+ metrics:
1482
+ - type: map_at_1
1483
+ value: 10.363
1484
+ - type: map_at_10
1485
+ value: 16.222
1486
+ - type: map_at_100
1487
+ value: 17.28
1488
+ - type: map_at_1000
1489
+ value: 17.380000000000003
1490
+ - type: map_at_3
1491
+ value: 14.054
1492
+ - type: map_at_5
1493
+ value: 15.203
1494
+ - type: mrr_at_1
1495
+ value: 11.644
1496
+ - type: mrr_at_10
1497
+ value: 17.625
1498
+ - type: mrr_at_100
1499
+ value: 18.608
1500
+ - type: mrr_at_1000
1501
+ value: 18.695999999999998
1502
+ - type: mrr_at_3
1503
+ value: 15.481
1504
+ - type: mrr_at_5
1505
+ value: 16.659
1506
+ - type: ndcg_at_1
1507
+ value: 11.628
1508
+ - type: ndcg_at_10
1509
+ value: 20.028000000000002
1510
+ - type: ndcg_at_100
1511
+ value: 25.505
1512
+ - type: ndcg_at_1000
1513
+ value: 28.288000000000004
1514
+ - type: ndcg_at_3
1515
+ value: 15.603
1516
+ - type: ndcg_at_5
1517
+ value: 17.642
1518
+ - type: precision_at_1
1519
+ value: 11.628
1520
+ - type: precision_at_10
1521
+ value: 3.5589999999999997
1522
+ - type: precision_at_100
1523
+ value: 0.664
1524
+ - type: precision_at_1000
1525
+ value: 0.092
1526
+ - type: precision_at_3
1527
+ value: 7.109999999999999
1528
+ - type: precision_at_5
1529
+ value: 5.401
1530
+ - type: recall_at_1
1531
+ value: 10.363
1532
+ - type: recall_at_10
1533
+ value: 30.586000000000002
1534
+ - type: recall_at_100
1535
+ value: 56.43
1536
+ - type: recall_at_1000
1537
+ value: 78.142
1538
+ - type: recall_at_3
1539
+ value: 18.651
1540
+ - type: recall_at_5
1541
+ value: 23.493
1542
+ - task:
1543
+ type: Retrieval
1544
+ dataset:
1545
+ type: deepset/germandpr
1546
+ name: MTEB GermanDPR
1547
+ config: default
1548
+ split: test
1549
+ revision: 5129d02422a66be600ac89cd3e8531b4f97d347d
1550
+ metrics:
1551
+ - type: map_at_1
1552
+ value: 60.78
1553
+ - type: map_at_10
1554
+ value: 73.91499999999999
1555
+ - type: map_at_100
1556
+ value: 74.089
1557
+ - type: map_at_1000
1558
+ value: 74.09400000000001
1559
+ - type: map_at_3
1560
+ value: 71.87
1561
+ - type: map_at_5
1562
+ value: 73.37700000000001
1563
+ - type: mrr_at_1
1564
+ value: 60.78
1565
+ - type: mrr_at_10
1566
+ value: 73.91499999999999
1567
+ - type: mrr_at_100
1568
+ value: 74.089
1569
+ - type: mrr_at_1000
1570
+ value: 74.09400000000001
1571
+ - type: mrr_at_3
1572
+ value: 71.87
1573
+ - type: mrr_at_5
1574
+ value: 73.37700000000001
1575
+ - type: ndcg_at_1
1576
+ value: 60.78
1577
+ - type: ndcg_at_10
1578
+ value: 79.35600000000001
1579
+ - type: ndcg_at_100
1580
+ value: 80.077
1581
+ - type: ndcg_at_1000
1582
+ value: 80.203
1583
+ - type: ndcg_at_3
1584
+ value: 75.393
1585
+ - type: ndcg_at_5
1586
+ value: 78.077
1587
+ - type: precision_at_1
1588
+ value: 60.78
1589
+ - type: precision_at_10
1590
+ value: 9.59
1591
+ - type: precision_at_100
1592
+ value: 0.9900000000000001
1593
+ - type: precision_at_1000
1594
+ value: 0.1
1595
+ - type: precision_at_3
1596
+ value: 28.52
1597
+ - type: precision_at_5
1598
+ value: 18.4
1599
+ - type: recall_at_1
1600
+ value: 60.78
1601
+ - type: recall_at_10
1602
+ value: 95.902
1603
+ - type: recall_at_100
1604
+ value: 99.024
1605
+ - type: recall_at_1000
1606
+ value: 100.0
1607
+ - type: recall_at_3
1608
+ value: 85.56099999999999
1609
+ - type: recall_at_5
1610
+ value: 92.0
1611
+ - task:
1612
+ type: STS
1613
+ dataset:
1614
+ type: jinaai/german-STSbenchmark
1615
+ name: MTEB GermanSTSBenchmark
1616
+ config: default
1617
+ split: test
1618
+ revision: 49d9b423b996fea62b483f9ee6dfb5ec233515ca
1619
+ metrics:
1620
+ - type: cos_sim_pearson
1621
+ value: 88.49524420894356
1622
+ - type: cos_sim_spearman
1623
+ value: 88.32407839427714
1624
+ - type: euclidean_pearson
1625
+ value: 87.25098779877104
1626
+ - type: euclidean_spearman
1627
+ value: 88.22738098593608
1628
+ - type: manhattan_pearson
1629
+ value: 87.23872691839607
1630
+ - type: manhattan_spearman
1631
+ value: 88.2002968380165
1632
+ - task:
1633
+ type: Retrieval
1634
+ dataset:
1635
+ type: hotpotqa
1636
+ name: MTEB HotpotQA
1637
+ config: default
1638
+ split: test
1639
+ revision: None
1640
+ metrics:
1641
+ - type: map_at_1
1642
+ value: 31.81
1643
+ - type: map_at_10
1644
+ value: 46.238
1645
+ - type: map_at_100
1646
+ value: 47.141
1647
+ - type: map_at_1000
1648
+ value: 47.213
1649
+ - type: map_at_3
1650
+ value: 43.248999999999995
1651
+ - type: map_at_5
1652
+ value: 45.078
1653
+ - type: mrr_at_1
1654
+ value: 63.619
1655
+ - type: mrr_at_10
1656
+ value: 71.279
1657
+ - type: mrr_at_100
1658
+ value: 71.648
1659
+ - type: mrr_at_1000
1660
+ value: 71.665
1661
+ - type: mrr_at_3
1662
+ value: 69.76599999999999
1663
+ - type: mrr_at_5
1664
+ value: 70.743
1665
+ - type: ndcg_at_1
1666
+ value: 63.619
1667
+ - type: ndcg_at_10
1668
+ value: 55.38999999999999
1669
+ - type: ndcg_at_100
1670
+ value: 58.80800000000001
1671
+ - type: ndcg_at_1000
1672
+ value: 60.331999999999994
1673
+ - type: ndcg_at_3
1674
+ value: 50.727
1675
+ - type: ndcg_at_5
1676
+ value: 53.284
1677
+ - type: precision_at_1
1678
+ value: 63.619
1679
+ - type: precision_at_10
1680
+ value: 11.668000000000001
1681
+ - type: precision_at_100
1682
+ value: 1.434
1683
+ - type: precision_at_1000
1684
+ value: 0.164
1685
+ - type: precision_at_3
1686
+ value: 32.001000000000005
1687
+ - type: precision_at_5
1688
+ value: 21.223
1689
+ - type: recall_at_1
1690
+ value: 31.81
1691
+ - type: recall_at_10
1692
+ value: 58.339
1693
+ - type: recall_at_100
1694
+ value: 71.708
1695
+ - type: recall_at_1000
1696
+ value: 81.85
1697
+ - type: recall_at_3
1698
+ value: 48.001
1699
+ - type: recall_at_5
1700
+ value: 53.059
1701
+ - task:
1702
+ type: Classification
1703
+ dataset:
1704
+ type: mteb/imdb
1705
+ name: MTEB ImdbClassification
1706
+ config: default
1707
+ split: test
1708
+ revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
1709
+ metrics:
1710
+ - type: accuracy
1711
+ value: 68.60640000000001
1712
+ - type: ap
1713
+ value: 62.84296904042086
1714
+ - type: f1
1715
+ value: 68.50643633327537
1716
+ - task:
1717
+ type: Reranking
1718
+ dataset:
1719
+ type: jinaai/miracl
1720
+ name: MTEB MIRACL
1721
+ config: default
1722
+ split: test
1723
+ revision: 8741c3b61cd36ed9ca1b3d4203543a41793239e2
1724
+ metrics:
1725
+ - type: map
1726
+ value: 64.29704335389768
1727
+ - type: mrr
1728
+ value: 72.11962197159565
1729
+ - task:
1730
+ type: Classification
1731
+ dataset:
1732
+ type: mteb/mtop_domain
1733
+ name: MTEB MTOPDomainClassification (en)
1734
+ config: en
1735
+ split: test
1736
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1737
+ metrics:
1738
+ - type: accuracy
1739
+ value: 89.3844049247606
1740
+ - type: f1
1741
+ value: 89.2124328528015
1742
+ - task:
1743
+ type: Classification
1744
+ dataset:
1745
+ type: mteb/mtop_domain
1746
+ name: MTEB MTOPDomainClassification (de)
1747
+ config: de
1748
+ split: test
1749
+ revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
1750
+ metrics:
1751
+ - type: accuracy
1752
+ value: 88.36855452240067
1753
+ - type: f1
1754
+ value: 87.35458822097442
1755
+ - task:
1756
+ type: Classification
1757
+ dataset:
1758
+ type: mteb/mtop_intent
1759
+ name: MTEB MTOPIntentClassification (en)
1760
+ config: en
1761
+ split: test
1762
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1763
+ metrics:
1764
+ - type: accuracy
1765
+ value: 66.48654810761514
1766
+ - type: f1
1767
+ value: 50.07229882504409
1768
+ - task:
1769
+ type: Classification
1770
+ dataset:
1771
+ type: mteb/mtop_intent
1772
+ name: MTEB MTOPIntentClassification (de)
1773
+ config: de
1774
+ split: test
1775
+ revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
1776
+ metrics:
1777
+ - type: accuracy
1778
+ value: 63.832065370526905
1779
+ - type: f1
1780
+ value: 46.283579383385806
1781
+ - task:
1782
+ type: Classification
1783
+ dataset:
1784
+ type: mteb/amazon_massive_intent
1785
+ name: MTEB MassiveIntentClassification (de)
1786
+ config: de
1787
+ split: test
1788
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1789
+ metrics:
1790
+ - type: accuracy
1791
+ value: 63.89038332212509
1792
+ - type: f1
1793
+ value: 61.86279849685129
1794
+ - task:
1795
+ type: Classification
1796
+ dataset:
1797
+ type: mteb/amazon_massive_intent
1798
+ name: MTEB MassiveIntentClassification (en)
1799
+ config: en
1800
+ split: test
1801
+ revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
1802
+ metrics:
1803
+ - type: accuracy
1804
+ value: 69.11230665770006
1805
+ - type: f1
1806
+ value: 67.44780095350535
1807
+ - task:
1808
+ type: Classification
1809
+ dataset:
1810
+ type: mteb/amazon_massive_scenario
1811
+ name: MTEB MassiveScenarioClassification (de)
1812
+ config: de
1813
+ split: test
1814
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1815
+ metrics:
1816
+ - type: accuracy
1817
+ value: 71.25084061869536
1818
+ - type: f1
1819
+ value: 71.43965023016408
1820
+ - task:
1821
+ type: Classification
1822
+ dataset:
1823
+ type: mteb/amazon_massive_scenario
1824
+ name: MTEB MassiveScenarioClassification (en)
1825
+ config: en
1826
+ split: test
1827
+ revision: 7d571f92784cd94a019292a1f45445077d0ef634
1828
+ metrics:
1829
+ - type: accuracy
1830
+ value: 73.73907195696032
1831
+ - type: f1
1832
+ value: 73.69920814839061
1833
+ - task:
1834
+ type: Clustering
1835
+ dataset:
1836
+ type: mteb/medrxiv-clustering-p2p
1837
+ name: MTEB MedrxivClusteringP2P
1838
+ config: default
1839
+ split: test
1840
+ revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
1841
+ metrics:
1842
+ - type: v_measure
1843
+ value: 31.32577306498249
1844
+ - task:
1845
+ type: Clustering
1846
+ dataset:
1847
+ type: mteb/medrxiv-clustering-s2s
1848
+ name: MTEB MedrxivClusteringS2S
1849
+ config: default
1850
+ split: test
1851
+ revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
1852
+ metrics:
1853
+ - type: v_measure
1854
+ value: 28.759349326367783
1855
+ - task:
1856
+ type: Reranking
1857
+ dataset:
1858
+ type: mteb/mind_small
1859
+ name: MTEB MindSmallReranking
1860
+ config: default
1861
+ split: test
1862
+ revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
1863
+ metrics:
1864
+ - type: map
1865
+ value: 30.401342674703425
1866
+ - type: mrr
1867
+ value: 31.384379585660987
1868
+ - task:
1869
+ type: Retrieval
1870
+ dataset:
1871
+ type: nfcorpus
1872
+ name: MTEB NFCorpus
1873
+ config: default
1874
+ split: test
1875
+ revision: None
1876
+ metrics:
1877
+ - type: map_at_1
1878
+ value: 4.855
1879
+ - type: map_at_10
1880
+ value: 10.01
1881
+ - type: map_at_100
1882
+ value: 12.461
1883
+ - type: map_at_1000
1884
+ value: 13.776
1885
+ - type: map_at_3
1886
+ value: 7.252
1887
+ - type: map_at_5
1888
+ value: 8.679
1889
+ - type: mrr_at_1
1890
+ value: 41.176
1891
+ - type: mrr_at_10
1892
+ value: 49.323
1893
+ - type: mrr_at_100
1894
+ value: 49.954
1895
+ - type: mrr_at_1000
1896
+ value: 49.997
1897
+ - type: mrr_at_3
1898
+ value: 46.904
1899
+ - type: mrr_at_5
1900
+ value: 48.375
1901
+ - type: ndcg_at_1
1902
+ value: 39.318999999999996
1903
+ - type: ndcg_at_10
1904
+ value: 28.607
1905
+ - type: ndcg_at_100
1906
+ value: 26.554
1907
+ - type: ndcg_at_1000
1908
+ value: 35.731
1909
+ - type: ndcg_at_3
1910
+ value: 32.897999999999996
1911
+ - type: ndcg_at_5
1912
+ value: 31.53
1913
+ - type: precision_at_1
1914
+ value: 41.176
1915
+ - type: precision_at_10
1916
+ value: 20.867
1917
+ - type: precision_at_100
1918
+ value: 6.796
1919
+ - type: precision_at_1000
1920
+ value: 1.983
1921
+ - type: precision_at_3
1922
+ value: 30.547
1923
+ - type: precision_at_5
1924
+ value: 27.245
1925
+ - type: recall_at_1
1926
+ value: 4.855
1927
+ - type: recall_at_10
1928
+ value: 14.08
1929
+ - type: recall_at_100
1930
+ value: 28.188000000000002
1931
+ - type: recall_at_1000
1932
+ value: 60.07900000000001
1933
+ - type: recall_at_3
1934
+ value: 7.947
1935
+ - type: recall_at_5
1936
+ value: 10.786
1937
+ - task:
1938
+ type: Retrieval
1939
+ dataset:
1940
+ type: nq
1941
+ name: MTEB NQ
1942
+ config: default
1943
+ split: test
1944
+ revision: None
1945
+ metrics:
1946
+ - type: map_at_1
1947
+ value: 26.906999999999996
1948
+ - type: map_at_10
1949
+ value: 41.147
1950
+ - type: map_at_100
1951
+ value: 42.269
1952
+ - type: map_at_1000
1953
+ value: 42.308
1954
+ - type: map_at_3
1955
+ value: 36.638999999999996
1956
+ - type: map_at_5
1957
+ value: 39.285
1958
+ - type: mrr_at_1
1959
+ value: 30.359
1960
+ - type: mrr_at_10
1961
+ value: 43.607
1962
+ - type: mrr_at_100
1963
+ value: 44.454
1964
+ - type: mrr_at_1000
1965
+ value: 44.481
1966
+ - type: mrr_at_3
1967
+ value: 39.644
1968
+ - type: mrr_at_5
1969
+ value: 42.061
1970
+ - type: ndcg_at_1
1971
+ value: 30.330000000000002
1972
+ - type: ndcg_at_10
1973
+ value: 48.899
1974
+ - type: ndcg_at_100
1975
+ value: 53.612
1976
+ - type: ndcg_at_1000
1977
+ value: 54.51200000000001
1978
+ - type: ndcg_at_3
1979
+ value: 40.262
1980
+ - type: ndcg_at_5
1981
+ value: 44.787
1982
+ - type: precision_at_1
1983
+ value: 30.330000000000002
1984
+ - type: precision_at_10
1985
+ value: 8.323
1986
+ - type: precision_at_100
1987
+ value: 1.0959999999999999
1988
+ - type: precision_at_1000
1989
+ value: 0.11800000000000001
1990
+ - type: precision_at_3
1991
+ value: 18.395
1992
+ - type: precision_at_5
1993
+ value: 13.627
1994
+ - type: recall_at_1
1995
+ value: 26.906999999999996
1996
+ - type: recall_at_10
1997
+ value: 70.215
1998
+ - type: recall_at_100
1999
+ value: 90.61200000000001
2000
+ - type: recall_at_1000
2001
+ value: 97.294
2002
+ - type: recall_at_3
2003
+ value: 47.784
2004
+ - type: recall_at_5
2005
+ value: 58.251
2006
+ - task:
2007
+ type: PairClassification
2008
+ dataset:
2009
+ type: paws-x
2010
+ name: MTEB PawsX
2011
+ config: default
2012
+ split: test
2013
+ revision: 8a04d940a42cd40658986fdd8e3da561533a3646
2014
+ metrics:
2015
+ - type: cos_sim_accuracy
2016
+ value: 60.5
2017
+ - type: cos_sim_ap
2018
+ value: 57.606096528877494
2019
+ - type: cos_sim_f1
2020
+ value: 62.24240307369892
2021
+ - type: cos_sim_precision
2022
+ value: 45.27439024390244
2023
+ - type: cos_sim_recall
2024
+ value: 99.55307262569832
2025
+ - type: dot_accuracy
2026
+ value: 57.699999999999996
2027
+ - type: dot_ap
2028
+ value: 51.289351057160616
2029
+ - type: dot_f1
2030
+ value: 62.25953130465197
2031
+ - type: dot_precision
2032
+ value: 45.31568228105906
2033
+ - type: dot_recall
2034
+ value: 99.4413407821229
2035
+ - type: euclidean_accuracy
2036
+ value: 60.45
2037
+ - type: euclidean_ap
2038
+ value: 57.616461421424034
2039
+ - type: euclidean_f1
2040
+ value: 62.313697657913416
2041
+ - type: euclidean_precision
2042
+ value: 45.657826313052524
2043
+ - type: euclidean_recall
2044
+ value: 98.10055865921787
2045
+ - type: manhattan_accuracy
2046
+ value: 60.3
2047
+ - type: manhattan_ap
2048
+ value: 57.580565271667325
2049
+ - type: manhattan_f1
2050
+ value: 62.24240307369892
2051
+ - type: manhattan_precision
2052
+ value: 45.27439024390244
2053
+ - type: manhattan_recall
2054
+ value: 99.55307262569832
2055
+ - type: max_accuracy
2056
+ value: 60.5
2057
+ - type: max_ap
2058
+ value: 57.616461421424034
2059
+ - type: max_f1
2060
+ value: 62.313697657913416
2061
+ - task:
2062
+ type: Retrieval
2063
+ dataset:
2064
+ type: quora
2065
+ name: MTEB QuoraRetrieval
2066
+ config: default
2067
+ split: test
2068
+ revision: None
2069
+ metrics:
2070
+ - type: map_at_1
2071
+ value: 70.21300000000001
2072
+ - type: map_at_10
2073
+ value: 84.136
2074
+ - type: map_at_100
2075
+ value: 84.796
2076
+ - type: map_at_1000
2077
+ value: 84.812
2078
+ - type: map_at_3
2079
+ value: 81.182
2080
+ - type: map_at_5
2081
+ value: 83.027
2082
+ - type: mrr_at_1
2083
+ value: 80.91000000000001
2084
+ - type: mrr_at_10
2085
+ value: 87.155
2086
+ - type: mrr_at_100
2087
+ value: 87.27000000000001
2088
+ - type: mrr_at_1000
2089
+ value: 87.271
2090
+ - type: mrr_at_3
2091
+ value: 86.158
2092
+ - type: mrr_at_5
2093
+ value: 86.828
2094
+ - type: ndcg_at_1
2095
+ value: 80.88
2096
+ - type: ndcg_at_10
2097
+ value: 87.926
2098
+ - type: ndcg_at_100
2099
+ value: 89.223
2100
+ - type: ndcg_at_1000
2101
+ value: 89.321
2102
+ - type: ndcg_at_3
2103
+ value: 85.036
2104
+ - type: ndcg_at_5
2105
+ value: 86.614
2106
+ - type: precision_at_1
2107
+ value: 80.88
2108
+ - type: precision_at_10
2109
+ value: 13.350000000000001
2110
+ - type: precision_at_100
2111
+ value: 1.5310000000000001
2112
+ - type: precision_at_1000
2113
+ value: 0.157
2114
+ - type: precision_at_3
2115
+ value: 37.173
2116
+ - type: precision_at_5
2117
+ value: 24.476
2118
+ - type: recall_at_1
2119
+ value: 70.21300000000001
2120
+ - type: recall_at_10
2121
+ value: 95.12
2122
+ - type: recall_at_100
2123
+ value: 99.535
2124
+ - type: recall_at_1000
2125
+ value: 99.977
2126
+ - type: recall_at_3
2127
+ value: 86.833
2128
+ - type: recall_at_5
2129
+ value: 91.26100000000001
2130
+ - task:
2131
+ type: Clustering
2132
+ dataset:
2133
+ type: mteb/reddit-clustering
2134
+ name: MTEB RedditClustering
2135
+ config: default
2136
+ split: test
2137
+ revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
2138
+ metrics:
2139
+ - type: v_measure
2140
+ value: 47.754688783184875
2141
+ - task:
2142
+ type: Clustering
2143
+ dataset:
2144
+ type: mteb/reddit-clustering-p2p
2145
+ name: MTEB RedditClusteringP2P
2146
+ config: default
2147
+ split: test
2148
+ revision: 282350215ef01743dc01b456c7f5241fa8937f16
2149
+ metrics:
2150
+ - type: v_measure
2151
+ value: 54.875736374329364
2152
+ - task:
2153
+ type: Retrieval
2154
+ dataset:
2155
+ type: scidocs
2156
+ name: MTEB SCIDOCS
2157
+ config: default
2158
+ split: test
2159
+ revision: None
2160
+ metrics:
2161
+ - type: map_at_1
2162
+ value: 3.773
2163
+ - type: map_at_10
2164
+ value: 9.447
2165
+ - type: map_at_100
2166
+ value: 11.1
2167
+ - type: map_at_1000
2168
+ value: 11.37
2169
+ - type: map_at_3
2170
+ value: 6.787
2171
+ - type: map_at_5
2172
+ value: 8.077
2173
+ - type: mrr_at_1
2174
+ value: 18.5
2175
+ - type: mrr_at_10
2176
+ value: 28.227000000000004
2177
+ - type: mrr_at_100
2178
+ value: 29.445
2179
+ - type: mrr_at_1000
2180
+ value: 29.515
2181
+ - type: mrr_at_3
2182
+ value: 25.2
2183
+ - type: mrr_at_5
2184
+ value: 27.055
2185
+ - type: ndcg_at_1
2186
+ value: 18.5
2187
+ - type: ndcg_at_10
2188
+ value: 16.29
2189
+ - type: ndcg_at_100
2190
+ value: 23.250999999999998
2191
+ - type: ndcg_at_1000
2192
+ value: 28.445999999999998
2193
+ - type: ndcg_at_3
2194
+ value: 15.376000000000001
2195
+ - type: ndcg_at_5
2196
+ value: 13.528
2197
+ - type: precision_at_1
2198
+ value: 18.5
2199
+ - type: precision_at_10
2200
+ value: 8.51
2201
+ - type: precision_at_100
2202
+ value: 1.855
2203
+ - type: precision_at_1000
2204
+ value: 0.311
2205
+ - type: precision_at_3
2206
+ value: 14.533
2207
+ - type: precision_at_5
2208
+ value: 12.0
2209
+ - type: recall_at_1
2210
+ value: 3.773
2211
+ - type: recall_at_10
2212
+ value: 17.282
2213
+ - type: recall_at_100
2214
+ value: 37.645
2215
+ - type: recall_at_1000
2216
+ value: 63.138000000000005
2217
+ - type: recall_at_3
2218
+ value: 8.853
2219
+ - type: recall_at_5
2220
+ value: 12.168
2221
+ - task:
2222
+ type: STS
2223
+ dataset:
2224
+ type: mteb/sickr-sts
2225
+ name: MTEB SICK-R
2226
+ config: default
2227
+ split: test
2228
+ revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
2229
+ metrics:
2230
+ - type: cos_sim_pearson
2231
+ value: 85.32789517976525
2232
+ - type: cos_sim_spearman
2233
+ value: 80.32750384145629
2234
+ - type: euclidean_pearson
2235
+ value: 81.5025131452508
2236
+ - type: euclidean_spearman
2237
+ value: 80.24797115147175
2238
+ - type: manhattan_pearson
2239
+ value: 81.51634463412002
2240
+ - type: manhattan_spearman
2241
+ value: 80.24614721495055
2242
+ - task:
2243
+ type: STS
2244
+ dataset:
2245
+ type: mteb/sts12-sts
2246
+ name: MTEB STS12
2247
+ config: default
2248
+ split: test
2249
+ revision: a0d554a64d88156834ff5ae9920b964011b16384
2250
+ metrics:
2251
+ - type: cos_sim_pearson
2252
+ value: 88.47050448992432
2253
+ - type: cos_sim_spearman
2254
+ value: 80.58919997743621
2255
+ - type: euclidean_pearson
2256
+ value: 85.83258918113664
2257
+ - type: euclidean_spearman
2258
+ value: 80.97441389240902
2259
+ - type: manhattan_pearson
2260
+ value: 85.7798262013878
2261
+ - type: manhattan_spearman
2262
+ value: 80.97208703064196
2263
+ - task:
2264
+ type: STS
2265
+ dataset:
2266
+ type: mteb/sts13-sts
2267
+ name: MTEB STS13
2268
+ config: default
2269
+ split: test
2270
+ revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
2271
+ metrics:
2272
+ - type: cos_sim_pearson
2273
+ value: 85.95341439711532
2274
+ - type: cos_sim_spearman
2275
+ value: 86.59127484634989
2276
+ - type: euclidean_pearson
2277
+ value: 85.57850603454227
2278
+ - type: euclidean_spearman
2279
+ value: 86.47130477363419
2280
+ - type: manhattan_pearson
2281
+ value: 85.59387925447652
2282
+ - type: manhattan_spearman
2283
+ value: 86.50665427391583
2284
+ - task:
2285
+ type: STS
2286
+ dataset:
2287
+ type: mteb/sts14-sts
2288
+ name: MTEB STS14
2289
+ config: default
2290
+ split: test
2291
+ revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
2292
+ metrics:
2293
+ - type: cos_sim_pearson
2294
+ value: 85.39810909161844
2295
+ - type: cos_sim_spearman
2296
+ value: 82.98595295546008
2297
+ - type: euclidean_pearson
2298
+ value: 84.04681129969951
2299
+ - type: euclidean_spearman
2300
+ value: 82.98197460689866
2301
+ - type: manhattan_pearson
2302
+ value: 83.9918798171185
2303
+ - type: manhattan_spearman
2304
+ value: 82.91148131768082
2305
+ - task:
2306
+ type: STS
2307
+ dataset:
2308
+ type: mteb/sts15-sts
2309
+ name: MTEB STS15
2310
+ config: default
2311
+ split: test
2312
+ revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
2313
+ metrics:
2314
+ - type: cos_sim_pearson
2315
+ value: 88.02072712147692
2316
+ - type: cos_sim_spearman
2317
+ value: 88.78821332623012
2318
+ - type: euclidean_pearson
2319
+ value: 88.12132045572747
2320
+ - type: euclidean_spearman
2321
+ value: 88.74273451067364
2322
+ - type: manhattan_pearson
2323
+ value: 88.05431550059166
2324
+ - type: manhattan_spearman
2325
+ value: 88.67610233020723
2326
+ - task:
2327
+ type: STS
2328
+ dataset:
2329
+ type: mteb/sts16-sts
2330
+ name: MTEB STS16
2331
+ config: default
2332
+ split: test
2333
+ revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
2334
+ metrics:
2335
+ - type: cos_sim_pearson
2336
+ value: 82.96134704624787
2337
+ - type: cos_sim_spearman
2338
+ value: 84.44062976314666
2339
+ - type: euclidean_pearson
2340
+ value: 84.03642536310323
2341
+ - type: euclidean_spearman
2342
+ value: 84.4535014579785
2343
+ - type: manhattan_pearson
2344
+ value: 83.92874228901483
2345
+ - type: manhattan_spearman
2346
+ value: 84.33634314951631
2347
+ - task:
2348
+ type: STS
2349
+ dataset:
2350
+ type: mteb/sts17-crosslingual-sts
2351
+ name: MTEB STS17 (en-de)
2352
+ config: en-de
2353
+ split: test
2354
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2355
+ metrics:
2356
+ - type: cos_sim_pearson
2357
+ value: 87.3154168064887
2358
+ - type: cos_sim_spearman
2359
+ value: 86.72393652571682
2360
+ - type: euclidean_pearson
2361
+ value: 86.04193246174164
2362
+ - type: euclidean_spearman
2363
+ value: 86.30482896608093
2364
+ - type: manhattan_pearson
2365
+ value: 85.95524084651859
2366
+ - type: manhattan_spearman
2367
+ value: 86.06031431994282
2368
+ - task:
2369
+ type: STS
2370
+ dataset:
2371
+ type: mteb/sts17-crosslingual-sts
2372
+ name: MTEB STS17 (en-en)
2373
+ config: en-en
2374
+ split: test
2375
+ revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
2376
+ metrics:
2377
+ - type: cos_sim_pearson
2378
+ value: 89.91079682750804
2379
+ - type: cos_sim_spearman
2380
+ value: 89.30961836617064
2381
+ - type: euclidean_pearson
2382
+ value: 88.86249564158628
2383
+ - type: euclidean_spearman
2384
+ value: 89.04772899592396
2385
+ - type: manhattan_pearson
2386
+ value: 88.85579791315043
2387
+ - type: manhattan_spearman
2388
+ value: 88.94190462541333
2389
+ - task:
2390
+ type: STS
2391
+ dataset:
2392
+ type: mteb/sts22-crosslingual-sts
2393
+ name: MTEB STS22 (en)
2394
+ config: en
2395
+ split: test
2396
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2397
+ metrics:
2398
+ - type: cos_sim_pearson
2399
+ value: 67.00558145551088
2400
+ - type: cos_sim_spearman
2401
+ value: 67.96601170393878
2402
+ - type: euclidean_pearson
2403
+ value: 67.87627043214336
2404
+ - type: euclidean_spearman
2405
+ value: 66.76402572303859
2406
+ - type: manhattan_pearson
2407
+ value: 67.88306560555452
2408
+ - type: manhattan_spearman
2409
+ value: 66.6273862035506
2410
+ - task:
2411
+ type: STS
2412
+ dataset:
2413
+ type: mteb/sts22-crosslingual-sts
2414
+ name: MTEB STS22 (de)
2415
+ config: de
2416
+ split: test
2417
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2418
+ metrics:
2419
+ - type: cos_sim_pearson
2420
+ value: 50.83759332748726
2421
+ - type: cos_sim_spearman
2422
+ value: 59.066344562858006
2423
+ - type: euclidean_pearson
2424
+ value: 50.08955848154131
2425
+ - type: euclidean_spearman
2426
+ value: 58.36517305855221
2427
+ - type: manhattan_pearson
2428
+ value: 50.05257267223111
2429
+ - type: manhattan_spearman
2430
+ value: 58.37570252804986
2431
+ - task:
2432
+ type: STS
2433
+ dataset:
2434
+ type: mteb/sts22-crosslingual-sts
2435
+ name: MTEB STS22 (de-en)
2436
+ config: de-en
2437
+ split: test
2438
+ revision: 6d1ba47164174a496b7fa5d3569dae26a6813b80
2439
+ metrics:
2440
+ - type: cos_sim_pearson
2441
+ value: 59.22749007956492
2442
+ - type: cos_sim_spearman
2443
+ value: 55.97282077657827
2444
+ - type: euclidean_pearson
2445
+ value: 62.10661533695752
2446
+ - type: euclidean_spearman
2447
+ value: 53.62780854854067
2448
+ - type: manhattan_pearson
2449
+ value: 62.37138085709719
2450
+ - type: manhattan_spearman
2451
+ value: 54.17556356828155
2452
+ - task:
2453
+ type: STS
2454
+ dataset:
2455
+ type: mteb/stsbenchmark-sts
2456
+ name: MTEB STSBenchmark
2457
+ config: default
2458
+ split: test
2459
+ revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
2460
+ metrics:
2461
+ - type: cos_sim_pearson
2462
+ value: 87.91145397065878
2463
+ - type: cos_sim_spearman
2464
+ value: 88.13960018389005
2465
+ - type: euclidean_pearson
2466
+ value: 87.67618876224006
2467
+ - type: euclidean_spearman
2468
+ value: 87.99119480810556
2469
+ - type: manhattan_pearson
2470
+ value: 87.67920297334753
2471
+ - type: manhattan_spearman
2472
+ value: 87.99113250064492
2473
+ - task:
2474
+ type: Reranking
2475
+ dataset:
2476
+ type: mteb/scidocs-reranking
2477
+ name: MTEB SciDocsRR
2478
+ config: default
2479
+ split: test
2480
+ revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
2481
+ metrics:
2482
+ - type: map
2483
+ value: 78.09133563707582
2484
+ - type: mrr
2485
+ value: 93.2415288052543
2486
+ - task:
2487
+ type: Retrieval
2488
+ dataset:
2489
+ type: scifact
2490
+ name: MTEB SciFact
2491
+ config: default
2492
+ split: test
2493
+ revision: None
2494
+ metrics:
2495
+ - type: map_at_1
2496
+ value: 47.760999999999996
2497
+ - type: map_at_10
2498
+ value: 56.424
2499
+ - type: map_at_100
2500
+ value: 57.24399999999999
2501
+ - type: map_at_1000
2502
+ value: 57.278
2503
+ - type: map_at_3
2504
+ value: 53.68000000000001
2505
+ - type: map_at_5
2506
+ value: 55.442
2507
+ - type: mrr_at_1
2508
+ value: 50.666999999999994
2509
+ - type: mrr_at_10
2510
+ value: 58.012
2511
+ - type: mrr_at_100
2512
+ value: 58.736
2513
+ - type: mrr_at_1000
2514
+ value: 58.769000000000005
2515
+ - type: mrr_at_3
2516
+ value: 56.056
2517
+ - type: mrr_at_5
2518
+ value: 57.321999999999996
2519
+ - type: ndcg_at_1
2520
+ value: 50.666999999999994
2521
+ - type: ndcg_at_10
2522
+ value: 60.67700000000001
2523
+ - type: ndcg_at_100
2524
+ value: 64.513
2525
+ - type: ndcg_at_1000
2526
+ value: 65.62400000000001
2527
+ - type: ndcg_at_3
2528
+ value: 56.186
2529
+ - type: ndcg_at_5
2530
+ value: 58.692
2531
+ - type: precision_at_1
2532
+ value: 50.666999999999994
2533
+ - type: precision_at_10
2534
+ value: 8.200000000000001
2535
+ - type: precision_at_100
2536
+ value: 1.023
2537
+ - type: precision_at_1000
2538
+ value: 0.11199999999999999
2539
+ - type: precision_at_3
2540
+ value: 21.889
2541
+ - type: precision_at_5
2542
+ value: 14.866999999999999
2543
+ - type: recall_at_1
2544
+ value: 47.760999999999996
2545
+ - type: recall_at_10
2546
+ value: 72.006
2547
+ - type: recall_at_100
2548
+ value: 89.767
2549
+ - type: recall_at_1000
2550
+ value: 98.833
2551
+ - type: recall_at_3
2552
+ value: 60.211000000000006
2553
+ - type: recall_at_5
2554
+ value: 66.3
2555
+ - task:
2556
+ type: PairClassification
2557
+ dataset:
2558
+ type: mteb/sprintduplicatequestions-pairclassification
2559
+ name: MTEB SprintDuplicateQuestions
2560
+ config: default
2561
+ split: test
2562
+ revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
2563
+ metrics:
2564
+ - type: cos_sim_accuracy
2565
+ value: 99.79009900990098
2566
+ - type: cos_sim_ap
2567
+ value: 94.86690691995835
2568
+ - type: cos_sim_f1
2569
+ value: 89.37875751503007
2570
+ - type: cos_sim_precision
2571
+ value: 89.5582329317269
2572
+ - type: cos_sim_recall
2573
+ value: 89.2
2574
+ - type: dot_accuracy
2575
+ value: 99.76336633663367
2576
+ - type: dot_ap
2577
+ value: 94.26453740761586
2578
+ - type: dot_f1
2579
+ value: 88.00783162016641
2580
+ - type: dot_precision
2581
+ value: 86.19367209971237
2582
+ - type: dot_recall
2583
+ value: 89.9
2584
+ - type: euclidean_accuracy
2585
+ value: 99.7940594059406
2586
+ - type: euclidean_ap
2587
+ value: 94.85459757524379
2588
+ - type: euclidean_f1
2589
+ value: 89.62779156327544
2590
+ - type: euclidean_precision
2591
+ value: 88.96551724137932
2592
+ - type: euclidean_recall
2593
+ value: 90.3
2594
+ - type: manhattan_accuracy
2595
+ value: 99.79009900990098
2596
+ - type: manhattan_ap
2597
+ value: 94.76971336654465
2598
+ - type: manhattan_f1
2599
+ value: 89.35323383084577
2600
+ - type: manhattan_precision
2601
+ value: 88.91089108910892
2602
+ - type: manhattan_recall
2603
+ value: 89.8
2604
+ - type: max_accuracy
2605
+ value: 99.7940594059406
2606
+ - type: max_ap
2607
+ value: 94.86690691995835
2608
+ - type: max_f1
2609
+ value: 89.62779156327544
2610
+ - task:
2611
+ type: Clustering
2612
+ dataset:
2613
+ type: mteb/stackexchange-clustering
2614
+ name: MTEB StackExchangeClustering
2615
+ config: default
2616
+ split: test
2617
+ revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
2618
+ metrics:
2619
+ - type: v_measure
2620
+ value: 55.38197670064987
2621
+ - task:
2622
+ type: Clustering
2623
+ dataset:
2624
+ type: mteb/stackexchange-clustering-p2p
2625
+ name: MTEB StackExchangeClusteringP2P
2626
+ config: default
2627
+ split: test
2628
+ revision: 815ca46b2622cec33ccafc3735d572c266efdb44
2629
+ metrics:
2630
+ - type: v_measure
2631
+ value: 33.08330158937971
2632
+ - task:
2633
+ type: Reranking
2634
+ dataset:
2635
+ type: mteb/stackoverflowdupquestions-reranking
2636
+ name: MTEB StackOverflowDupQuestions
2637
+ config: default
2638
+ split: test
2639
+ revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
2640
+ metrics:
2641
+ - type: map
2642
+ value: 49.50367079063226
2643
+ - type: mrr
2644
+ value: 50.30444943128768
2645
+ - task:
2646
+ type: Summarization
2647
+ dataset:
2648
+ type: mteb/summeval
2649
+ name: MTEB SummEval
2650
+ config: default
2651
+ split: test
2652
+ revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
2653
+ metrics:
2654
+ - type: cos_sim_pearson
2655
+ value: 30.37739520909561
2656
+ - type: cos_sim_spearman
2657
+ value: 31.548500943973913
2658
+ - type: dot_pearson
2659
+ value: 29.983610104303
2660
+ - type: dot_spearman
2661
+ value: 29.90185869098618
2662
+ - task:
2663
+ type: Retrieval
2664
+ dataset:
2665
+ type: trec-covid
2666
+ name: MTEB TRECCOVID
2667
+ config: default
2668
+ split: test
2669
+ revision: None
2670
+ metrics:
2671
+ - type: map_at_1
2672
+ value: 0.198
2673
+ - type: map_at_10
2674
+ value: 1.5810000000000002
2675
+ - type: map_at_100
2676
+ value: 9.064
2677
+ - type: map_at_1000
2678
+ value: 22.161
2679
+ - type: map_at_3
2680
+ value: 0.536
2681
+ - type: map_at_5
2682
+ value: 0.8370000000000001
2683
+ - type: mrr_at_1
2684
+ value: 80.0
2685
+ - type: mrr_at_10
2686
+ value: 86.75
2687
+ - type: mrr_at_100
2688
+ value: 86.799
2689
+ - type: mrr_at_1000
2690
+ value: 86.799
2691
+ - type: mrr_at_3
2692
+ value: 85.0
2693
+ - type: mrr_at_5
2694
+ value: 86.5
2695
+ - type: ndcg_at_1
2696
+ value: 73.0
2697
+ - type: ndcg_at_10
2698
+ value: 65.122
2699
+ - type: ndcg_at_100
2700
+ value: 51.853
2701
+ - type: ndcg_at_1000
2702
+ value: 47.275
2703
+ - type: ndcg_at_3
2704
+ value: 66.274
2705
+ - type: ndcg_at_5
2706
+ value: 64.826
2707
+ - type: precision_at_1
2708
+ value: 80.0
2709
+ - type: precision_at_10
2710
+ value: 70.19999999999999
2711
+ - type: precision_at_100
2712
+ value: 53.480000000000004
2713
+ - type: precision_at_1000
2714
+ value: 20.946
2715
+ - type: precision_at_3
2716
+ value: 71.333
2717
+ - type: precision_at_5
2718
+ value: 70.0
2719
+ - type: recall_at_1
2720
+ value: 0.198
2721
+ - type: recall_at_10
2722
+ value: 1.884
2723
+ - type: recall_at_100
2724
+ value: 12.57
2725
+ - type: recall_at_1000
2726
+ value: 44.208999999999996
2727
+ - type: recall_at_3
2728
+ value: 0.5890000000000001
2729
+ - type: recall_at_5
2730
+ value: 0.95
2731
+ - task:
2732
+ type: Clustering
2733
+ dataset:
2734
+ type: slvnwhrl/tenkgnad-clustering-p2p
2735
+ name: MTEB TenKGnadClusteringP2P
2736
+ config: default
2737
+ split: test
2738
+ revision: 5c59e41555244b7e45c9a6be2d720ab4bafae558
2739
+ metrics:
2740
+ - type: v_measure
2741
+ value: 42.84199261133083
2742
+ - task:
2743
+ type: Clustering
2744
+ dataset:
2745
+ type: slvnwhrl/tenkgnad-clustering-s2s
2746
+ name: MTEB TenKGnadClusteringS2S
2747
+ config: default
2748
+ split: test
2749
+ revision: 6cddbe003f12b9b140aec477b583ac4191f01786
2750
+ metrics:
2751
+ - type: v_measure
2752
+ value: 23.689557114798838
2753
+ - task:
2754
+ type: Retrieval
2755
+ dataset:
2756
+ type: webis-touche2020
2757
+ name: MTEB Touche2020
2758
+ config: default
2759
+ split: test
2760
+ revision: None
2761
+ metrics:
2762
+ - type: map_at_1
2763
+ value: 1.941
2764
+ - type: map_at_10
2765
+ value: 8.222
2766
+ - type: map_at_100
2767
+ value: 14.277999999999999
2768
+ - type: map_at_1000
2769
+ value: 15.790000000000001
2770
+ - type: map_at_3
2771
+ value: 4.4670000000000005
2772
+ - type: map_at_5
2773
+ value: 5.762
2774
+ - type: mrr_at_1
2775
+ value: 24.490000000000002
2776
+ - type: mrr_at_10
2777
+ value: 38.784
2778
+ - type: mrr_at_100
2779
+ value: 39.724
2780
+ - type: mrr_at_1000
2781
+ value: 39.724
2782
+ - type: mrr_at_3
2783
+ value: 33.333
2784
+ - type: mrr_at_5
2785
+ value: 37.415
2786
+ - type: ndcg_at_1
2787
+ value: 22.448999999999998
2788
+ - type: ndcg_at_10
2789
+ value: 21.026
2790
+ - type: ndcg_at_100
2791
+ value: 33.721000000000004
2792
+ - type: ndcg_at_1000
2793
+ value: 45.045
2794
+ - type: ndcg_at_3
2795
+ value: 20.053
2796
+ - type: ndcg_at_5
2797
+ value: 20.09
2798
+ - type: precision_at_1
2799
+ value: 24.490000000000002
2800
+ - type: precision_at_10
2801
+ value: 19.796
2802
+ - type: precision_at_100
2803
+ value: 7.469
2804
+ - type: precision_at_1000
2805
+ value: 1.48
2806
+ - type: precision_at_3
2807
+ value: 21.769
2808
+ - type: precision_at_5
2809
+ value: 21.224
2810
+ - type: recall_at_1
2811
+ value: 1.941
2812
+ - type: recall_at_10
2813
+ value: 14.915999999999999
2814
+ - type: recall_at_100
2815
+ value: 46.155
2816
+ - type: recall_at_1000
2817
+ value: 80.664
2818
+ - type: recall_at_3
2819
+ value: 5.629
2820
+ - type: recall_at_5
2821
+ value: 8.437
2822
+ - task:
2823
+ type: Classification
2824
+ dataset:
2825
+ type: mteb/toxic_conversations_50k
2826
+ name: MTEB ToxicConversationsClassification
2827
+ config: default
2828
+ split: test
2829
+ revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
2830
+ metrics:
2831
+ - type: accuracy
2832
+ value: 69.64800000000001
2833
+ - type: ap
2834
+ value: 12.914826731261094
2835
+ - type: f1
2836
+ value: 53.05213503422915
2837
+ - task:
2838
+ type: Classification
2839
+ dataset:
2840
+ type: mteb/tweet_sentiment_extraction
2841
+ name: MTEB TweetSentimentExtractionClassification
2842
+ config: default
2843
+ split: test
2844
+ revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
2845
+ metrics:
2846
+ - type: accuracy
2847
+ value: 60.427277872099594
2848
+ - type: f1
2849
+ value: 60.78292007556828
2850
+ - task:
2851
+ type: Clustering
2852
+ dataset:
2853
+ type: mteb/twentynewsgroups-clustering
2854
+ name: MTEB TwentyNewsgroupsClustering
2855
+ config: default
2856
+ split: test
2857
+ revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
2858
+ metrics:
2859
+ - type: v_measure
2860
+ value: 40.48134168406559
2861
+ - task:
2862
+ type: PairClassification
2863
+ dataset:
2864
+ type: mteb/twittersemeval2015-pairclassification
2865
+ name: MTEB TwitterSemEval2015
2866
+ config: default
2867
+ split: test
2868
+ revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
2869
+ metrics:
2870
+ - type: cos_sim_accuracy
2871
+ value: 84.79465935506944
2872
+ - type: cos_sim_ap
2873
+ value: 70.24589055290592
2874
+ - type: cos_sim_f1
2875
+ value: 65.0994575045208
2876
+ - type: cos_sim_precision
2877
+ value: 63.76518218623482
2878
+ - type: cos_sim_recall
2879
+ value: 66.49076517150397
2880
+ - type: dot_accuracy
2881
+ value: 84.63968528342374
2882
+ - type: dot_ap
2883
+ value: 69.84683095084355
2884
+ - type: dot_f1
2885
+ value: 64.50606169727523
2886
+ - type: dot_precision
2887
+ value: 59.1719885487778
2888
+ - type: dot_recall
2889
+ value: 70.89709762532982
2890
+ - type: euclidean_accuracy
2891
+ value: 84.76485664898374
2892
+ - type: euclidean_ap
2893
+ value: 70.20556438685551
2894
+ - type: euclidean_f1
2895
+ value: 65.06796614516543
2896
+ - type: euclidean_precision
2897
+ value: 63.29840319361277
2898
+ - type: euclidean_recall
2899
+ value: 66.93931398416886
2900
+ - type: manhattan_accuracy
2901
+ value: 84.72313286046374
2902
+ - type: manhattan_ap
2903
+ value: 70.17151475534308
2904
+ - type: manhattan_f1
2905
+ value: 65.31379180759113
2906
+ - type: manhattan_precision
2907
+ value: 62.17505366086334
2908
+ - type: manhattan_recall
2909
+ value: 68.7862796833773
2910
+ - type: max_accuracy
2911
+ value: 84.79465935506944
2912
+ - type: max_ap
2913
+ value: 70.24589055290592
2914
+ - type: max_f1
2915
+ value: 65.31379180759113
2916
+ - task:
2917
+ type: PairClassification
2918
+ dataset:
2919
+ type: mteb/twitterurlcorpus-pairclassification
2920
+ name: MTEB TwitterURLCorpus
2921
+ config: default
2922
+ split: test
2923
+ revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
2924
+ metrics:
2925
+ - type: cos_sim_accuracy
2926
+ value: 88.95874568246207
2927
+ - type: cos_sim_ap
2928
+ value: 85.82517548264127
2929
+ - type: cos_sim_f1
2930
+ value: 78.22288041466125
2931
+ - type: cos_sim_precision
2932
+ value: 75.33875338753387
2933
+ - type: cos_sim_recall
2934
+ value: 81.33661841700031
2935
+ - type: dot_accuracy
2936
+ value: 88.836496293709
2937
+ - type: dot_ap
2938
+ value: 85.53430720252186
2939
+ - type: dot_f1
2940
+ value: 78.10616085869725
2941
+ - type: dot_precision
2942
+ value: 74.73269555430501
2943
+ - type: dot_recall
2944
+ value: 81.79858330766862
2945
+ - type: euclidean_accuracy
2946
+ value: 88.92769821865176
2947
+ - type: euclidean_ap
2948
+ value: 85.65904346964223
2949
+ - type: euclidean_f1
2950
+ value: 77.98774074208407
2951
+ - type: euclidean_precision
2952
+ value: 73.72282795035315
2953
+ - type: euclidean_recall
2954
+ value: 82.77640899291654
2955
+ - type: manhattan_accuracy
2956
+ value: 88.86366282454303
2957
+ - type: manhattan_ap
2958
+ value: 85.61599642231819
2959
+ - type: manhattan_f1
2960
+ value: 78.01480509061737
2961
+ - type: manhattan_precision
2962
+ value: 74.10460685833044
2963
+ - type: manhattan_recall
2964
+ value: 82.36064059131506
2965
+ - type: max_accuracy
2966
+ value: 88.95874568246207
2967
+ - type: max_ap
2968
+ value: 85.82517548264127
2969
+ - type: max_f1
2970
+ value: 78.22288041466125
2971
+ - task:
2972
+ type: Retrieval
2973
+ dataset:
2974
+ type: None
2975
+ name: MTEB WikiCLIR
2976
+ config: default
2977
+ split: test
2978
+ revision: None
2979
+ metrics:
2980
+ - type: map_at_1
2981
+ value: 3.9539999999999997
2982
+ - type: map_at_10
2983
+ value: 7.407
2984
+ - type: map_at_100
2985
+ value: 8.677999999999999
2986
+ - type: map_at_1000
2987
+ value: 9.077
2988
+ - type: map_at_3
2989
+ value: 5.987
2990
+ - type: map_at_5
2991
+ value: 6.6979999999999995
2992
+ - type: mrr_at_1
2993
+ value: 35.65
2994
+ - type: mrr_at_10
2995
+ value: 45.097
2996
+ - type: mrr_at_100
2997
+ value: 45.83
2998
+ - type: mrr_at_1000
2999
+ value: 45.871
3000
+ - type: mrr_at_3
3001
+ value: 42.63
3002
+ - type: mrr_at_5
3003
+ value: 44.104
3004
+ - type: ndcg_at_1
3005
+ value: 29.215000000000003
3006
+ - type: ndcg_at_10
3007
+ value: 22.694
3008
+ - type: ndcg_at_100
3009
+ value: 22.242
3010
+ - type: ndcg_at_1000
3011
+ value: 27.069
3012
+ - type: ndcg_at_3
3013
+ value: 27.641
3014
+ - type: ndcg_at_5
3015
+ value: 25.503999999999998
3016
+ - type: precision_at_1
3017
+ value: 35.65
3018
+ - type: precision_at_10
3019
+ value: 12.795000000000002
3020
+ - type: precision_at_100
3021
+ value: 3.354
3022
+ - type: precision_at_1000
3023
+ value: 0.743
3024
+ - type: precision_at_3
3025
+ value: 23.403
3026
+ - type: precision_at_5
3027
+ value: 18.474
3028
+ - type: recall_at_1
3029
+ value: 3.9539999999999997
3030
+ - type: recall_at_10
3031
+ value: 11.301
3032
+ - type: recall_at_100
3033
+ value: 22.919999999999998
3034
+ - type: recall_at_1000
3035
+ value: 40.146
3036
+ - type: recall_at_3
3037
+ value: 7.146
3038
+ - type: recall_at_5
3039
+ value: 8.844000000000001
3040
+ - task:
3041
+ type: Retrieval
3042
+ dataset:
3043
+ type: jinaai/xmarket_de
3044
+ name: MTEB XMarket
3045
+ config: default
3046
+ split: test
3047
+ revision: 2336818db4c06570fcdf263e1bcb9993b786f67a
3048
+ metrics:
3049
+ - type: map_at_1
3050
+ value: 4.872
3051
+ - type: map_at_10
3052
+ value: 10.658
3053
+ - type: map_at_100
3054
+ value: 13.422999999999998
3055
+ - type: map_at_1000
3056
+ value: 14.245
3057
+ - type: map_at_3
3058
+ value: 7.857
3059
+ - type: map_at_5
3060
+ value: 9.142999999999999
3061
+ - type: mrr_at_1
3062
+ value: 16.744999999999997
3063
+ - type: mrr_at_10
3064
+ value: 24.416
3065
+ - type: mrr_at_100
3066
+ value: 25.432
3067
+ - type: mrr_at_1000
3068
+ value: 25.502999999999997
3069
+ - type: mrr_at_3
3070
+ value: 22.096
3071
+ - type: mrr_at_5
3072
+ value: 23.421
3073
+ - type: ndcg_at_1
3074
+ value: 16.695999999999998
3075
+ - type: ndcg_at_10
3076
+ value: 18.66
3077
+ - type: ndcg_at_100
3078
+ value: 24.314
3079
+ - type: ndcg_at_1000
3080
+ value: 29.846
3081
+ - type: ndcg_at_3
3082
+ value: 17.041999999999998
3083
+ - type: ndcg_at_5
3084
+ value: 17.585
3085
+ - type: precision_at_1
3086
+ value: 16.695999999999998
3087
+ - type: precision_at_10
3088
+ value: 10.374
3089
+ - type: precision_at_100
3090
+ value: 3.988
3091
+ - type: precision_at_1000
3092
+ value: 1.1860000000000002
3093
+ - type: precision_at_3
3094
+ value: 14.21
3095
+ - type: precision_at_5
3096
+ value: 12.623000000000001
3097
+ - type: recall_at_1
3098
+ value: 4.872
3099
+ - type: recall_at_10
3100
+ value: 18.624
3101
+ - type: recall_at_100
3102
+ value: 40.988
3103
+ - type: recall_at_1000
3104
+ value: 65.33
3105
+ - type: recall_at_3
3106
+ value: 10.162
3107
+ - type: recall_at_5
3108
+ value: 13.517999999999999
3109
+ ---
3110
+ <!-- TODO: add evaluation results here -->
3111
+ <br><br>
3112
+
3113
+ <p align="center">
3114
+ <img src="https://huggingface.co/datasets/jinaai/documentation-images/resolve/main/logo.webp" alt="Jina AI: Your Search Foundation, Supercharged!" width="150px">
3115
+ </p>
3116
+
3117
+
3118
+ <p align="center">
3119
+ <b>The text embedding set trained by <a href="https://jina.ai/"><b>Jina AI</b></a>.</b>
3120
+ </p>
3121
+
3122
+ ## Quick Start
3123
+
3124
+ The easiest way to start using `jina-embeddings-v2-base-de` is Jina AI's [Embedding API](https://jina.ai/embeddings/).
3125
+
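+ For example, a minimal sketch of calling the API over HTTP (the request format shown here follows Jina AI's OpenAI-compatible embeddings API, and `JINA_API_KEY` is a placeholder for your own key; see the API docs for the authoritative interface):
+
+ ```python
+ import os
+
+ import requests
+
+ # Illustrative request against the hosted Embedding API (endpoint and payload assumed from the OpenAI-style API).
+ response = requests.post(
+     "https://api.jina.ai/v1/embeddings",
+     headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},  # JINA_API_KEY is a placeholder
+     json={
+         "model": "jina-embeddings-v2-base-de",
+         "input": ["How is the weather today?", "Wie ist das Wetter heute?"],
+     },
+ )
+ embeddings = [item["embedding"] for item in response.json()["data"]]
+ ```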
3126
+ ## Intended Usage & Model Info
3127
+
3128
+ `jina-embeddings-v2-base-de` is a German/English bilingual text **embedding model** supporting a sequence length of up to **8192 tokens**.
3129
+ It is based on a BERT architecture (JinaBERT) that uses the symmetric bidirectional variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer sequence lengths.
3130
+ We designed it for high performance in monolingual and cross-lingual applications and trained it specifically to support mixed German-English input without bias.
3131
+ Additionally, we provide the following embedding models:
3132
+
3133
+ `jina-embeddings-v2-base-de` is a bilingual **text embedding model** for German and English
3134
+ that supports text inputs of up to **8192 tokens**.
3135
+ It is based on the adapted BERT architecture JinaBERT,
3136
+ which uses a symmetric variant of [ALiBi](https://arxiv.org/abs/2108.12409) to allow longer input texts.
3137
+ We built the model for high performance in monolingual and cross-lingual applications and trained it specifically
3138
+ to encode mixed German-English inputs without bias.
3139
+ Furthermore, we provide the following embedding models:
3140
+
3141
+ - [`jina-embeddings-v2-small-en`](https://huggingface.co/jinaai/jina-embeddings-v2-small-en): 33 million parameters.
3142
+ - [`jina-embeddings-v2-base-en`](https://huggingface.co/jinaai/jina-embeddings-v2-base-en): 137 million parameters.
3143
+ - [`jina-embeddings-v2-base-zh`](https://huggingface.co/jinaai/jina-embeddings-v2-base-zh): 161 million parameters Chinese-English Bilingual embeddings.
3144
+ - [`jina-embeddings-v2-base-de`](https://huggingface.co/jinaai/jina-embeddings-v2-base-de): 161 million parameters German-English Bilingual embeddings **(you are here)**.
3145
+ - [`jina-embeddings-v2-base-es`](): Spanish-English Bilingual embeddings (soon).
3146
+ - [`jina-embeddings-v2-base-code`](https://huggingface.co/jinaai/jina-embeddings-v2-base-code): 161 million parameters code embeddings.
3147
+
3148
+ ## Data & Parameters
3149
+
3150
+ The data and training details are described in this [technical report](https://arxiv.org/abs/2402.17016).
3151
+
3152
+ ## Usage
3153
+
3154
+ **<details><summary>Please apply mean pooling when integrating the model.</summary>**
3155
+ <p>
3156
+
3157
+ ### Why mean pooling?
3158
+
3159
+ `mean pooling` takes all token embeddings from the model output and averages them at the sentence/paragraph level.
3160
+ It has proven to be the most effective way to produce high-quality sentence embeddings.
3161
+ We offer an `encode` function that handles this for you.
3162
+
3163
+ However, if you would like to do it without using the default `encode` function:
3164
+
3165
+ ```python
3166
+ import torch
3167
+ import torch.nn.functional as F
3168
+ from transformers import AutoTokenizer, AutoModel
3169
+
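+ # Average the token embeddings into a single sentence embedding, ignoring padding positions via the attention mask.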
3170
+ def mean_pooling(model_output, attention_mask):
3171
+ token_embeddings = model_output[0]
3172
+ input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
3173
+ return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
3174
+
3175
+ sentences = ['How is the weather today?', 'What is the current weather like today?']
3176
+
3177
+ tokenizer = AutoTokenizer.from_pretrained('jinaai/jina-embeddings-v2-base-de')
3178
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True, torch_dtype=torch.bfloat16)
3179
+
3180
+ encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
3181
+
3182
+ with torch.no_grad():
3183
+ model_output = model(**encoded_input)
3184
+
3185
+ embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
3186
+ embeddings = F.normalize(embeddings, p=2, dim=1)
3187
+ ```
3188
+
3189
+ </p>
3190
+ </details>
3191
+
3192
+ You can use Jina Embedding models directly from the `transformers` package.
3193
+
3194
+ ```python
3195
+ !pip install transformers
3196
+ import torch
3197
+ from transformers import AutoModel
3198
+ from numpy.linalg import norm
3199
+
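+ # cosine similarity between two (unnormalized) embedding vectors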
3200
+ cos_sim = lambda a,b: (a @ b.T) / (norm(a)*norm(b))
3201
+ model = AutoModel.from_pretrained('jinaai/jina-embeddings-v2-base-de', trust_remote_code=True, torch_dtype=torch.bfloat16)
3202
+ embeddings = model.encode(['How is the weather today?', 'Wie ist das Wetter heute?'])
3203
+ print(cos_sim(embeddings[0], embeddings[1]))
3204
+ ```
3205
+
3206
+ If you only want to handle shorter sequences, such as 2k tokens, pass the `max_length` parameter to the `encode` function:
3207
+
3208
+ ```python
3209
+ embeddings = model.encode(
3210
+ ['Very long ... document'],
3211
+ max_length=2048
3212
+ )
3213
+ ```
3214
+
3215
+ As of its latest release (v2.3.0), sentence-transformers also supports Jina embeddings (please make sure that you are logged in to Hugging Face as well):
3216
+
3217
+ ```python
3218
+ !pip install -U sentence-transformers
3219
+ from sentence_transformers import SentenceTransformer
3220
+ from sentence_transformers.util import cos_sim
3221
+
3222
+ model = SentenceTransformer(
3223
+ "jinaai/jina-embeddings-v2-base-de", # switch to en/zh for English or Chinese
3224
+ trust_remote_code=True
3225
+ )
3226
+
3227
+ # control the maximum input sequence length (up to 8192 tokens)
3228
+ model.max_seq_length = 1024
3229
+
3230
+ embeddings = model.encode([
3231
+ 'How is the weather today?',
3232
+ 'Wie ist das Wetter heute?'
3233
+ ])
3234
+ print(cos_sim(embeddings[0], embeddings[1]))
3235
+ ```
3236
+
3237
+ ## Alternatives to Using Transformers Package
3238
+
3239
+ 1. _Managed SaaS_: Get started with a free key on Jina AI's [Embedding API](https://jina.ai/embeddings/).
3240
+ 2. _Private and high-performance deployment_: Get started by picking from our suite of models and deploying them on [AWS SageMaker](https://aws.amazon.com/marketplace/seller-profile?id=seller-stch2ludm6vgy).
3241
+
3242
+ ## Benchmark Results
3243
+
3244
+ We evaluated our bilingual model on all German and English evaluation tasks available in the [MTEB benchmark](https://huggingface.co/blog/mteb). In addition, we compared it against several other German, English, and multilingual models on additional German evaluation tasks:
3245
+
3246
+ <img src="de_evaluation_results.png" width="780px">
3247
+
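+ To reproduce individual scores, you can run the model through the `mteb` package (a minimal sketch assuming its classic `MTEB` interface; the task chosen here is only an example):
+
+ ```python
+ from mteb import MTEB
+ from sentence_transformers import SentenceTransformer
+
+ # Load the model via sentence-transformers and evaluate it on one MTEB task.
+ model = SentenceTransformer("jinaai/jina-embeddings-v2-base-de", trust_remote_code=True)
+ evaluation = MTEB(tasks=["STSBenchmark"])
+ evaluation.run(model, output_folder="results/jina-embeddings-v2-base-de")
+ ```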
3248
+ ## Use Jina Embeddings for RAG
3249
+
3250
+ According to the latest blog post from [LlamaIndex](https://blog.llamaindex.ai/boosting-rag-picking-the-best-embedding-reranker-models-42d079022e83),
3251
+
3252
+ > In summary, to achieve the peak performance in both hit rate and MRR, the combination of OpenAI or JinaAI-Base embeddings with the CohereRerank/bge-reranker-large reranker stands out.
3253
+
3254
+ <img src="https://miro.medium.com/v2/resize:fit:4800/format:webp/1*ZP2RVejCZovF3FDCg-Bx3A.png" width="780px">
3255
+
3256
+ ## Contact
3257
+
3258
+ Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.
3259
+
3260
+ ## Citation
3261
+
3262
+ If you find Jina Embeddings useful in your research, please cite the following paper:
3263
+
3264
+ ```
3265
+ @article{mohr2024multi,
3266
+ title={Multi-Task Contrastive Learning for 8192-Token Bilingual Text Embeddings},
3267
+ author={Mohr, Isabelle and Krimmel, Markus and Sturua, Saba and Akram, Mohammad Kalim and Koukounas, Andreas and G{\"u}nther, Michael and Mastrapas, Georgios and Ravishankar, Vinit and Mart{\'\i}nez, Joan Fontanals and Wang, Feng and others},
3268
+ journal={arXiv preprint arXiv:2402.17016},
3269
+ year={2024}
3270
+ }
3271
+ ```
config.json ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "_name_or_path": "jinaai/jina-bert-implementation",
3
+ "model_max_length": 8192,
4
+ "architectures": [
5
+ "JinaBertForMaskedLM"
6
+ ],
7
+ "attention_probs_dropout_prob": 0.0,
8
+ "auto_map": {
9
+ "AutoConfig": "jinaai/jina-bert-implementation--configuration_bert.JinaBertConfig",
10
+ "AutoModelForMaskedLM": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForMaskedLM",
11
+ "AutoModel": "jinaai/jina-bert-implementation--modeling_bert.JinaBertModel",
12
+ "AutoModelForSequenceClassification": "jinaai/jina-bert-implementation--modeling_bert.JinaBertForSequenceClassification"
13
+ },
14
+ "classifier_dropout": null,
15
+ "emb_pooler": "mean",
16
+ "feed_forward_type": "geglu",
17
+ "gradient_checkpointing": false,
18
+ "hidden_act": "gelu",
19
+ "hidden_dropout_prob": 0.1,
20
+ "hidden_size": 768,
21
+ "initializer_range": 0.02,
22
+ "intermediate_size": 3072,
23
+ "layer_norm_eps": 1e-12,
24
+ "max_position_embeddings": 8192,
25
+ "model_type": "bert",
26
+ "num_attention_heads": 12,
27
+ "num_hidden_layers": 12,
28
+ "pad_token_id": 0,
29
+ "position_embedding_type": "alibi",
30
+ "torch_dtype": "float16",
31
+ "transformers_version": "4.31.0",
32
+ "type_vocab_size": 2,
33
+ "use_cache": true,
34
+ "vocab_size": 61056
35
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "2.2.2",
4
+ "transformers": "4.31.0",
5
+ "pytorch": "2.0.1"
6
+ }
7
+ }
gitattributes ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0ffca9240c502ef64a63917baf0e60648a7a438dfd49a4f91ad4d5dab3428cf8
3
+ size 321648328
modules.json ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Normalize",
18
+ "type": "sentence_transformers.models.Normalize"
19
+ }
20
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false,
4
+ "model_args": {"trust_remote_code": true}
5
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "bos_token": "<s>",
3
+ "cls_token": "<s>",
4
+ "eos_token": "</s>",
5
+ "mask_token": {
6
+ "content": "<mask>",
7
+ "lstrip": true,
8
+ "normalized": false,
9
+ "rstrip": false,
10
+ "single_word": false
11
+ },
12
+ "pad_token": "<pad>",
13
+ "sep_token": "</s>",
14
+ "unk_token": "<unk>"
15
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "add_prefix_space": false,
3
+ "added_tokens_decoder": {
4
+ "0": {
5
+ "content": "<s>",
6
+ "lstrip": false,
7
+ "normalized": true,
8
+ "rstrip": false,
9
+ "single_word": false,
10
+ "special": true
11
+ },
12
+ "1": {
13
+ "content": "<pad>",
14
+ "lstrip": false,
15
+ "normalized": true,
16
+ "rstrip": false,
17
+ "single_word": false,
18
+ "special": true
19
+ },
20
+ "2": {
21
+ "content": "</s>",
22
+ "lstrip": false,
23
+ "normalized": true,
24
+ "rstrip": false,
25
+ "single_word": false,
26
+ "special": true
27
+ },
28
+ "3": {
29
+ "content": "<unk>",
30
+ "lstrip": false,
31
+ "normalized": true,
32
+ "rstrip": false,
33
+ "single_word": false,
34
+ "special": true
35
+ },
36
+ "4": {
37
+ "content": "<mask>",
38
+ "lstrip": true,
39
+ "normalized": false,
40
+ "rstrip": false,
41
+ "single_word": false,
42
+ "special": true
43
+ }
44
+ },
45
+ "bos_token": "<s>",
46
+ "clean_up_tokenization_spaces": true,
47
+ "cls_token": "<s>",
48
+ "eos_token": "</s>",
49
+ "errors": "replace",
50
+ "mask_token": "<mask>",
51
+ "model_max_length": 512,
52
+ "pad_token": "<pad>",
53
+ "sep_token": "</s>",
54
+ "tokenizer_class": "RobertaTokenizer",
55
+ "trim_offsets": true,
56
+ "unk_token": "<unk>"
57
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff