---
license: apache-2.0
base_model:
- Qwen/Qwen2-VL-2B-Instruct
language:
- en
- zh
tags:
- mteb
- sentence-transformers
- transformers
- Qwen2-VL
- sentence-similarity
- vidore
model-index:
- name: gme-Qwen2-VL-2B-Instruct
results:
- task:
type: Classification
dataset:
type: mteb/amazon_counterfactual
name: MTEB AmazonCounterfactualClassification (en)
config: en
split: test
revision: e8379541af4e31359cca9fbcf4b00f2671dba205
metrics:
- type: accuracy
value: 72.55223880597015
- type: ap
value: 35.01515316721116
- type: f1
value: 66.44086070814382
- task:
type: Classification
dataset:
type: mteb/amazon_polarity
name: MTEB AmazonPolarityClassification
config: default
split: test
revision: e2d317d38cd51312af73b3d32a06d1a08b442046
metrics:
- type: accuracy
value: 96.75819999999999
- type: ap
value: 95.51009242092881
- type: f1
value: 96.75713119357414
- task:
type: Classification
dataset:
type: mteb/amazon_reviews_multi
name: MTEB AmazonReviewsClassification (en)
config: en
split: test
revision: 1399c76144fd37290681b995c656ef9b2e06e26d
metrics:
- type: accuracy
value: 61.971999999999994
- type: f1
value: 60.50745575187704
- task:
type: Retrieval
dataset:
type: mteb/arguana
name: MTEB ArguAna
config: default
split: test
revision: c22ab2a51041ffd869aaddef7af8d8215647e41a
metrics:
- type: map_at_1
value: 36.272999999999996
- type: map_at_10
value: 52.782
- type: map_at_100
value: 53.339999999999996
- type: map_at_1000
value: 53.342999999999996
- type: map_at_3
value: 48.4
- type: map_at_5
value: 50.882000000000005
- type: mrr_at_1
value: 36.984
- type: mrr_at_10
value: 53.052
- type: mrr_at_100
value: 53.604
- type: mrr_at_1000
value: 53.607000000000006
- type: mrr_at_3
value: 48.613
- type: mrr_at_5
value: 51.159
- type: ndcg_at_1
value: 36.272999999999996
- type: ndcg_at_10
value: 61.524
- type: ndcg_at_100
value: 63.796
- type: ndcg_at_1000
value: 63.869
- type: ndcg_at_3
value: 52.456
- type: ndcg_at_5
value: 56.964000000000006
- type: precision_at_1
value: 36.272999999999996
- type: precision_at_10
value: 8.926
- type: precision_at_100
value: 0.989
- type: precision_at_1000
value: 0.1
- type: precision_at_3
value: 21.407999999999998
- type: precision_at_5
value: 15.049999999999999
- type: recall_at_1
value: 36.272999999999996
- type: recall_at_10
value: 89.25999999999999
- type: recall_at_100
value: 98.933
- type: recall_at_1000
value: 99.502
- type: recall_at_3
value: 64.225
- type: recall_at_5
value: 75.249
- task:
type: Clustering
dataset:
type: mteb/arxiv-clustering-p2p
name: MTEB ArxivClusteringP2P
config: default
split: test
revision: a122ad7f3f0291bf49cc6f4d32aa80929df69d5d
metrics:
- type: v_measure
value: 52.45236368396085
- task:
type: Clustering
dataset:
type: mteb/arxiv-clustering-s2s
name: MTEB ArxivClusteringS2S
config: default
split: test
revision: f910caf1a6075f7329cdf8c1a6135696f37dbd53
metrics:
- type: v_measure
value: 46.83781937870832
- task:
type: Reranking
dataset:
type: mteb/askubuntudupquestions-reranking
name: MTEB AskUbuntuDupQuestions
config: default
split: test
revision: 2000358ca161889fa9c082cb41daa8dcfb161a54
metrics:
- type: map
value: 60.653430349851746
- type: mrr
value: 74.28736314470387
- task:
type: STS
dataset:
type: mteb/biosses-sts
name: MTEB BIOSSES
config: default
split: test
revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
metrics:
- type: cos_sim_pearson
value: 89.18568151905953
- type: cos_sim_spearman
value: 86.47666922475281
- type: euclidean_pearson
value: 87.25416218056225
- type: euclidean_spearman
value: 86.47666922475281
- type: manhattan_pearson
value: 87.04960508086356
- type: manhattan_spearman
value: 86.73992823533615
- task:
type: Classification
dataset:
type: mteb/banking77
name: MTEB Banking77Classification
config: default
split: test
revision: 0fd18e25b25c072e09e0d92ab615fda904d66300
metrics:
- type: accuracy
value: 80.2435064935065
- type: f1
value: 79.44078343737895
- task:
type: Clustering
dataset:
type: mteb/biorxiv-clustering-p2p
name: MTEB BiorxivClusteringP2P
config: default
split: test
revision: 65b79d1d13f80053f67aca9498d9402c2d9f1f40
metrics:
- type: v_measure
value: 44.68220155432257
- task:
type: Clustering
dataset:
type: mteb/biorxiv-clustering-s2s
name: MTEB BiorxivClusteringS2S
config: default
split: test
revision: 258694dd0231531bc1fd9de6ceb52a0853c6d908
metrics:
- type: v_measure
value: 40.666150477589284
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackAndroidRetrieval
config: default
split: test
revision: f46a197baaae43b4f621051089b82a364682dfeb
metrics:
- type: map_at_1
value: 30.623
- type: map_at_10
value: 40.482
- type: map_at_100
value: 41.997
- type: map_at_1000
value: 42.135
- type: map_at_3
value: 37.754
- type: map_at_5
value: 39.031
- type: mrr_at_1
value: 37.482
- type: mrr_at_10
value: 46.311
- type: mrr_at_100
value: 47.211999999999996
- type: mrr_at_1000
value: 47.27
- type: mrr_at_3
value: 44.157999999999994
- type: mrr_at_5
value: 45.145
- type: ndcg_at_1
value: 37.482
- type: ndcg_at_10
value: 46.142
- type: ndcg_at_100
value: 51.834
- type: ndcg_at_1000
value: 54.164
- type: ndcg_at_3
value: 42.309000000000005
- type: ndcg_at_5
value: 43.485
- type: precision_at_1
value: 37.482
- type: precision_at_10
value: 8.455
- type: precision_at_100
value: 1.3780000000000001
- type: precision_at_1000
value: 0.188
- type: precision_at_3
value: 20.172
- type: precision_at_5
value: 13.705
- type: recall_at_1
value: 30.623
- type: recall_at_10
value: 56.77100000000001
- type: recall_at_100
value: 80.034
- type: recall_at_1000
value: 94.62899999999999
- type: recall_at_3
value: 44.663000000000004
- type: recall_at_5
value: 48.692
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackEnglishRetrieval
config: default
split: test
revision: ad9991cb51e31e31e430383c75ffb2885547b5f0
metrics:
- type: map_at_1
value: 27.941
- type: map_at_10
value: 38.437
- type: map_at_100
value: 39.625
- type: map_at_1000
value: 39.753
- type: map_at_3
value: 35.388999999999996
- type: map_at_5
value: 37.113
- type: mrr_at_1
value: 34.522000000000006
- type: mrr_at_10
value: 43.864999999999995
- type: mrr_at_100
value: 44.533
- type: mrr_at_1000
value: 44.580999999999996
- type: mrr_at_3
value: 41.55
- type: mrr_at_5
value: 42.942
- type: ndcg_at_1
value: 34.522000000000006
- type: ndcg_at_10
value: 44.330000000000005
- type: ndcg_at_100
value: 48.61
- type: ndcg_at_1000
value: 50.712999999999994
- type: ndcg_at_3
value: 39.834
- type: ndcg_at_5
value: 42.016
- type: precision_at_1
value: 34.522000000000006
- type: precision_at_10
value: 8.471
- type: precision_at_100
value: 1.3379999999999999
- type: precision_at_1000
value: 0.182
- type: precision_at_3
value: 19.363
- type: precision_at_5
value: 13.898
- type: recall_at_1
value: 27.941
- type: recall_at_10
value: 55.336
- type: recall_at_100
value: 73.51100000000001
- type: recall_at_1000
value: 86.636
- type: recall_at_3
value: 42.54
- type: recall_at_5
value: 48.392
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackGamingRetrieval
config: default
split: test
revision: 4885aa143210c98657558c04aaf3dc47cfb54340
metrics:
- type: map_at_1
value: 32.681
- type: map_at_10
value: 45.48
- type: map_at_100
value: 46.542
- type: map_at_1000
value: 46.604
- type: map_at_3
value: 42.076
- type: map_at_5
value: 44.076
- type: mrr_at_1
value: 37.492
- type: mrr_at_10
value: 48.746
- type: mrr_at_100
value: 49.485
- type: mrr_at_1000
value: 49.517
- type: mrr_at_3
value: 45.998
- type: mrr_at_5
value: 47.681000000000004
- type: ndcg_at_1
value: 37.492
- type: ndcg_at_10
value: 51.778999999999996
- type: ndcg_at_100
value: 56.294
- type: ndcg_at_1000
value: 57.58
- type: ndcg_at_3
value: 45.856
- type: ndcg_at_5
value: 48.968
- type: precision_at_1
value: 37.492
- type: precision_at_10
value: 8.620999999999999
- type: precision_at_100
value: 1.189
- type: precision_at_1000
value: 0.135
- type: precision_at_3
value: 20.773
- type: precision_at_5
value: 14.596
- type: recall_at_1
value: 32.681
- type: recall_at_10
value: 67.196
- type: recall_at_100
value: 87.027
- type: recall_at_1000
value: 96.146
- type: recall_at_3
value: 51.565000000000005
- type: recall_at_5
value: 59.123999999999995
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackGisRetrieval
config: default
split: test
revision: 5003b3064772da1887988e05400cf3806fe491f2
metrics:
- type: map_at_1
value: 22.421
- type: map_at_10
value: 30.127
- type: map_at_100
value: 31.253999999999998
- type: map_at_1000
value: 31.344
- type: map_at_3
value: 27.673
- type: map_at_5
value: 29.182000000000002
- type: mrr_at_1
value: 24.068
- type: mrr_at_10
value: 31.857000000000003
- type: mrr_at_100
value: 32.808
- type: mrr_at_1000
value: 32.881
- type: mrr_at_3
value: 29.397000000000002
- type: mrr_at_5
value: 30.883
- type: ndcg_at_1
value: 24.068
- type: ndcg_at_10
value: 34.642
- type: ndcg_at_100
value: 40.327
- type: ndcg_at_1000
value: 42.55
- type: ndcg_at_3
value: 29.868
- type: ndcg_at_5
value: 32.461
- type: precision_at_1
value: 24.068
- type: precision_at_10
value: 5.390000000000001
- type: precision_at_100
value: 0.873
- type: precision_at_1000
value: 0.109
- type: precision_at_3
value: 12.692999999999998
- type: precision_at_5
value: 9.107
- type: recall_at_1
value: 22.421
- type: recall_at_10
value: 46.846
- type: recall_at_100
value: 73.409
- type: recall_at_1000
value: 90.06
- type: recall_at_3
value: 34.198
- type: recall_at_5
value: 40.437
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackMathematicaRetrieval
config: default
split: test
revision: 90fceea13679c63fe563ded68f3b6f06e50061de
metrics:
- type: map_at_1
value: 16.494
- type: map_at_10
value: 24.4
- type: map_at_100
value: 25.718999999999998
- type: map_at_1000
value: 25.840000000000003
- type: map_at_3
value: 21.731
- type: map_at_5
value: 23.247999999999998
- type: mrr_at_1
value: 20.274
- type: mrr_at_10
value: 28.866000000000003
- type: mrr_at_100
value: 29.889
- type: mrr_at_1000
value: 29.957
- type: mrr_at_3
value: 26.284999999999997
- type: mrr_at_5
value: 27.79
- type: ndcg_at_1
value: 20.274
- type: ndcg_at_10
value: 29.666999999999998
- type: ndcg_at_100
value: 36.095
- type: ndcg_at_1000
value: 38.87
- type: ndcg_at_3
value: 24.672
- type: ndcg_at_5
value: 27.106
- type: precision_at_1
value: 20.274
- type: precision_at_10
value: 5.5969999999999995
- type: precision_at_100
value: 1.04
- type: precision_at_1000
value: 0.14100000000000001
- type: precision_at_3
value: 12.023
- type: precision_at_5
value: 8.98
- type: recall_at_1
value: 16.494
- type: recall_at_10
value: 41.400999999999996
- type: recall_at_100
value: 69.811
- type: recall_at_1000
value: 89.422
- type: recall_at_3
value: 27.834999999999997
- type: recall_at_5
value: 33.774
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackPhysicsRetrieval
config: default
split: test
revision: 79531abbd1fb92d06c6d6315a0cbbbf5bb247ea4
metrics:
- type: map_at_1
value: 26.150000000000002
- type: map_at_10
value: 36.012
- type: map_at_100
value: 37.377
- type: map_at_1000
value: 37.497
- type: map_at_3
value: 32.712
- type: map_at_5
value: 34.475
- type: mrr_at_1
value: 32.05
- type: mrr_at_10
value: 41.556
- type: mrr_at_100
value: 42.451
- type: mrr_at_1000
value: 42.498000000000005
- type: mrr_at_3
value: 38.659
- type: mrr_at_5
value: 40.314
- type: ndcg_at_1
value: 32.05
- type: ndcg_at_10
value: 42.132
- type: ndcg_at_100
value: 48.028999999999996
- type: ndcg_at_1000
value: 50.229
- type: ndcg_at_3
value: 36.622
- type: ndcg_at_5
value: 39.062000000000005
- type: precision_at_1
value: 32.05
- type: precision_at_10
value: 7.767
- type: precision_at_100
value: 1.269
- type: precision_at_1000
value: 0.164
- type: precision_at_3
value: 17.355999999999998
- type: precision_at_5
value: 12.474
- type: recall_at_1
value: 26.150000000000002
- type: recall_at_10
value: 55.205000000000005
- type: recall_at_100
value: 80.2
- type: recall_at_1000
value: 94.524
- type: recall_at_3
value: 39.322
- type: recall_at_5
value: 45.761
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackProgrammersRetrieval
config: default
split: test
revision: 6184bc1440d2dbc7612be22b50686b8826d22b32
metrics:
- type: map_at_1
value: 23.741
- type: map_at_10
value: 33.51
- type: map_at_100
value: 34.882999999999996
- type: map_at_1000
value: 34.995
- type: map_at_3
value: 30.514000000000003
- type: map_at_5
value: 32.085
- type: mrr_at_1
value: 28.653000000000002
- type: mrr_at_10
value: 38.059
- type: mrr_at_100
value: 39.050000000000004
- type: mrr_at_1000
value: 39.107
- type: mrr_at_3
value: 35.445
- type: mrr_at_5
value: 36.849
- type: ndcg_at_1
value: 28.653000000000002
- type: ndcg_at_10
value: 39.186
- type: ndcg_at_100
value: 45.301
- type: ndcg_at_1000
value: 47.547
- type: ndcg_at_3
value: 34.103
- type: ndcg_at_5
value: 36.239
- type: precision_at_1
value: 28.653000000000002
- type: precision_at_10
value: 7.295
- type: precision_at_100
value: 1.2189999999999999
- type: precision_at_1000
value: 0.159
- type: precision_at_3
value: 16.438
- type: precision_at_5
value: 11.804
- type: recall_at_1
value: 23.741
- type: recall_at_10
value: 51.675000000000004
- type: recall_at_100
value: 78.13799999999999
- type: recall_at_1000
value: 93.12700000000001
- type: recall_at_3
value: 37.033
- type: recall_at_5
value: 42.793
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackStatsRetrieval
config: default
split: test
revision: 65ac3a16b8e91f9cee4c9828cc7c335575432a2a
metrics:
- type: map_at_1
value: 23.452
- type: map_at_10
value: 30.231
- type: map_at_100
value: 31.227
- type: map_at_1000
value: 31.338
- type: map_at_3
value: 28.083000000000002
- type: map_at_5
value: 29.125
- type: mrr_at_1
value: 25.613000000000003
- type: mrr_at_10
value: 32.62
- type: mrr_at_100
value: 33.469
- type: mrr_at_1000
value: 33.554
- type: mrr_at_3
value: 30.368000000000002
- type: mrr_at_5
value: 31.502999999999997
- type: ndcg_at_1
value: 25.613000000000003
- type: ndcg_at_10
value: 34.441
- type: ndcg_at_100
value: 39.253
- type: ndcg_at_1000
value: 42.105
- type: ndcg_at_3
value: 30.183
- type: ndcg_at_5
value: 31.917
- type: precision_at_1
value: 25.613000000000003
- type: precision_at_10
value: 5.367999999999999
- type: precision_at_100
value: 0.848
- type: precision_at_1000
value: 0.117
- type: precision_at_3
value: 12.73
- type: precision_at_5
value: 8.773
- type: recall_at_1
value: 23.452
- type: recall_at_10
value: 45.021
- type: recall_at_100
value: 66.563
- type: recall_at_1000
value: 87.713
- type: recall_at_3
value: 33.433
- type: recall_at_5
value: 37.637
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackTexRetrieval
config: default
split: test
revision: 46989137a86843e03a6195de44b09deda022eec7
metrics:
- type: map_at_1
value: 16.11
- type: map_at_10
value: 22.832
- type: map_at_100
value: 23.829
- type: map_at_1000
value: 23.959
- type: map_at_3
value: 20.66
- type: map_at_5
value: 21.851000000000003
- type: mrr_at_1
value: 19.408
- type: mrr_at_10
value: 26.354
- type: mrr_at_100
value: 27.237000000000002
- type: mrr_at_1000
value: 27.32
- type: mrr_at_3
value: 24.243000000000002
- type: mrr_at_5
value: 25.430000000000003
- type: ndcg_at_1
value: 19.408
- type: ndcg_at_10
value: 27.239
- type: ndcg_at_100
value: 32.286
- type: ndcg_at_1000
value: 35.498000000000005
- type: ndcg_at_3
value: 23.244
- type: ndcg_at_5
value: 25.080999999999996
- type: precision_at_1
value: 19.408
- type: precision_at_10
value: 4.917
- type: precision_at_100
value: 0.874
- type: precision_at_1000
value: 0.133
- type: precision_at_3
value: 10.863
- type: precision_at_5
value: 7.887
- type: recall_at_1
value: 16.11
- type: recall_at_10
value: 37.075
- type: recall_at_100
value: 60.251999999999995
- type: recall_at_1000
value: 83.38600000000001
- type: recall_at_3
value: 25.901999999999997
- type: recall_at_5
value: 30.612000000000002
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackUnixRetrieval
config: default
split: test
revision: 6c6430d3a6d36f8d2a829195bc5dc94d7e063e53
metrics:
- type: map_at_1
value: 25.941
- type: map_at_10
value: 33.711999999999996
- type: map_at_100
value: 34.926
- type: map_at_1000
value: 35.05
- type: map_at_3
value: 31.075000000000003
- type: map_at_5
value: 32.611000000000004
- type: mrr_at_1
value: 30.784
- type: mrr_at_10
value: 38.079
- type: mrr_at_100
value: 39.018
- type: mrr_at_1000
value: 39.09
- type: mrr_at_3
value: 35.603
- type: mrr_at_5
value: 36.988
- type: ndcg_at_1
value: 30.784
- type: ndcg_at_10
value: 38.586
- type: ndcg_at_100
value: 44.205
- type: ndcg_at_1000
value: 46.916000000000004
- type: ndcg_at_3
value: 33.899
- type: ndcg_at_5
value: 36.11
- type: precision_at_1
value: 30.784
- type: precision_at_10
value: 6.409
- type: precision_at_100
value: 1.034
- type: precision_at_1000
value: 0.13799999999999998
- type: precision_at_3
value: 15.112
- type: precision_at_5
value: 10.728
- type: recall_at_1
value: 25.941
- type: recall_at_10
value: 49.242999999999995
- type: recall_at_100
value: 73.85000000000001
- type: recall_at_1000
value: 92.782
- type: recall_at_3
value: 36.204
- type: recall_at_5
value: 41.908
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackWebmastersRetrieval
config: default
split: test
revision: 160c094312a0e1facb97e55eeddb698c0abe3571
metrics:
- type: map_at_1
value: 24.401999999999997
- type: map_at_10
value: 33.195
- type: map_at_100
value: 34.699999999999996
- type: map_at_1000
value: 34.946
- type: map_at_3
value: 30.570999999999998
- type: map_at_5
value: 32.0
- type: mrr_at_1
value: 28.656
- type: mrr_at_10
value: 37.039
- type: mrr_at_100
value: 38.049
- type: mrr_at_1000
value: 38.108
- type: mrr_at_3
value: 34.717
- type: mrr_at_5
value: 36.07
- type: ndcg_at_1
value: 28.656
- type: ndcg_at_10
value: 38.557
- type: ndcg_at_100
value: 44.511
- type: ndcg_at_1000
value: 47.346
- type: ndcg_at_3
value: 34.235
- type: ndcg_at_5
value: 36.260999999999996
- type: precision_at_1
value: 28.656
- type: precision_at_10
value: 7.312
- type: precision_at_100
value: 1.451
- type: precision_at_1000
value: 0.242
- type: precision_at_3
value: 15.942
- type: precision_at_5
value: 11.66
- type: recall_at_1
value: 24.401999999999997
- type: recall_at_10
value: 48.791000000000004
- type: recall_at_100
value: 76.211
- type: recall_at_1000
value: 93.92
- type: recall_at_3
value: 36.975
- type: recall_at_5
value: 42.01
- task:
type: Retrieval
dataset:
type: BeIR/cqadupstack
name: MTEB CQADupstackWordpressRetrieval
config: default
split: test
revision: 4ffe81d471b1924886b33c7567bfb200e9eec5c4
metrics:
- type: map_at_1
value: 19.07
- type: map_at_10
value: 26.608999999999998
- type: map_at_100
value: 27.625
- type: map_at_1000
value: 27.743000000000002
- type: map_at_3
value: 24.532999999999998
- type: map_at_5
value: 25.671
- type: mrr_at_1
value: 20.518
- type: mrr_at_10
value: 28.541
- type: mrr_at_100
value: 29.453000000000003
- type: mrr_at_1000
value: 29.536
- type: mrr_at_3
value: 26.71
- type: mrr_at_5
value: 27.708
- type: ndcg_at_1
value: 20.518
- type: ndcg_at_10
value: 30.855
- type: ndcg_at_100
value: 35.973
- type: ndcg_at_1000
value: 38.827
- type: ndcg_at_3
value: 26.868
- type: ndcg_at_5
value: 28.74
- type: precision_at_1
value: 20.518
- type: precision_at_10
value: 4.843
- type: precision_at_100
value: 0.799
- type: precision_at_1000
value: 0.116
- type: precision_at_3
value: 11.645
- type: precision_at_5
value: 8.133
- type: recall_at_1
value: 19.07
- type: recall_at_10
value: 41.925000000000004
- type: recall_at_100
value: 65.68
- type: recall_at_1000
value: 86.713
- type: recall_at_3
value: 31.251
- type: recall_at_5
value: 35.653
- task:
type: Retrieval
dataset:
type: mteb/climate-fever
name: MTEB ClimateFEVER
config: default
split: test
revision: 47f2ac6acb640fc46020b02a5b59fdda04d39380
metrics:
- type: map_at_1
value: 18.762
- type: map_at_10
value: 32.412
- type: map_at_100
value: 34.506
- type: map_at_1000
value: 34.678
- type: map_at_3
value: 27.594
- type: map_at_5
value: 30.128
- type: mrr_at_1
value: 42.345
- type: mrr_at_10
value: 54.443
- type: mrr_at_100
value: 55.05799999999999
- type: mrr_at_1000
value: 55.076
- type: mrr_at_3
value: 51.553000000000004
- type: mrr_at_5
value: 53.269
- type: ndcg_at_1
value: 42.345
- type: ndcg_at_10
value: 42.304
- type: ndcg_at_100
value: 49.425000000000004
- type: ndcg_at_1000
value: 52.123
- type: ndcg_at_3
value: 36.271
- type: ndcg_at_5
value: 38.216
- type: precision_at_1
value: 42.345
- type: precision_at_10
value: 12.808
- type: precision_at_100
value: 2.062
- type: precision_at_1000
value: 0.258
- type: precision_at_3
value: 26.840000000000003
- type: precision_at_5
value: 20.052
- type: recall_at_1
value: 18.762
- type: recall_at_10
value: 47.976
- type: recall_at_100
value: 71.86
- type: recall_at_1000
value: 86.61999999999999
- type: recall_at_3
value: 32.708999999999996
- type: recall_at_5
value: 39.151
- task:
type: Retrieval
dataset:
type: mteb/dbpedia
name: MTEB DBPedia
config: default
split: test
revision: c0f706b76e590d620bd6618b3ca8efdd34e2d659
metrics:
- type: map_at_1
value: 9.685
- type: map_at_10
value: 21.65
- type: map_at_100
value: 30.952
- type: map_at_1000
value: 33.049
- type: map_at_3
value: 14.953
- type: map_at_5
value: 17.592
- type: mrr_at_1
value: 72.0
- type: mrr_at_10
value: 78.054
- type: mrr_at_100
value: 78.41900000000001
- type: mrr_at_1000
value: 78.425
- type: mrr_at_3
value: 76.5
- type: mrr_at_5
value: 77.28699999999999
- type: ndcg_at_1
value: 61.25000000000001
- type: ndcg_at_10
value: 46.306000000000004
- type: ndcg_at_100
value: 50.867
- type: ndcg_at_1000
value: 58.533
- type: ndcg_at_3
value: 50.857
- type: ndcg_at_5
value: 48.283
- type: precision_at_1
value: 72.0
- type: precision_at_10
value: 37.3
- type: precision_at_100
value: 11.95
- type: precision_at_1000
value: 2.528
- type: precision_at_3
value: 53.583000000000006
- type: precision_at_5
value: 46.6
- type: recall_at_1
value: 9.685
- type: recall_at_10
value: 27.474999999999998
- type: recall_at_100
value: 56.825
- type: recall_at_1000
value: 81.792
- type: recall_at_3
value: 15.939
- type: recall_at_5
value: 19.853
- task:
type: Classification
dataset:
type: mteb/emotion
name: MTEB EmotionClassification
config: default
split: test
revision: 4f58c6b202a23cf9a4da393831edf4f9183cad37
metrics:
- type: accuracy
value: 62.805000000000014
- type: f1
value: 56.401757250989384
- task:
type: Retrieval
dataset:
type: mteb/fever
name: MTEB FEVER
config: default
split: test
revision: bea83ef9e8fb933d90a2f1d5515737465d613e12
metrics:
- type: map_at_1
value: 83.734
- type: map_at_10
value: 90.089
- type: map_at_100
value: 90.274
- type: map_at_1000
value: 90.286
- type: map_at_3
value: 89.281
- type: map_at_5
value: 89.774
- type: mrr_at_1
value: 90.039
- type: mrr_at_10
value: 94.218
- type: mrr_at_100
value: 94.24
- type: mrr_at_1000
value: 94.24
- type: mrr_at_3
value: 93.979
- type: mrr_at_5
value: 94.137
- type: ndcg_at_1
value: 90.039
- type: ndcg_at_10
value: 92.597
- type: ndcg_at_100
value: 93.147
- type: ndcg_at_1000
value: 93.325
- type: ndcg_at_3
value: 91.64999999999999
- type: ndcg_at_5
value: 92.137
- type: precision_at_1
value: 90.039
- type: precision_at_10
value: 10.809000000000001
- type: precision_at_100
value: 1.133
- type: precision_at_1000
value: 0.116
- type: precision_at_3
value: 34.338
- type: precision_at_5
value: 21.089
- type: recall_at_1
value: 83.734
- type: recall_at_10
value: 96.161
- type: recall_at_100
value: 98.137
- type: recall_at_1000
value: 99.182
- type: recall_at_3
value: 93.551
- type: recall_at_5
value: 94.878
- task:
type: Retrieval
dataset:
type: mteb/fiqa
name: MTEB FiQA2018
config: default
split: test
revision: 27a168819829fe9bcd655c2df245fb19452e8e06
metrics:
- type: map_at_1
value: 24.529999999999998
- type: map_at_10
value: 37.229
- type: map_at_100
value: 39.333
- type: map_at_1000
value: 39.491
- type: map_at_3
value: 32.177
- type: map_at_5
value: 35.077999999999996
- type: mrr_at_1
value: 45.678999999999995
- type: mrr_at_10
value: 53.952
- type: mrr_at_100
value: 54.727000000000004
- type: mrr_at_1000
value: 54.761
- type: mrr_at_3
value: 51.568999999999996
- type: mrr_at_5
value: 52.973000000000006
- type: ndcg_at_1
value: 45.678999999999995
- type: ndcg_at_10
value: 45.297
- type: ndcg_at_100
value: 52.516
- type: ndcg_at_1000
value: 55.16
- type: ndcg_at_3
value: 40.569
- type: ndcg_at_5
value: 42.49
- type: precision_at_1
value: 45.678999999999995
- type: precision_at_10
value: 12.269
- type: precision_at_100
value: 1.9709999999999999
- type: precision_at_1000
value: 0.244
- type: precision_at_3
value: 25.72
- type: precision_at_5
value: 19.66
- type: recall_at_1
value: 24.529999999999998
- type: recall_at_10
value: 51.983999999999995
- type: recall_at_100
value: 78.217
- type: recall_at_1000
value: 94.104
- type: recall_at_3
value: 36.449999999999996
- type: recall_at_5
value: 43.336999999999996
- task:
type: Retrieval
dataset:
type: mteb/hotpotqa
name: MTEB HotpotQA
config: default
split: test
revision: ab518f4d6fcca38d87c25209f94beba119d02014
metrics:
- type: map_at_1
value: 41.519
- type: map_at_10
value: 64.705
- type: map_at_100
value: 65.554
- type: map_at_1000
value: 65.613
- type: map_at_3
value: 61.478
- type: map_at_5
value: 63.55800000000001
- type: mrr_at_1
value: 83.038
- type: mrr_at_10
value: 87.82900000000001
- type: mrr_at_100
value: 87.96000000000001
- type: mrr_at_1000
value: 87.96300000000001
- type: mrr_at_3
value: 87.047
- type: mrr_at_5
value: 87.546
- type: ndcg_at_1
value: 83.038
- type: ndcg_at_10
value: 72.928
- type: ndcg_at_100
value: 75.778
- type: ndcg_at_1000
value: 76.866
- type: ndcg_at_3
value: 68.46600000000001
- type: ndcg_at_5
value: 71.036
- type: precision_at_1
value: 83.038
- type: precision_at_10
value: 15.040999999999999
- type: precision_at_100
value: 1.7260000000000002
- type: precision_at_1000
value: 0.187
- type: precision_at_3
value: 43.597
- type: precision_at_5
value: 28.188999999999997
- type: recall_at_1
value: 41.519
- type: recall_at_10
value: 75.20599999999999
- type: recall_at_100
value: 86.3
- type: recall_at_1000
value: 93.437
- type: recall_at_3
value: 65.39500000000001
- type: recall_at_5
value: 70.473
- task:
type: Classification
dataset:
type: mteb/imdb
name: MTEB ImdbClassification
config: default
split: test
revision: 3d86128a09e091d6018b6d26cad27f2739fc2db7
metrics:
- type: accuracy
value: 96.0428
- type: ap
value: 94.48278082595033
- type: f1
value: 96.0409595432081
- task:
type: Retrieval
dataset:
type: mteb/msmarco
name: MTEB MSMARCO
config: default
split: dev
revision: c5a29a104738b98a9e76336939199e264163d4a0
metrics:
- type: map_at_1
value: 21.496000000000002
- type: map_at_10
value: 33.82
- type: map_at_100
value: 35.013
- type: map_at_1000
value: 35.063
- type: map_at_3
value: 29.910999999999998
- type: map_at_5
value: 32.086
- type: mrr_at_1
value: 22.092
- type: mrr_at_10
value: 34.404
- type: mrr_at_100
value: 35.534
- type: mrr_at_1000
value: 35.577999999999996
- type: mrr_at_3
value: 30.544
- type: mrr_at_5
value: 32.711
- type: ndcg_at_1
value: 22.092
- type: ndcg_at_10
value: 40.877
- type: ndcg_at_100
value: 46.619
- type: ndcg_at_1000
value: 47.823
- type: ndcg_at_3
value: 32.861000000000004
- type: ndcg_at_5
value: 36.769
- type: precision_at_1
value: 22.092
- type: precision_at_10
value: 6.54
- type: precision_at_100
value: 0.943
- type: precision_at_1000
value: 0.105
- type: precision_at_3
value: 14.069
- type: precision_at_5
value: 10.424
- type: recall_at_1
value: 21.496000000000002
- type: recall_at_10
value: 62.67
- type: recall_at_100
value: 89.24499999999999
- type: recall_at_1000
value: 98.312
- type: recall_at_3
value: 40.796
- type: recall_at_5
value: 50.21600000000001
- task:
type: Classification
dataset:
type: mteb/mtop_domain
name: MTEB MTOPDomainClassification (en)
config: en
split: test
revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
metrics:
- type: accuracy
value: 95.74555403556772
- type: f1
value: 95.61381879323093
- task:
type: Classification
dataset:
type: mteb/mtop_intent
name: MTEB MTOPIntentClassification (en)
config: en
split: test
revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
metrics:
- type: accuracy
value: 85.82763337893297
- type: f1
value: 63.17139719465236
- task:
type: Classification
dataset:
type: mteb/amazon_massive_intent
name: MTEB MassiveIntentClassification (en)
config: en
split: test
revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
metrics:
- type: accuracy
value: 78.51714862138535
- type: f1
value: 76.3995118440293
- task:
type: Classification
dataset:
type: mteb/amazon_massive_scenario
name: MTEB MassiveScenarioClassification (en)
config: en
split: test
revision: 7d571f92784cd94a019292a1f45445077d0ef634
metrics:
- type: accuracy
value: 80.03698722259583
- type: f1
value: 79.36511484240766
- task:
type: Clustering
dataset:
type: mteb/medrxiv-clustering-p2p
name: MTEB MedrxivClusteringP2P
config: default
split: test
revision: e7a26af6f3ae46b30dde8737f02c07b1505bcc73
metrics:
- type: v_measure
value: 38.68901889835701
- task:
type: Clustering
dataset:
type: mteb/medrxiv-clustering-s2s
name: MTEB MedrxivClusteringS2S
config: default
split: test
revision: 35191c8c0dca72d8ff3efcd72aa802307d469663
metrics:
- type: v_measure
value: 38.0740589898848
- task:
type: Reranking
dataset:
type: mteb/mind_small
name: MTEB MindSmallReranking
config: default
split: test
revision: 3bdac13927fdc888b903db93b2ffdbd90b295a69
metrics:
- type: map
value: 33.41312482460189
- type: mrr
value: 34.713530863302495
- task:
type: Retrieval
dataset:
type: mteb/nfcorpus
name: MTEB NFCorpus
config: default
split: test
revision: ec0fa4fe99da2ff19ca1214b7966684033a58814
metrics:
- type: map_at_1
value: 6.232
- type: map_at_10
value: 13.442000000000002
- type: map_at_100
value: 17.443
- type: map_at_1000
value: 19.1
- type: map_at_3
value: 9.794
- type: map_at_5
value: 11.375
- type: mrr_at_1
value: 50.15500000000001
- type: mrr_at_10
value: 58.628
- type: mrr_at_100
value: 59.077
- type: mrr_at_1000
value: 59.119
- type: mrr_at_3
value: 56.914
- type: mrr_at_5
value: 57.921
- type: ndcg_at_1
value: 48.762
- type: ndcg_at_10
value: 37.203
- type: ndcg_at_100
value: 34.556
- type: ndcg_at_1000
value: 43.601
- type: ndcg_at_3
value: 43.004
- type: ndcg_at_5
value: 40.181
- type: precision_at_1
value: 50.15500000000001
- type: precision_at_10
value: 27.276
- type: precision_at_100
value: 8.981
- type: precision_at_1000
value: 2.228
- type: precision_at_3
value: 39.628
- type: precision_at_5
value: 33.808
- type: recall_at_1
value: 6.232
- type: recall_at_10
value: 18.137
- type: recall_at_100
value: 36.101
- type: recall_at_1000
value: 68.733
- type: recall_at_3
value: 10.978
- type: recall_at_5
value: 13.718
- task:
type: Retrieval
dataset:
type: mteb/nq
name: MTEB NQ
config: default
split: test
revision: b774495ed302d8c44a3a7ea25c90dbce03968f31
metrics:
- type: map_at_1
value: 35.545
- type: map_at_10
value: 52.083
- type: map_at_100
value: 52.954
- type: map_at_1000
value: 52.96999999999999
- type: map_at_3
value: 47.508
- type: map_at_5
value: 50.265
- type: mrr_at_1
value: 40.122
- type: mrr_at_10
value: 54.567
- type: mrr_at_100
value: 55.19199999999999
- type: mrr_at_1000
value: 55.204
- type: mrr_at_3
value: 51.043000000000006
- type: mrr_at_5
value: 53.233
- type: ndcg_at_1
value: 40.122
- type: ndcg_at_10
value: 60.012
- type: ndcg_at_100
value: 63.562
- type: ndcg_at_1000
value: 63.94
- type: ndcg_at_3
value: 51.681
- type: ndcg_at_5
value: 56.154
- type: precision_at_1
value: 40.122
- type: precision_at_10
value: 9.774
- type: precision_at_100
value: 1.176
- type: precision_at_1000
value: 0.121
- type: precision_at_3
value: 23.426
- type: precision_at_5
value: 16.686
- type: recall_at_1
value: 35.545
- type: recall_at_10
value: 81.557
- type: recall_at_100
value: 96.729
- type: recall_at_1000
value: 99.541
- type: recall_at_3
value: 60.185
- type: recall_at_5
value: 70.411
- task:
type: Retrieval
dataset:
type: mteb/quora
name: MTEB QuoraRetrieval
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 68.908
- type: map_at_10
value: 83.19
- type: map_at_100
value: 83.842
- type: map_at_1000
value: 83.858
- type: map_at_3
value: 80.167
- type: map_at_5
value: 82.053
- type: mrr_at_1
value: 79.46
- type: mrr_at_10
value: 86.256
- type: mrr_at_100
value: 86.37
- type: mrr_at_1000
value: 86.371
- type: mrr_at_3
value: 85.177
- type: mrr_at_5
value: 85.908
- type: ndcg_at_1
value: 79.5
- type: ndcg_at_10
value: 87.244
- type: ndcg_at_100
value: 88.532
- type: ndcg_at_1000
value: 88.626
- type: ndcg_at_3
value: 84.161
- type: ndcg_at_5
value: 85.835
- type: precision_at_1
value: 79.5
- type: precision_at_10
value: 13.339
- type: precision_at_100
value: 1.53
- type: precision_at_1000
value: 0.157
- type: precision_at_3
value: 36.97
- type: precision_at_5
value: 24.384
- type: recall_at_1
value: 68.908
- type: recall_at_10
value: 95.179
- type: recall_at_100
value: 99.579
- type: recall_at_1000
value: 99.964
- type: recall_at_3
value: 86.424
- type: recall_at_5
value: 91.065
- task:
type: Clustering
dataset:
type: mteb/reddit-clustering
name: MTEB RedditClustering
config: default
split: test
revision: 24640382cdbf8abc73003fb0fa6d111a705499eb
metrics:
- type: v_measure
value: 65.17897847862794
- task:
type: Clustering
dataset:
type: mteb/reddit-clustering-p2p
name: MTEB RedditClusteringP2P
config: default
split: test
revision: 282350215ef01743dc01b456c7f5241fa8937f16
metrics:
- type: v_measure
value: 66.22194961632586
- task:
type: Retrieval
dataset:
type: mteb/scidocs
name: MTEB SCIDOCS
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 5.668
- type: map_at_10
value: 13.921
- type: map_at_100
value: 16.391
- type: map_at_1000
value: 16.749
- type: map_at_3
value: 10.001999999999999
- type: map_at_5
value: 11.974
- type: mrr_at_1
value: 27.800000000000004
- type: mrr_at_10
value: 39.290000000000006
- type: mrr_at_100
value: 40.313
- type: mrr_at_1000
value: 40.355999999999995
- type: mrr_at_3
value: 35.667
- type: mrr_at_5
value: 37.742
- type: ndcg_at_1
value: 27.800000000000004
- type: ndcg_at_10
value: 23.172
- type: ndcg_at_100
value: 32.307
- type: ndcg_at_1000
value: 38.048
- type: ndcg_at_3
value: 22.043
- type: ndcg_at_5
value: 19.287000000000003
- type: precision_at_1
value: 27.800000000000004
- type: precision_at_10
value: 11.95
- type: precision_at_100
value: 2.5260000000000002
- type: precision_at_1000
value: 0.38999999999999996
- type: precision_at_3
value: 20.433
- type: precision_at_5
value: 16.84
- type: recall_at_1
value: 5.668
- type: recall_at_10
value: 24.22
- type: recall_at_100
value: 51.217
- type: recall_at_1000
value: 79.10000000000001
- type: recall_at_3
value: 12.443
- type: recall_at_5
value: 17.068
- task:
type: STS
dataset:
type: mteb/sickr-sts
name: MTEB SICK-R
config: default
split: test
revision: a6ea5a8cab320b040a23452cc28066d9beae2cee
metrics:
- type: cos_sim_pearson
value: 82.83535239748218
- type: cos_sim_spearman
value: 73.98553311584509
- type: euclidean_pearson
value: 79.57336200069007
- type: euclidean_spearman
value: 73.98553926018461
- type: manhattan_pearson
value: 79.02277757114132
- type: manhattan_spearman
value: 73.52350678760683
- task:
type: STS
dataset:
type: mteb/sts12-sts
name: MTEB STS12
config: default
split: test
revision: a0d554a64d88156834ff5ae9920b964011b16384
metrics:
- type: cos_sim_pearson
value: 81.99055838690317
- type: cos_sim_spearman
value: 72.05290668592296
- type: euclidean_pearson
value: 81.7130610313565
- type: euclidean_spearman
value: 72.0529066787229
- type: manhattan_pearson
value: 82.09213883730894
- type: manhattan_spearman
value: 72.5171577483134
- task:
type: STS
dataset:
type: mteb/sts13-sts
name: MTEB STS13
config: default
split: test
revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
metrics:
- type: cos_sim_pearson
value: 84.4685161191763
- type: cos_sim_spearman
value: 84.4847436140129
- type: euclidean_pearson
value: 84.05016757016948
- type: euclidean_spearman
value: 84.48474353891532
- type: manhattan_pearson
value: 83.83064062713048
- type: manhattan_spearman
value: 84.30431591842805
- task:
type: STS
dataset:
type: mteb/sts14-sts
name: MTEB STS14
config: default
split: test
revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
metrics:
- type: cos_sim_pearson
value: 83.00171021092486
- type: cos_sim_spearman
value: 77.91329577609622
- type: euclidean_pearson
value: 81.49758593915315
- type: euclidean_spearman
value: 77.91329577609622
- type: manhattan_pearson
value: 81.23255996803785
- type: manhattan_spearman
value: 77.80027024941825
- task:
type: STS
dataset:
type: mteb/sts15-sts
name: MTEB STS15
config: default
split: test
revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
metrics:
- type: cos_sim_pearson
value: 86.62608607472492
- type: cos_sim_spearman
value: 87.62293916855751
- type: euclidean_pearson
value: 87.04313886714989
- type: euclidean_spearman
value: 87.62293907119869
- type: manhattan_pearson
value: 86.97266321040769
- type: manhattan_spearman
value: 87.61807042381702
- task:
type: STS
dataset:
type: mteb/sts16-sts
name: MTEB STS16
config: default
split: test
revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
metrics:
- type: cos_sim_pearson
value: 80.8012095789289
- type: cos_sim_spearman
value: 81.91868918081325
- type: euclidean_pearson
value: 81.2267973811213
- type: euclidean_spearman
value: 81.91868918081325
- type: manhattan_pearson
value: 81.0173457901168
- type: manhattan_spearman
value: 81.79743115887055
- task:
type: STS
dataset:
type: mteb/sts17-crosslingual-sts
name: MTEB STS17 (en-en)
config: en-en
split: test
revision: af5e6fb845001ecf41f4c1e033ce921939a2a68d
metrics:
- type: cos_sim_pearson
value: 88.39698537303725
- type: cos_sim_spearman
value: 88.78668529808967
- type: euclidean_pearson
value: 88.78863351718252
- type: euclidean_spearman
value: 88.78668529808967
- type: manhattan_pearson
value: 88.41678215762478
- type: manhattan_spearman
value: 88.3827998418763
- task:
type: STS
dataset:
type: mteb/sts22-crosslingual-sts
name: MTEB STS22 (en)
config: en
split: test
revision: eea2b4fe26a775864c896887d910b76a8098ad3f
metrics:
- type: cos_sim_pearson
value: 68.49024974161408
- type: cos_sim_spearman
value: 69.19917146180619
- type: euclidean_pearson
value: 70.48882819806336
- type: euclidean_spearman
value: 69.19917146180619
- type: manhattan_pearson
value: 70.86827961779932
- type: manhattan_spearman
value: 69.38456983992613
- task:
type: STS
dataset:
type: mteb/stsbenchmark-sts
name: MTEB STSBenchmark
config: default
split: test
revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
metrics:
- type: cos_sim_pearson
value: 84.31376078795105
- type: cos_sim_spearman
value: 83.3985199217591
- type: euclidean_pearson
value: 84.06630133719332
- type: euclidean_spearman
value: 83.3985199217591
- type: manhattan_pearson
value: 83.7896654474364
- type: manhattan_spearman
value: 83.1885039212299
- task:
type: Reranking
dataset:
type: mteb/scidocs-reranking
name: MTEB SciDocsRR
config: default
split: test
revision: d3c5e1fc0b855ab6097bf1cda04dd73947d7caab
metrics:
- type: map
value: 85.83161002188668
- type: mrr
value: 96.19253114351153
- task:
type: Retrieval
dataset:
type: mteb/scifact
name: MTEB SciFact
config: default
split: test
revision: 0228b52cf27578f30900b9e5271d331663a030d7
metrics:
- type: map_at_1
value: 48.132999999999996
- type: map_at_10
value: 58.541
- type: map_at_100
value: 59.34
- type: map_at_1000
value: 59.367999999999995
- type: map_at_3
value: 55.191
- type: map_at_5
value: 57.084
- type: mrr_at_1
value: 51.0
- type: mrr_at_10
value: 59.858
- type: mrr_at_100
value: 60.474000000000004
- type: mrr_at_1000
value: 60.501000000000005
- type: mrr_at_3
value: 57.111000000000004
- type: mrr_at_5
value: 58.694
- type: ndcg_at_1
value: 51.0
- type: ndcg_at_10
value: 63.817
- type: ndcg_at_100
value: 67.229
- type: ndcg_at_1000
value: 67.94
- type: ndcg_at_3
value: 57.896
- type: ndcg_at_5
value: 60.785999999999994
- type: precision_at_1
value: 51.0
- type: precision_at_10
value: 8.933
- type: precision_at_100
value: 1.0699999999999998
- type: precision_at_1000
value: 0.11299999999999999
- type: precision_at_3
value: 23.111
- type: precision_at_5
value: 15.733
- type: recall_at_1
value: 48.132999999999996
- type: recall_at_10
value: 78.922
- type: recall_at_100
value: 94.167
- type: recall_at_1000
value: 99.667
- type: recall_at_3
value: 62.806
- type: recall_at_5
value: 70.078
- task:
type: PairClassification
dataset:
type: mteb/sprintduplicatequestions-pairclassification
name: MTEB SprintDuplicateQuestions
config: default
split: test
revision: d66bd1f72af766a5cc4b0ca5e00c162f89e8cc46
metrics:
- type: cos_sim_accuracy
value: 99.88415841584158
- type: cos_sim_ap
value: 97.72557886493401
- type: cos_sim_f1
value: 94.1294530858003
- type: cos_sim_precision
value: 94.46122860020141
- type: cos_sim_recall
value: 93.8
- type: dot_accuracy
value: 99.88415841584158
- type: dot_ap
value: 97.72557439066108
- type: dot_f1
value: 94.1294530858003
- type: dot_precision
value: 94.46122860020141
- type: dot_recall
value: 93.8
- type: euclidean_accuracy
value: 99.88415841584158
- type: euclidean_ap
value: 97.72557439066108
- type: euclidean_f1
value: 94.1294530858003
- type: euclidean_precision
value: 94.46122860020141
- type: euclidean_recall
value: 93.8
- type: manhattan_accuracy
value: 99.88514851485148
- type: manhattan_ap
value: 97.73324334051959
- type: manhattan_f1
value: 94.1825476429288
- type: manhattan_precision
value: 94.46680080482898
- type: manhattan_recall
value: 93.89999999999999
- type: max_accuracy
value: 99.88514851485148
- type: max_ap
value: 97.73324334051959
- type: max_f1
value: 94.1825476429288
- task:
type: Clustering
dataset:
type: mteb/stackexchange-clustering
name: MTEB StackExchangeClustering
config: default
split: test
revision: 6cbc1f7b2bc0622f2e39d2c77fa502909748c259
metrics:
- type: v_measure
value: 72.8168026381278
- task:
type: Clustering
dataset:
type: mteb/stackexchange-clustering-p2p
name: MTEB StackExchangeClusteringP2P
config: default
split: test
revision: 815ca46b2622cec33ccafc3735d572c266efdb44
metrics:
- type: v_measure
value: 44.30948635130784
- task:
type: Reranking
dataset:
type: mteb/stackoverflowdupquestions-reranking
name: MTEB StackOverflowDupQuestions
config: default
split: test
revision: e185fbe320c72810689fc5848eb6114e1ef5ec69
metrics:
- type: map
value: 54.11268548719803
- type: mrr
value: 55.08079747050335
- task:
type: Summarization
dataset:
type: mteb/summeval
name: MTEB SummEval
config: default
split: test
revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
metrics:
- type: cos_sim_pearson
value: 30.82885852096243
- type: cos_sim_spearman
value: 30.800770979226076
- type: dot_pearson
value: 30.82885608827704
- type: dot_spearman
value: 30.800770979226076
- task:
type: Retrieval
dataset:
type: mteb/trec-covid
name: MTEB TRECCOVID
config: default
split: test
revision: None
metrics:
- type: map_at_1
value: 0.20400000000000001
- type: map_at_10
value: 1.27
- type: map_at_100
value: 7.993
- type: map_at_1000
value: 20.934
- type: map_at_3
value: 0.469
- type: map_at_5
value: 0.716
- type: mrr_at_1
value: 76.0
- type: mrr_at_10
value: 84.967
- type: mrr_at_100
value: 84.967
- type: mrr_at_1000
value: 84.967
- type: mrr_at_3
value: 83.667
- type: mrr_at_5
value: 84.967
- type: ndcg_at_1
value: 69.0
- type: ndcg_at_10
value: 59.243
- type: ndcg_at_100
value: 48.784
- type: ndcg_at_1000
value: 46.966
- type: ndcg_at_3
value: 64.14
- type: ndcg_at_5
value: 61.60600000000001
- type: precision_at_1
value: 76.0
- type: precision_at_10
value: 62.6
- type: precision_at_100
value: 50.18
- type: precision_at_1000
value: 21.026
- type: precision_at_3
value: 68.667
- type: precision_at_5
value: 66.0
- type: recall_at_1
value: 0.20400000000000001
- type: recall_at_10
value: 1.582
- type: recall_at_100
value: 11.988
- type: recall_at_1000
value: 44.994
- type: recall_at_3
value: 0.515
- type: recall_at_5
value: 0.844
- task:
type: Retrieval
dataset:
type: mteb/touche2020
name: MTEB Touche2020
config: default
split: test
revision: a34f9a33db75fa0cbb21bb5cfc3dae8dc8bec93f
metrics:
- type: map_at_1
value: 3.3009999999999997
- type: map_at_10
value: 11.566
- type: map_at_100
value: 17.645
- type: map_at_1000
value: 19.206
- type: map_at_3
value: 6.986000000000001
- type: map_at_5
value: 8.716
- type: mrr_at_1
value: 42.857
- type: mrr_at_10
value: 58.287
- type: mrr_at_100
value: 59.111000000000004
- type: mrr_at_1000
value: 59.111000000000004
- type: mrr_at_3
value: 55.102
- type: mrr_at_5
value: 57.449
- type: ndcg_at_1
value: 39.796
- type: ndcg_at_10
value: 29.059
- type: ndcg_at_100
value: 40.629
- type: ndcg_at_1000
value: 51.446000000000005
- type: ndcg_at_3
value: 36.254999999999995
- type: ndcg_at_5
value: 32.216
- type: precision_at_1
value: 42.857
- type: precision_at_10
value: 23.469
- type: precision_at_100
value: 8.041
- type: precision_at_1000
value: 1.551
- type: precision_at_3
value: 36.735
- type: precision_at_5
value: 30.203999999999997
- type: recall_at_1
value: 3.3009999999999997
- type: recall_at_10
value: 17.267
- type: recall_at_100
value: 49.36
- type: recall_at_1000
value: 83.673
- type: recall_at_3
value: 8.049000000000001
- type: recall_at_5
value: 11.379999999999999
- task:
type: Classification
dataset:
type: mteb/toxic_conversations_50k
name: MTEB ToxicConversationsClassification
config: default
split: test
revision: d7c0de2777da35d6aae2200a62c6e0e5af397c4c
metrics:
- type: accuracy
value: 88.7576
- type: ap
value: 35.52110634325751
- type: f1
value: 74.14476947482417
- task:
type: Classification
dataset:
type: mteb/tweet_sentiment_extraction
name: MTEB TweetSentimentExtractionClassification
config: default
split: test
revision: d604517c81ca91fe16a244d1248fc021f9ecee7a
metrics:
- type: accuracy
value: 73.52009054895304
- type: f1
value: 73.81407409876577
- task:
type: Clustering
dataset:
type: mteb/twentynewsgroups-clustering
name: MTEB TwentyNewsgroupsClustering
config: default
split: test
revision: 6125ec4e24fa026cec8a478383ee943acfbd5449
metrics:
- type: v_measure
value: 54.35358706465052
- task:
type: PairClassification
dataset:
type: mteb/twittersemeval2015-pairclassification
name: MTEB TwitterSemEval2015
config: default
split: test
revision: 70970daeab8776df92f5ea462b6173c0b46fd2d1
metrics:
- type: cos_sim_accuracy
value: 83.65619598259522
- type: cos_sim_ap
value: 65.824087818991
- type: cos_sim_f1
value: 61.952620244077536
- type: cos_sim_precision
value: 56.676882661996494
- type: cos_sim_recall
value: 68.311345646438
- type: dot_accuracy
value: 83.65619598259522
- type: dot_ap
value: 65.82406256999921
- type: dot_f1
value: 61.952620244077536
- type: dot_precision
value: 56.676882661996494
- type: dot_recall
value: 68.311345646438
- type: euclidean_accuracy
value: 83.65619598259522
- type: euclidean_ap
value: 65.82409143427542
- type: euclidean_f1
value: 61.952620244077536
- type: euclidean_precision
value: 56.676882661996494
- type: euclidean_recall
value: 68.311345646438
- type: manhattan_accuracy
value: 83.4296954163438
- type: manhattan_ap
value: 65.20662449614932
- type: manhattan_f1
value: 61.352885525070946
- type: manhattan_precision
value: 55.59365623660523
- type: manhattan_recall
value: 68.44327176781002
- type: max_accuracy
value: 83.65619598259522
- type: max_ap
value: 65.82409143427542
- type: max_f1
value: 61.952620244077536
- task:
type: PairClassification
dataset:
type: mteb/twitterurlcorpus-pairclassification
name: MTEB TwitterURLCorpus
config: default
split: test
revision: 8b6510b0b1fa4e4c4f879467980e9be563ec1cdf
metrics:
- type: cos_sim_accuracy
value: 87.90119144642372
- type: cos_sim_ap
value: 84.04753852793387
- type: cos_sim_f1
value: 76.27737226277372
- type: cos_sim_precision
value: 73.86757068667052
- type: cos_sim_recall
value: 78.84970742223591
- type: dot_accuracy
value: 87.90119144642372
- type: dot_ap
value: 84.04753668117337
- type: dot_f1
value: 76.27737226277372
- type: dot_precision
value: 73.86757068667052
- type: dot_recall
value: 78.84970742223591
- type: euclidean_accuracy
value: 87.90119144642372
- type: euclidean_ap
value: 84.04754553468206
- type: euclidean_f1
value: 76.27737226277372
- type: euclidean_precision
value: 73.86757068667052
- type: euclidean_recall
value: 78.84970742223591
- type: manhattan_accuracy
value: 87.87014398261343
- type: manhattan_ap
value: 84.05164646221583
- type: manhattan_f1
value: 76.31392706820128
- type: manhattan_precision
value: 73.91586694566708
- type: manhattan_recall
value: 78.87280566676932
- type: max_accuracy
value: 87.90119144642372
- type: max_ap
value: 84.05164646221583
- type: max_f1
value: 76.31392706820128
- task:
type: STS
dataset:
type: C-MTEB/AFQMC
name: MTEB AFQMC
config: default
split: validation
revision: b44c3b011063adb25877c13823db83bb193913c4
metrics:
- type: cos_sim_pearson
value: 52.3123511272669
- type: cos_sim_spearman
value: 55.73207493107254
- type: euclidean_pearson
value: 53.95847274621819
- type: euclidean_spearman
value: 55.73207493107254
- type: manhattan_pearson
value: 53.720688490931124
- type: manhattan_spearman
value: 55.453911938689
- task:
type: STS
dataset:
type: C-MTEB/ATEC
name: MTEB ATEC
config: default
split: test
revision: 0f319b1142f28d00e055a6770f3f726ae9b7d865
metrics:
- type: cos_sim_pearson
value: 50.787428883419864
- type: cos_sim_spearman
value: 53.97343607668934
- type: euclidean_pearson
value: 55.12379889727461
- type: euclidean_spearman
value: 53.97343945403084
- type: manhattan_pearson
value: 54.95369694130932
- type: manhattan_spearman
value: 53.74165246349166
- task:
type: Classification
dataset:
type: mteb/amazon_reviews_multi
name: MTEB AmazonReviewsClassification (zh)
config: zh
split: test
revision: 1399c76144fd37290681b995c656ef9b2e06e26d
metrics:
- type: accuracy
value: 53.49
- type: f1
value: 51.576550662258434
- task:
type: STS
dataset:
type: C-MTEB/BQ
name: MTEB BQ
config: default
split: test
revision: e3dda5e115e487b39ec7e618c0c6a29137052a55
metrics:
- type: cos_sim_pearson
value: 63.78770644319529
- type: cos_sim_spearman
value: 65.08813140587463
- type: euclidean_pearson
value: 63.92948559310832
- type: euclidean_spearman
value: 65.08813486997627
- type: manhattan_pearson
value: 63.55967028084246
- type: manhattan_spearman
value: 64.69692694499825
- task:
type: Clustering
dataset:
type: C-MTEB/CLSClusteringP2P
name: MTEB CLSClusteringP2P
config: default
split: test
revision: 4b6227591c6c1a73bc76b1055f3b7f3588e72476
metrics:
- type: v_measure
value: 44.23533333311907
- task:
type: Clustering
dataset:
type: C-MTEB/CLSClusteringS2S
name: MTEB CLSClusteringS2S
config: default
split: test
revision: e458b3f5414b62b7f9f83499ac1f5497ae2e869f
metrics:
- type: v_measure
value: 43.01114481307774
- task:
type: Reranking
dataset:
type: C-MTEB/CMedQAv1-reranking
name: MTEB CMedQAv1
config: default
split: test
revision: 8d7f1e942507dac42dc58017c1a001c3717da7df
metrics:
- type: map
value: 86.4349853821696
- type: mrr
value: 88.80150793650795
- task:
type: Reranking
dataset:
type: C-MTEB/CMedQAv2-reranking
name: MTEB CMedQAv2
config: default
split: test
revision: 23d186750531a14a0357ca22cd92d712fd512ea0
metrics:
- type: map
value: 87.56417400982208
- type: mrr
value: 89.85813492063491
- task:
type: Retrieval
dataset:
type: C-MTEB/CmedqaRetrieval
name: MTEB CmedqaRetrieval
config: default
split: dev
revision: cd540c506dae1cf9e9a59c3e06f42030d54e7301
metrics:
- type: map_at_1
value: 24.871
- type: map_at_10
value: 37.208999999999996
- type: map_at_100
value: 38.993
- type: map_at_1000
value: 39.122
- type: map_at_3
value: 33.2
- type: map_at_5
value: 35.33
- type: mrr_at_1
value: 37.884
- type: mrr_at_10
value: 46.189
- type: mrr_at_100
value: 47.147
- type: mrr_at_1000
value: 47.195
- type: mrr_at_3
value: 43.728
- type: mrr_at_5
value: 44.994
- type: ndcg_at_1
value: 37.884
- type: ndcg_at_10
value: 43.878
- type: ndcg_at_100
value: 51.002
- type: ndcg_at_1000
value: 53.161
- type: ndcg_at_3
value: 38.729
- type: ndcg_at_5
value: 40.628
- type: precision_at_1
value: 37.884
- type: precision_at_10
value: 9.75
- type: precision_at_100
value: 1.558
- type: precision_at_1000
value: 0.183
- type: precision_at_3
value: 21.964
- type: precision_at_5
value: 15.719
- type: recall_at_1
value: 24.871
- type: recall_at_10
value: 54.615
- type: recall_at_100
value: 84.276
- type: recall_at_1000
value: 98.578
- type: recall_at_3
value: 38.936
- type: recall_at_5
value: 45.061
- task:
type: PairClassification
dataset:
type: C-MTEB/CMNLI
name: MTEB Cmnli
config: default
split: validation
revision: 41bc36f332156f7adc9e38f53777c959b2ae9766
metrics:
- type: cos_sim_accuracy
value: 76.12748045700542
- type: cos_sim_ap
value: 84.47948419710998
- type: cos_sim_f1
value: 77.88108108108108
- type: cos_sim_precision
value: 72.43112809169516
- type: cos_sim_recall
value: 84.21790974982464
- type: dot_accuracy
value: 76.12748045700542
- type: dot_ap
value: 84.4933237839786
- type: dot_f1
value: 77.88108108108108
- type: dot_precision
value: 72.43112809169516
- type: dot_recall
value: 84.21790974982464
- type: euclidean_accuracy
value: 76.12748045700542
- type: euclidean_ap
value: 84.47947997540409
- type: euclidean_f1
value: 77.88108108108108
- type: euclidean_precision
value: 72.43112809169516
- type: euclidean_recall
value: 84.21790974982464
- type: manhattan_accuracy
value: 75.40589296452195
- type: manhattan_ap
value: 83.74383956930585
- type: manhattan_f1
value: 77.0983342289092
- type: manhattan_precision
value: 71.34049323786795
- type: manhattan_recall
value: 83.86719663315408
- type: max_accuracy
value: 76.12748045700542
- type: max_ap
value: 84.4933237839786
- type: max_f1
value: 77.88108108108108
- task:
type: Retrieval
dataset:
type: C-MTEB/CovidRetrieval
name: MTEB CovidRetrieval
config: default
split: dev
revision: 1271c7809071a13532e05f25fb53511ffce77117
metrics:
- type: map_at_1
value: 66.781
- type: map_at_10
value: 74.539
- type: map_at_100
value: 74.914
- type: map_at_1000
value: 74.921
- type: map_at_3
value: 72.734
- type: map_at_5
value: 73.788
- type: mrr_at_1
value: 66.913
- type: mrr_at_10
value: 74.543
- type: mrr_at_100
value: 74.914
- type: mrr_at_1000
value: 74.921
- type: mrr_at_3
value: 72.831
- type: mrr_at_5
value: 73.76899999999999
- type: ndcg_at_1
value: 67.018
- type: ndcg_at_10
value: 78.34299999999999
- type: ndcg_at_100
value: 80.138
- type: ndcg_at_1000
value: 80.322
- type: ndcg_at_3
value: 74.667
- type: ndcg_at_5
value: 76.518
- type: precision_at_1
value: 67.018
- type: precision_at_10
value: 9.115
- type: precision_at_100
value: 0.996
- type: precision_at_1000
value: 0.101
- type: precision_at_3
value: 26.906000000000002
- type: precision_at_5
value: 17.092
- type: recall_at_1
value: 66.781
- type: recall_at_10
value: 90.253
- type: recall_at_100
value: 98.52499999999999
- type: recall_at_1000
value: 100.0
- type: recall_at_3
value: 80.05799999999999
- type: recall_at_5
value: 84.615
- task:
type: Retrieval
dataset:
type: C-MTEB/DuRetrieval
name: MTEB DuRetrieval
config: default
split: dev
revision: a1a333e290fe30b10f3f56498e3a0d911a693ced
metrics:
- type: map_at_1
value: 24.528
- type: map_at_10
value: 76.304
- type: map_at_100
value: 79.327
- type: map_at_1000
value: 79.373
- type: map_at_3
value: 52.035
- type: map_at_5
value: 66.074
- type: mrr_at_1
value: 86.05000000000001
- type: mrr_at_10
value: 90.74
- type: mrr_at_100
value: 90.809
- type: mrr_at_1000
value: 90.81099999999999
- type: mrr_at_3
value: 90.30799999999999
- type: mrr_at_5
value: 90.601
- type: ndcg_at_1
value: 86.05000000000001
- type: ndcg_at_10
value: 84.518
- type: ndcg_at_100
value: 87.779
- type: ndcg_at_1000
value: 88.184
- type: ndcg_at_3
value: 82.339
- type: ndcg_at_5
value: 81.613
- type: precision_at_1
value: 86.05000000000001
- type: precision_at_10
value: 40.945
- type: precision_at_100
value: 4.787
- type: precision_at_1000
value: 0.48900000000000005
- type: precision_at_3
value: 74.117
- type: precision_at_5
value: 62.86000000000001
- type: recall_at_1
value: 24.528
- type: recall_at_10
value: 86.78
- type: recall_at_100
value: 97.198
- type: recall_at_1000
value: 99.227
- type: recall_at_3
value: 54.94799999999999
- type: recall_at_5
value: 72.053
- task:
type: Retrieval
dataset:
type: C-MTEB/EcomRetrieval
name: MTEB EcomRetrieval
config: default
split: dev
revision: 687de13dc7294d6fd9be10c6945f9e8fec8166b9
metrics:
- type: map_at_1
value: 52.1
- type: map_at_10
value: 62.502
- type: map_at_100
value: 63.026
- type: map_at_1000
value: 63.04
- type: map_at_3
value: 59.782999999999994
- type: map_at_5
value: 61.443000000000005
- type: mrr_at_1
value: 52.1
- type: mrr_at_10
value: 62.502
- type: mrr_at_100
value: 63.026
- type: mrr_at_1000
value: 63.04
- type: mrr_at_3
value: 59.782999999999994
- type: mrr_at_5
value: 61.443000000000005
- type: ndcg_at_1
value: 52.1
- type: ndcg_at_10
value: 67.75999999999999
- type: ndcg_at_100
value: 70.072
- type: ndcg_at_1000
value: 70.441
- type: ndcg_at_3
value: 62.28
- type: ndcg_at_5
value: 65.25800000000001
- type: precision_at_1
value: 52.1
- type: precision_at_10
value: 8.43
- type: precision_at_100
value: 0.946
- type: precision_at_1000
value: 0.098
- type: precision_at_3
value: 23.166999999999998
- type: precision_at_5
value: 15.340000000000002
- type: recall_at_1
value: 52.1
- type: recall_at_10
value: 84.3
- type: recall_at_100
value: 94.6
- type: recall_at_1000
value: 97.5
- type: recall_at_3
value: 69.5
- type: recall_at_5
value: 76.7
- task:
type: Classification
dataset:
type: C-MTEB/IFlyTek-classification
name: MTEB IFlyTek
config: default
split: validation
revision: 421605374b29664c5fc098418fe20ada9bd55f8a
metrics:
- type: accuracy
value: 52.04309349749903
- type: f1
value: 39.91893257315586
- task:
type: Classification
dataset:
type: C-MTEB/JDReview-classification
name: MTEB JDReview
config: default
split: test
revision: b7c64bd89eb87f8ded463478346f76731f07bf8b
metrics:
- type: accuracy
value: 85.60975609756099
- type: ap
value: 54.30148799475452
- type: f1
value: 80.55899583002706
- task:
type: STS
dataset:
type: C-MTEB/LCQMC
name: MTEB LCQMC
config: default
split: test
revision: 17f9b096f80380fce5ed12a9be8be7784b337daf
metrics:
- type: cos_sim_pearson
value: 66.80471387011771
- type: cos_sim_spearman
value: 72.69179486905233
- type: euclidean_pearson
value: 71.32341962627513
- type: euclidean_spearman
value: 72.69179043377405
- type: manhattan_pearson
value: 71.06180379791572
- type: manhattan_spearman
value: 72.400125270369
- task:
type: Reranking
dataset:
type: C-MTEB/Mmarco-reranking
name: MTEB MMarcoReranking
config: default
split: dev
revision: 8e0c766dbe9e16e1d221116a3f36795fbade07f6
metrics:
- type: map
value: 27.9616280919871
- type: mrr
value: 26.544047619047618
- task:
type: Retrieval
dataset:
type: C-MTEB/MMarcoRetrieval
name: MTEB MMarcoRetrieval
config: default
split: dev
revision: 539bbde593d947e2a124ba72651aafc09eb33fc2
metrics:
- type: map_at_1
value: 68.32300000000001
- type: map_at_10
value: 77.187
- type: map_at_100
value: 77.496
- type: map_at_1000
value: 77.503
- type: map_at_3
value: 75.405
- type: map_at_5
value: 76.539
- type: mrr_at_1
value: 70.616
- type: mrr_at_10
value: 77.703
- type: mrr_at_100
value: 77.97699999999999
- type: mrr_at_1000
value: 77.984
- type: mrr_at_3
value: 76.139
- type: mrr_at_5
value: 77.125
- type: ndcg_at_1
value: 70.616
- type: ndcg_at_10
value: 80.741
- type: ndcg_at_100
value: 82.123
- type: ndcg_at_1000
value: 82.32300000000001
- type: ndcg_at_3
value: 77.35600000000001
- type: ndcg_at_5
value: 79.274
- type: precision_at_1
value: 70.616
- type: precision_at_10
value: 9.696
- type: precision_at_100
value: 1.038
- type: precision_at_1000
value: 0.106
- type: precision_at_3
value: 29.026000000000003
- type: precision_at_5
value: 18.433
- type: recall_at_1
value: 68.32300000000001
- type: recall_at_10
value: 91.186
- type: recall_at_100
value: 97.439
- type: recall_at_1000
value: 99.004
- type: recall_at_3
value: 82.218
- type: recall_at_5
value: 86.797
- task:
type: Classification
dataset:
type: mteb/amazon_massive_intent
name: MTEB MassiveIntentClassification (zh-CN)
config: zh-CN
split: test
revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
metrics:
- type: accuracy
value: 74.78143913920646
- type: f1
value: 72.6141122227626
- task:
type: Classification
dataset:
type: mteb/amazon_massive_scenario
name: MTEB MassiveScenarioClassification (zh-CN)
config: zh-CN
split: test
revision: 7d571f92784cd94a019292a1f45445077d0ef634
metrics:
- type: accuracy
value: 76.98722259583053
- type: f1
value: 76.5974920207624
- task:
type: Retrieval
dataset:
type: C-MTEB/MedicalRetrieval
name: MTEB MedicalRetrieval
config: default
split: dev
revision: 2039188fb5800a9803ba5048df7b76e6fb151fc6
metrics:
- type: map_at_1
value: 51.800000000000004
- type: map_at_10
value: 57.938
- type: map_at_100
value: 58.494
- type: map_at_1000
value: 58.541
- type: map_at_3
value: 56.617
- type: map_at_5
value: 57.302
- type: mrr_at_1
value: 51.800000000000004
- type: mrr_at_10
value: 57.938
- type: mrr_at_100
value: 58.494
- type: mrr_at_1000
value: 58.541
- type: mrr_at_3
value: 56.617
- type: mrr_at_5
value: 57.302
- type: ndcg_at_1
value: 51.800000000000004
- type: ndcg_at_10
value: 60.891
- type: ndcg_at_100
value: 63.897000000000006
- type: ndcg_at_1000
value: 65.231
- type: ndcg_at_3
value: 58.108000000000004
- type: ndcg_at_5
value: 59.343
- type: precision_at_1
value: 51.800000000000004
- type: precision_at_10
value: 7.02
- type: precision_at_100
value: 0.8500000000000001
- type: precision_at_1000
value: 0.096
- type: precision_at_3
value: 20.8
- type: precision_at_5
value: 13.08
- type: recall_at_1
value: 51.800000000000004
- type: recall_at_10
value: 70.19999999999999
- type: recall_at_100
value: 85.0
- type: recall_at_1000
value: 95.7
- type: recall_at_3
value: 62.4
- type: recall_at_5
value: 65.4
- task:
type: Classification
dataset:
type: C-MTEB/MultilingualSentiment-classification
name: MTEB MultilingualSentiment
config: default
split: validation
revision: 46958b007a63fdbf239b7672c25d0bea67b5ea1a
metrics:
- type: accuracy
value: 80.39333333333335
- type: f1
value: 80.42683132366277
- task:
type: PairClassification
dataset:
type: C-MTEB/OCNLI
name: MTEB Ocnli
config: default
split: validation
revision: 66e76a618a34d6d565d5538088562851e6daa7ec
metrics:
- type: cos_sim_accuracy
value: 70.7634001082837
- type: cos_sim_ap
value: 74.97527385556558
- type: cos_sim_f1
value: 72.77277277277277
- type: cos_sim_precision
value: 69.17221693625119
- type: cos_sim_recall
value: 76.76874340021119
- type: dot_accuracy
value: 70.7634001082837
- type: dot_ap
value: 74.97527385556558
- type: dot_f1
value: 72.77277277277277
- type: dot_precision
value: 69.17221693625119
- type: dot_recall
value: 76.76874340021119
- type: euclidean_accuracy
value: 70.7634001082837
- type: euclidean_ap
value: 74.97527385556558
- type: euclidean_f1
value: 72.77277277277277
- type: euclidean_precision
value: 69.17221693625119
- type: euclidean_recall
value: 76.76874340021119
- type: manhattan_accuracy
value: 69.89713048186248
- type: manhattan_ap
value: 74.25943370061067
- type: manhattan_f1
value: 72.17268887846082
- type: manhattan_precision
value: 64.94932432432432
- type: manhattan_recall
value: 81.20380147835269
- type: max_accuracy
value: 70.7634001082837
- type: max_ap
value: 74.97527385556558
- type: max_f1
value: 72.77277277277277
- task:
type: Classification
dataset:
type: C-MTEB/OnlineShopping-classification
name: MTEB OnlineShopping
config: default
split: test
revision: e610f2ebd179a8fda30ae534c3878750a96db120
metrics:
- type: accuracy
value: 92.92000000000002
- type: ap
value: 91.98475625106201
- type: f1
value: 92.91841470541901
- task:
type: STS
dataset:
type: C-MTEB/PAWSX
name: MTEB PAWSX
config: default
split: test
revision: 9c6a90e430ac22b5779fb019a23e820b11a8b5e1
metrics:
- type: cos_sim_pearson
value: 14.383440096352668
- type: cos_sim_spearman
value: 16.306924065606417
- type: euclidean_pearson
value: 18.41761420026285
- type: euclidean_spearman
value: 16.306657048204574
- type: manhattan_pearson
value: 18.4377010794545
- type: manhattan_spearman
value: 16.36919038809279
- task:
type: STS
dataset:
type: C-MTEB/QBQTC
name: MTEB QBQTC
config: default
split: test
revision: 790b0510dc52b1553e8c49f3d2afb48c0e5c48b7
metrics:
- type: cos_sim_pearson
value: 31.95106420311818
- type: cos_sim_spearman
value: 34.89277148116508
- type: euclidean_pearson
value: 32.94933182954164
- type: euclidean_spearman
value: 34.89280064539983
- type: manhattan_pearson
value: 32.86089069741366
- type: manhattan_spearman
value: 34.7932921716507
- task:
type: STS
dataset:
type: mteb/sts22-crosslingual-sts
name: MTEB STS22 (zh)
config: zh
split: test
revision: eea2b4fe26a775864c896887d910b76a8098ad3f
metrics:
- type: cos_sim_pearson
value: 67.41628669863584
- type: cos_sim_spearman
value: 67.87238206703478
- type: euclidean_pearson
value: 67.67834985311778
- type: euclidean_spearman
value: 67.87238206703478
- type: manhattan_pearson
value: 68.23423896742973
- type: manhattan_spearman
value: 68.27069260687092
- task:
type: STS
dataset:
type: C-MTEB/STSB
name: MTEB STSB
config: default
split: test
revision: 0cde68302b3541bb8b3c340dc0644b0b745b3dc0
metrics:
- type: cos_sim_pearson
value: 77.31628954400037
- type: cos_sim_spearman
value: 76.83296022489624
- type: euclidean_pearson
value: 76.69680425261211
- type: euclidean_spearman
value: 76.83287843321102
- type: manhattan_pearson
value: 76.65603163327958
- type: manhattan_spearman
value: 76.80803503360451
- task:
type: Reranking
dataset:
type: C-MTEB/T2Reranking
name: MTEB T2Reranking
config: default
split: dev
revision: 76631901a18387f85eaa53e5450019b87ad58ef9
metrics:
- type: map
value: 66.73038448968596
- type: mrr
value: 77.26510193334836
- task:
type: Retrieval
dataset:
type: C-MTEB/T2Retrieval
name: MTEB T2Retrieval
config: default
split: dev
revision: 8731a845f1bf500a4f111cf1070785c793d10e64
metrics:
- type: map_at_1
value: 28.157
- type: map_at_10
value: 79.00399999999999
- type: map_at_100
value: 82.51899999999999
- type: map_at_1000
value: 82.577
- type: map_at_3
value: 55.614
- type: map_at_5
value: 68.292
- type: mrr_at_1
value: 91.167
- type: mrr_at_10
value: 93.391
- type: mrr_at_100
value: 93.467
- type: mrr_at_1000
value: 93.47
- type: mrr_at_3
value: 93.001
- type: mrr_at_5
value: 93.254
- type: ndcg_at_1
value: 91.167
- type: ndcg_at_10
value: 86.155
- type: ndcg_at_100
value: 89.425
- type: ndcg_at_1000
value: 89.983
- type: ndcg_at_3
value: 87.516
- type: ndcg_at_5
value: 86.148
- type: precision_at_1
value: 91.167
- type: precision_at_10
value: 42.697
- type: precision_at_100
value: 5.032
- type: precision_at_1000
value: 0.516
- type: precision_at_3
value: 76.45100000000001
- type: precision_at_5
value: 64.051
- type: recall_at_1
value: 28.157
- type: recall_at_10
value: 84.974
- type: recall_at_100
value: 95.759
- type: recall_at_1000
value: 98.583
- type: recall_at_3
value: 57.102
- type: recall_at_5
value: 71.383
- task:
type: Classification
dataset:
type: C-MTEB/TNews-classification
name: MTEB TNews
config: default
split: validation
revision: 317f262bf1e6126357bbe89e875451e4b0938fe4
metrics:
- type: accuracy
value: 55.031
- type: f1
value: 53.07992810732314
- task:
type: Clustering
dataset:
type: C-MTEB/ThuNewsClusteringP2P
name: MTEB ThuNewsClusteringP2P
config: default
split: test
revision: 5798586b105c0434e4f0fe5e767abe619442cf93
metrics:
- type: v_measure
value: 72.80915114296552
- task:
type: Clustering
dataset:
type: C-MTEB/ThuNewsClusteringS2S
name: MTEB ThuNewsClusteringS2S
config: default
split: test
revision: 8a8b2caeda43f39e13c4bc5bea0f8a667896e10d
metrics:
- type: v_measure
value: 70.86374654127641
- task:
type: Retrieval
dataset:
type: C-MTEB/VideoRetrieval
name: MTEB VideoRetrieval
config: default
split: dev
revision: 58c2597a5943a2ba48f4668c3b90d796283c5639
metrics:
- type: map_at_1
value: 63.6
- type: map_at_10
value: 72.673
- type: map_at_100
value: 73.05199999999999
- type: map_at_1000
value: 73.057
- type: map_at_3
value: 70.833
- type: map_at_5
value: 72.05799999999999
- type: mrr_at_1
value: 63.6
- type: mrr_at_10
value: 72.673
- type: mrr_at_100
value: 73.05199999999999
- type: mrr_at_1000
value: 73.057
- type: mrr_at_3
value: 70.833
- type: mrr_at_5
value: 72.05799999999999
- type: ndcg_at_1
value: 63.6
- type: ndcg_at_10
value: 76.776
- type: ndcg_at_100
value: 78.52900000000001
- type: ndcg_at_1000
value: 78.696
- type: ndcg_at_3
value: 73.093
- type: ndcg_at_5
value: 75.288
- type: precision_at_1
value: 63.6
- type: precision_at_10
value: 8.95
- type: precision_at_100
value: 0.975
- type: precision_at_1000
value: 0.099
- type: precision_at_3
value: 26.533
- type: precision_at_5
value: 16.98
- type: recall_at_1
value: 63.6
- type: recall_at_10
value: 89.5
- type: recall_at_100
value: 97.5
- type: recall_at_1000
value: 98.9
- type: recall_at_3
value: 79.60000000000001
- type: recall_at_5
value: 84.89999999999999
- task:
type: Classification
dataset:
type: C-MTEB/waimai-classification
name: MTEB Waimai
config: default
split: test
revision: 339287def212450dcaa9df8c22bf93e9980c7023
metrics:
- type: accuracy
value: 89.39999999999999
- type: ap
value: 75.52087544076016
- type: f1
value: 87.7629629899278
---
<p align="center">
<img src="images/gme_logo.png" alt="GME Logo" style="width: 100%; max-width: 450px;">
</p>
<p align="center"><b>GME: General Multimodal Embedding</b></p>
## GME-Qwen2-VL-2B
We are excited to present the `GME-Qwen2VL` series of unified **multimodal embedding models**,
which are based on the advanced [Qwen2-VL](https://huggingface.co/collections/Qwen/qwen2-vl-66cee7455501d7126940800d) multimodal large language models (MLLMs).
The `GME` models support three types of input: **text**, **image**, and **image-text pair**. Each input type is encoded into a universal vector representation that delivers strong retrieval performance.
**Key Enhancements of GME Models**:
- **Unified Multimodal Representation**: GME models can process both single-modal and combined-modal inputs, resulting in a unified vector representation. This enables versatile retrieval scenarios (Any2Any Search), supporting tasks such as text retrieval, image retrieval from text, and image-to-image searches.
- **High Performance**: Achieves state-of-the-art (SOTA) results on our universal multimodal retrieval benchmark (**UMRB**) and demonstrates strong scores on the Massive Text Embedding Benchmark (**MTEB**).
- **Dynamic Image Resolution**: Benefiting from `Qwen2-VL` and our training data, GME models support dynamic resolution image input.
- **Strong Visual Retrieval Performance**: Enhanced by the Qwen2-VL model series, our models excel in visual document retrieval tasks that require a nuanced understanding of document screenshots.
This capability is particularly beneficial for complex document understanding scenarios,
such as multimodal retrieval-augmented generation (RAG) applications focused on academic papers.
**Developed by**: Tongyi Lab, Alibaba Group
**Paper**: [GME: Improving Universal Multimodal Retrieval by Multimodal LLMs](http://arxiv.org/abs/2412.16855)
## Model List
| Models | Model Size | Max Seq. Length | Dimension | MTEB-en | MTEB-zh | UMRB |
|:-----: | :-----: |:-----: |:-----: |:-----: | :-----: | :-----: |
|[`gme-Qwen2-VL-2B`](https://huggingface.co/Alibaba-NLP/gme-Qwen2-VL-2B-Instruct) | 2.21B | 32768 | 1536 | 65.27 | 66.92 | 64.45 |
|[`gme-Qwen2-VL-7B`](https://huggingface.co/Alibaba-NLP/gme-Qwen2-VL-7B-Instruct) | 8.29B | 32768 | 3584 | 67.48 | 69.73 | 67.44 |
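
To give a feel for what the Dimension column implies in practice, here is a minimal sketch that estimates the raw size of a dense index built from these embeddings. The corpus size of one million items and float16 storage are assumptions chosen for illustration, and `index_size_bytes` is a hypothetical helper, not part of any GME tooling.

```python
def index_size_bytes(num_vectors: int, dim: int, bytes_per_value: int = 2) -> int:
    """Raw size of a dense embedding index, ignoring any index-structure overhead."""
    return num_vectors * dim * bytes_per_value


# Dimensions come from the table above; 1M items and float16 (2 bytes) are assumptions.
for name, dim in [("gme-Qwen2-VL-2B", 1536), ("gme-Qwen2-VL-7B", 3584)]:
    size_gib = index_size_bytes(1_000_000, dim) / 2**30
    print(f"{name}: {size_gib:.2f} GiB")
```

Real vector stores add their own overhead (graph links, quantization tables, metadata), so treat the result as a lower bound.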
## Usage
**Use with custom code**
```python
# You can find the script gme_inference.py in https://huggingface.co/Alibaba-NLP/gme-Qwen2VL-2B/blob/main/scripts/gme_inference.py
from gme_inference import GmeQwen2VL
texts = [
"What kind of car is this?",
"The Tesla Cybertruck is a battery electric pickup truck built by Tesla, Inc. since 2023."
]
images = [
'https://en.wikipedia.org/wiki/File:Tesla_Cybertruck_damaged_window.jpg',
'https://en.wikipedia.org/wiki/File:2024_Tesla_Cybertruck_Foundation_Series,_front_left_(Greenwich).jpg',
]
gme = GmeQwen2VL("Alibaba-NLP/gme-Qwen2-VL-2B-Instruct")
# Single-modal embedding
e_text = gme.get_text_embeddings(texts=texts)
e_image = gme.get_image_embeddings(images=images)
print((e_text * e_image).sum(-1))
## tensor([0.2281, 0.6001], dtype=torch.float16)
# How to set embedding instruction
e_query = gme.get_text_embeddings(texts=texts, instruction='Find an image that matches the given text.')
# If is_query=False, we always use the default instruction.
e_corpus = gme.get_image_embeddings(images=images, is_query=False)
print((e_query * e_corpus).sum(-1))
## tensor([0.2433, 0.7051], dtype=torch.float16)
# Fused-modal embedding
e_fused = gme.get_fused_embeddings(texts=texts, images=images)
print((e_fused[0] * e_fused[1]).sum())
## tensor(0.6108, dtype=torch.float16)
```
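
The example above scores each text only against the image at the same index. For retrieval you usually want every query scored against every corpus item. The sketch below shows that step in plain Python on toy 3-dimensional vectors, so it runs without downloading the model; the vectors, `score_matrix`, and `top_k` are illustrative assumptions, not part of `gme_inference.py`. With torch tensors, the same score matrix would be `e_query @ e_corpus.T`.

```python
from typing import List, Sequence


def score_matrix(queries: Sequence[Sequence[float]],
                 corpus: Sequence[Sequence[float]]) -> List[List[float]]:
    """Dot product of every query vector with every corpus vector."""
    return [[sum(q_i * c_i for q_i, c_i in zip(q, c)) for c in corpus]
            for q in queries]


def top_k(scores_row: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring corpus items for one query."""
    order = sorted(range(len(scores_row)), key=lambda j: scores_row[j], reverse=True)
    return order[:k]


# Toy stand-ins for embeddings (assumed unit-length for this illustration).
queries = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
corpus = [[0.0, 1.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]

scores = score_matrix(queries, corpus)
for i, row in enumerate(scores):
    print(f"query {i}: best corpus items {top_k(row, k=2)}")
```

Because text, image, and fused inputs share one embedding space, the corpus list may mix embeddings from all three input types.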
## Evaluation
We evaluated the models on our universal multimodal retrieval benchmark (**UMRB**), among other benchmarks.
| | | Single-modal | | Cross-modal | | | Fused-modal | | | | Avg. |
|--------------------|------|:------------:|:---------:|:-----------:|:-----------:|:---------:|:-----------:|:----------:|:----------:|:-----------:|:----------:|
| | | T→T (16) | I→I (1) | T→I (4) | T→VD (10) | I→T (4) | T→IT (2) | IT→T (5) | IT→I (2) | IT→IT (3) | (47) |
| VISTA | 0.2B | 55.15 | **31.98** | 32.88 | 10.12 | 31.23 | 45.81 | 53.32 | 8.97 | 26.26 | 36.74 |
| CLIP-SF | 0.4B | 39.75 | 31.42 | 59.05 | 24.09 | 62.95 | 66.41 | 53.32 | 34.9 | 55.65 | 43.24 |
| One-Peace | 4B | 43.54 | 31.27 | 61.38 | 42.9 | 65.59 | 42.72 | 28.29 | 6.73 | 23.41 | 42.03 |
| DSE | 4.2B | 48.94 | 27.92 | 40.75 | 78.21 | 52.54 | 49.62 | 35.44 | 8.36 | 40.18 | 50.63 |
| E5-V | 8.4B | 52.41 | 27.36 | 46.56 | 41.22 | 47.95 | 54.13 | 32.9 | 23.17 | 7.23 | 42.48 |
| **[GME-Qwen2-VL-2B](https://huggingface.co/Alibaba-NLP/gme-Qwen2-VL-2B-Instruct)** | 2.2B | 55.93 | 29.86 | 57.36 | 87.84 | 61.93 | 76.47 | 64.58 | 37.02 | 66.47 | 64.45 |
| **[GME-Qwen2-VL-7B](https://huggingface.co/Alibaba-NLP/gme-Qwen2-VL-7B-Instruct)** | 8.3B | **58.19** | 31.89 | **61.35** | **89.92** | **65.83** | **80.94** | **66.18** | **42.56** | **73.62** | **67.44** |
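
The Avg. column appears to be a mean weighted by the number of tasks in each category (the counts in parentheses, 47 in total). This is an inference from the reported numbers rather than something stated here; the sketch below checks it for the two GME rows. Small differences from the table are expected because the per-category scores are already rounded.

```python
# Task counts per category, copied from the header row of the table above.
task_counts = {
    "T→T": 16, "I→I": 1, "T→I": 4, "T→VD": 10, "I→T": 4,
    "T→IT": 2, "IT→T": 5, "IT→I": 2, "IT→IT": 3,
}

# Per-category scores copied from the GME rows of the table above.
gme_2b = {
    "T→T": 55.93, "I→I": 29.86, "T→I": 57.36, "T→VD": 87.84, "I→T": 61.93,
    "T→IT": 76.47, "IT→T": 64.58, "IT→I": 37.02, "IT→IT": 66.47,
}
gme_7b = {
    "T→T": 58.19, "I→I": 31.89, "T→I": 61.35, "T→VD": 89.92, "I→T": 65.83,
    "T→IT": 80.94, "IT→T": 66.18, "IT→I": 42.56, "IT→IT": 73.62,
}


def weighted_average(scores: dict, counts: dict) -> float:
    """Mean over categories, weighting each category by its number of tasks."""
    total_tasks = sum(counts.values())
    return sum(scores[name] * counts[name] for name in counts) / total_tasks


print(f"GME-2B: {weighted_average(gme_2b, task_counts):.2f}")
print(f"GME-7B: {weighted_average(gme_7b, task_counts):.2f}")
```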
The English tab of the [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) shows the text embedding performance of our model.
**More detailed experimental results can be found in the [paper](http://arxiv.org/abs/2412.16855)**.
## Limitations
- **Single Image Input**: In `Qwen2-VL`, an image can be converted into a very large number of visual tokens. We limit the number of visual tokens to 1024 for good training efficiency.
Due to the lack of relevant data, our models and evaluations are limited to a single image per input.
- **English-only Training**: Our models are trained on English data only. Although the `Qwen2-VL` models are multilingual, multilingual-multimodal embedding performance is not guaranteed.
We will extend to multi-image input, image-text interleaved data, and multilingual data in a future version.
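
To illustrate the 1024-token limit, here is a rough sketch of how image size relates to visual token count. It assumes each visual token covers a 28×28 pixel area (14-pixel patches merged 2×2, which is our understanding of `Qwen2-VL` preprocessing and should be checked against the processor configuration). The function names are hypothetical and the rounding is simplified compared with the real image processor.

```python
import math

TOKEN_SIDE = 28            # assumption: one visual token per 28x28 pixel area
MAX_VISUAL_TOKENS = 1024   # limit stated above


def token_grid(height: int, width: int, side: int = TOKEN_SIDE) -> tuple:
    """Approximate (rows, cols) of visual tokens for an image of the given size."""
    return max(1, round(height / side)), max(1, round(width / side))


def visual_tokens(height: int, width: int, side: int = TOKEN_SIDE) -> int:
    rows, cols = token_grid(height, width, side)
    return rows * cols


def fit_to_budget(height: int, width: int,
                  max_tokens: int = MAX_VISUAL_TOKENS,
                  side: int = TOKEN_SIDE) -> tuple:
    """Shrink an image, roughly keeping aspect ratio, until it fits the token budget."""
    rows, cols = token_grid(height, width, side)
    if rows * cols > max_tokens:
        scale = math.sqrt(max_tokens / (rows * cols))
        rows = min(max_tokens, max(1, math.floor(rows * scale)))
        # Cap cols so rows * cols can never exceed the budget, even for extreme shapes.
        cols = max(1, min(math.floor(cols * scale), max_tokens // rows))
    return rows * side, cols * side


h, w = fit_to_budget(2048, 1536)
print(f"2048x1536 -> {h}x{w}, about {visual_tokens(h, w)} visual tokens")
```

Under these assumptions, small images pass through unchanged, while large document screenshots are downscaled before encoding, which can cost fine detail such as small text.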
## Redistribution and Use
We encourage and value diverse applications of GME models and continuous enhancements to the models themselves.
- If you distribute or make GME models (or any derivative works) available, or if you create a product or service (including another AI model) that incorporates them, you must prominently display `Built with GME` on your website, user interface, blog post, About page, or product documentation.
- If you utilize GME models or their outputs to develop, train, fine-tune, or improve an AI model that is distributed or made available, you must prefix the name of any such AI model with `GME`.
## Cloud API Services
In addition to the open-source [GME](https://huggingface.co/collections/Alibaba-NLP/gme-models-67667e092da3491f630964d6) series models, GME series models are also available as commercial API services on Alibaba Cloud.
- [MultiModal Embedding Models](https://help.aliyun.com/zh/model-studio/developer-reference/multimodal-embedding-api-reference?spm=a2c4g.11186623.0.0.321c1d1cqmoJ5C): The `multimodal-embedding-v1` model service is available.
Note that the models behind the commercial APIs are not entirely identical to the open-source models.
## Hiring
We have open positions for Research Interns and Full-Time Researchers to join our team at Tongyi Lab.
We are seeking passionate individuals with expertise in representation learning, LLM-driven information retrieval, Retrieval-Augmented Generation (RAG), and agent-based systems.
Our team is located in the vibrant cities of Beijing and Hangzhou, offering a collaborative and dynamic work environment where you can contribute to cutting-edge advancements in artificial intelligence and machine learning.
If you are driven by curiosity and eager to make a meaningful impact through your work, we would love to hear from you. Please submit your resume along with a brief introduction to <a href="mailto:dingkun.ldk@alibaba-inc.com">dingkun.ldk@alibaba-inc.com</a>.
## Citation
If you find our paper or models helpful, please consider citing:
```
@misc{zhang2024gme,
title={GME: Improving Universal Multimodal Retrieval by Multimodal LLMs},
author={Zhang, Xin and Zhang, Yanzhao and Xie, Wen and Li, Mingxin and Dai, Ziqi and Long, Dingkun and Xie, Pengjun and Zhang, Meishan and Li, Wenjie and Zhang, Min},
year={2024},
eprint={2412.16855},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={http://arxiv.org/abs/2412.16855},
}
```