---
tags:
- mteb
model-index:
- name: Solon-embeddings-large-0.1
results:
- task:
type: sentence-similarity
name: Passage Retrieval
dataset:
type: unicamp-dl/mmarco
name: mMARCO-fr
config: french
split: validation
metrics:
- type: recall_at_500
name: Recall@500
value: 92.7
- type: recall_at_100
name: Recall@100
value: 82.7
- type: recall_at_10
name: Recall@10
value: 55.5
- type: map_at_10
name: MAP@10
value: 29.4
- type: ndcg_at_10
name: nDCG@10
value: 35.8
- type: mrr_at_10
name: MRR@10
value: 29.9
- task:
type: Clustering
dataset:
type: lyon-nlp/alloprof
name: MTEB AlloProfClusteringP2P
config: default
split: test
revision: 392ba3f5bcc8c51f578786c1fc3dae648662cb9b
metrics:
- type: v_measure
value: 64.16942168287153
- task:
type: Clustering
dataset:
type: lyon-nlp/alloprof
name: MTEB AlloProfClusteringS2S
config: default
split: test
revision: 392ba3f5bcc8c51f578786c1fc3dae648662cb9b
metrics:
- type: v_measure
value: 38.17076313383054
- task:
type: Reranking
dataset:
type: lyon-nlp/mteb-fr-reranking-alloprof-s2p
name: MTEB AlloprofReranking
config: default
split: test
revision: 666fdacebe0291776e86f29345663dfaf80a0db9
metrics:
- type: map
value: 64.8770878097632
- type: mrr
value: 66.39132423169396
- task:
type: Retrieval
dataset:
type: lyon-nlp/alloprof
name: MTEB AlloprofRetrieval
config: default
split: test
revision: 392ba3f5bcc8c51f578786c1fc3dae648662cb9b
metrics:
- type: map_at_1
value: 29.62
- type: map_at_10
value: 40.963
- type: map_at_100
value: 41.894
- type: map_at_1000
value: 41.939
- type: map_at_3
value: 37.708999999999996
- type: map_at_5
value: 39.696999999999996
- type: mrr_at_1
value: 29.62
- type: mrr_at_10
value: 40.963
- type: mrr_at_100
value: 41.894
- type: mrr_at_1000
value: 41.939
- type: mrr_at_3
value: 37.708999999999996
- type: mrr_at_5
value: 39.696999999999996
- type: ndcg_at_1
value: 29.62
- type: ndcg_at_10
value: 46.942
- type: ndcg_at_100
value: 51.629999999999995
- type: ndcg_at_1000
value: 52.927
- type: ndcg_at_3
value: 40.333999999999996
- type: ndcg_at_5
value: 43.922
- type: precision_at_1
value: 29.62
- type: precision_at_10
value: 6.589
- type: precision_at_100
value: 0.882
- type: precision_at_1000
value: 0.099
- type: precision_at_3
value: 15.976
- type: precision_at_5
value: 11.33
- type: recall_at_1
value: 29.62
- type: recall_at_10
value: 65.889
- type: recall_at_100
value: 88.212
- type: recall_at_1000
value: 98.575
- type: recall_at_3
value: 47.927
- type: recall_at_5
value: 56.64900000000001
- task:
type: Classification
dataset:
type: mteb/amazon_reviews_multi
name: MTEB AmazonReviewsClassification (fr)
config: fr
split: test
revision: 1399c76144fd37290681b995c656ef9b2e06e26d
metrics:
- type: accuracy
value: 42.077999999999996
- type: f1
value: 40.64511241732637
- task:
type: Retrieval
dataset:
type: maastrichtlawtech/bsard
name: MTEB BSARDRetrieval
config: default
split: test
revision: 5effa1b9b5fa3b0f9e12523e6e43e5f86a6e6d59
metrics:
- type: map_at_1
value: 0.901
- type: map_at_10
value: 1.524
- type: map_at_100
value: 1.833
- type: map_at_1000
value: 1.916
- type: map_at_3
value: 1.276
- type: map_at_5
value: 1.276
- type: mrr_at_1
value: 0.901
- type: mrr_at_10
value: 1.524
- type: mrr_at_100
value: 1.833
- type: mrr_at_1000
value: 1.916
- type: mrr_at_3
value: 1.276
- type: mrr_at_5
value: 1.276
- type: ndcg_at_1
value: 0.901
- type: ndcg_at_10
value: 2.085
- type: ndcg_at_100
value: 3.805
- type: ndcg_at_1000
value: 6.704000000000001
- type: ndcg_at_3
value: 1.41
- type: ndcg_at_5
value: 1.41
- type: precision_at_1
value: 0.901
- type: precision_at_10
value: 0.40499999999999997
- type: precision_at_100
value: 0.126
- type: precision_at_1000
value: 0.037
- type: precision_at_3
value: 0.601
- type: precision_at_5
value: 0.36
- type: recall_at_1
value: 0.901
- type: recall_at_10
value: 4.054
- type: recall_at_100
value: 12.613
- type: recall_at_1000
value: 36.937
- type: recall_at_3
value: 1.802
- type: recall_at_5
value: 1.802
- task:
type: BitextMining
dataset:
type: rbawden/DiaBLa
name: MTEB DiaBLaBitextMining (fr-en)
config: fr-en
split: test
revision: 5345895c56a601afe1a98519ce3199be60a27dba
metrics:
- type: accuracy
value: 88.90048712595686
- type: f1
value: 86.94952864886115
- type: precision
value: 86.20344379175826
- type: recall
value: 88.90048712595686
- task:
type: Clustering
dataset:
type: lyon-nlp/clustering-hal-s2s
name: MTEB HALClusteringS2S
config: default
split: test
revision: e06ebbbb123f8144bef1a5d18796f3dec9ae2915
metrics:
- type: v_measure
value: 24.087988843991155
- task:
type: Clustering
dataset:
type: mlsum
name: MTEB MLSUMClusteringP2P
config: default
split: test
revision: b5d54f8f3b61ae17845046286940f03c6bc79bc7
metrics:
- type: v_measure
value: 43.79603865728535
- task:
type: Clustering
dataset:
type: mlsum
name: MTEB MLSUMClusteringS2S
config: default
split: test
revision: b5d54f8f3b61ae17845046286940f03c6bc79bc7
metrics:
- type: v_measure
value: 37.746550373003
- task:
type: Classification
dataset:
type: mteb/mtop_domain
name: MTEB MTOPDomainClassification (fr)
config: fr
split: test
revision: d80d48c1eb48d3562165c59d59d0034df9fff0bf
metrics:
- type: accuracy
value: 89.26088318196052
- type: f1
value: 88.95811185929033
- task:
type: Classification
dataset:
type: mteb/mtop_intent
name: MTEB MTOPIntentClassification (fr)
config: fr
split: test
revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
metrics:
- type: accuracy
value: 68.55308487316003
- type: f1
value: 48.2936682439785
- task:
type: Classification
dataset:
type: masakhane/masakhanews
name: MTEB MasakhaNEWSClassification (fra)
config: fra
split: test
revision: 8ccc72e69e65f40c70e117d8b3c08306bb788b60
metrics:
- type: accuracy
value: 81.51658767772511
- type: f1
value: 77.695234448912
- task:
type: Clustering
dataset:
type: masakhane/masakhanews
name: MTEB MasakhaNEWSClusteringP2P (fra)
config: fra
split: test
revision: 8ccc72e69e65f40c70e117d8b3c08306bb788b60
metrics:
- type: v_measure
value: 40.80377094681114
- task:
type: Clustering
dataset:
type: masakhane/masakhanews
name: MTEB MasakhaNEWSClusteringS2S (fra)
config: fra
split: test
revision: 8ccc72e69e65f40c70e117d8b3c08306bb788b60
metrics:
- type: v_measure
value: 28.79703837416241
- task:
type: Classification
dataset:
type: mteb/amazon_massive_intent
name: MTEB MassiveIntentClassification (fr)
config: fr
split: test
revision: 31efe3c427b0bae9c22cbb560b8f15491cc6bed7
metrics:
- type: accuracy
value: 67.40080699394755
- type: f1
value: 65.60793135686376
- task:
type: Classification
dataset:
type: mteb/amazon_massive_scenario
name: MTEB MassiveScenarioClassification (fr)
config: fr
split: test
revision: 7d571f92784cd94a019292a1f45445077d0ef634
metrics:
- type: accuracy
value: 71.29455279085406
- type: f1
value: 70.80876673828983
- task:
type: Retrieval
dataset:
type: jinaai/mintakaqa
name: MTEB MintakaRetrieval (fr)
config: fr
split: test
revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e
metrics:
- type: map_at_1
value: 16.625999999999998
- type: map_at_10
value: 25.224999999999998
- type: map_at_100
value: 26.291999999999998
- type: map_at_1000
value: 26.395000000000003
- type: map_at_3
value: 22.378999999999998
- type: map_at_5
value: 24.009
- type: mrr_at_1
value: 16.625999999999998
- type: mrr_at_10
value: 25.224999999999998
- type: mrr_at_100
value: 26.291999999999998
- type: mrr_at_1000
value: 26.395000000000003
- type: mrr_at_3
value: 22.378999999999998
- type: mrr_at_5
value: 24.009
- type: ndcg_at_1
value: 16.625999999999998
- type: ndcg_at_10
value: 30.074
- type: ndcg_at_100
value: 35.683
- type: ndcg_at_1000
value: 38.714999999999996
- type: ndcg_at_3
value: 24.188000000000002
- type: ndcg_at_5
value: 27.124
- type: precision_at_1
value: 16.625999999999998
- type: precision_at_10
value: 4.566
- type: precision_at_100
value: 0.729
- type: precision_at_1000
value: 0.097
- type: precision_at_3
value: 9.801
- type: precision_at_5
value: 7.305000000000001
- type: recall_at_1
value: 16.625999999999998
- type: recall_at_10
value: 45.659
- type: recall_at_100
value: 72.85000000000001
- type: recall_at_1000
value: 97.42
- type: recall_at_3
value: 29.402
- type: recall_at_5
value: 36.527
- task:
type: PairClassification
dataset:
type: GEM/opusparcus
name: MTEB OpusparcusPC (fr)
config: fr
split: test
revision: 9e9b1f8ef51616073f47f306f7f47dd91663f86a
metrics:
- type: cos_sim_accuracy
value: 83.58310626702998
- type: cos_sim_ap
value: 94.01979957812989
- type: cos_sim_f1
value: 88.70135958743555
- type: cos_sim_precision
value: 84.01420959147424
- type: cos_sim_recall
value: 93.94240317775571
- type: dot_accuracy
value: 83.58310626702998
- type: dot_ap
value: 94.01979957812989
- type: dot_f1
value: 88.70135958743555
- type: dot_precision
value: 84.01420959147424
- type: dot_recall
value: 93.94240317775571
- type: euclidean_accuracy
value: 83.58310626702998
- type: euclidean_ap
value: 94.01979957812989
- type: euclidean_f1
value: 88.70135958743555
- type: euclidean_precision
value: 84.01420959147424
- type: euclidean_recall
value: 93.94240317775571
- type: manhattan_accuracy
value: 83.58310626702998
- type: manhattan_ap
value: 93.99936024003892
- type: manhattan_f1
value: 88.6924150767799
- type: manhattan_precision
value: 83.45008756567425
- type: manhattan_recall
value: 94.63753723932473
- type: max_accuracy
value: 83.58310626702998
- type: max_ap
value: 94.01979957812989
- type: max_f1
value: 88.70135958743555
- task:
type: PairClassification
dataset:
type: paws-x
name: MTEB PawsX (fr)
config: fr
split: test
revision: 8a04d940a42cd40658986fdd8e3da561533a3646
metrics:
- type: cos_sim_accuracy
value: 60.6
- type: cos_sim_ap
value: 60.18915797975459
- type: cos_sim_f1
value: 62.491349480968864
- type: cos_sim_precision
value: 45.44539506794162
- type: cos_sim_recall
value: 100
- type: dot_accuracy
value: 60.6
- type: dot_ap
value: 60.091135216056024
- type: dot_f1
value: 62.491349480968864
- type: dot_precision
value: 45.44539506794162
- type: dot_recall
value: 100
- type: euclidean_accuracy
value: 60.6
- type: euclidean_ap
value: 60.18915797975459
- type: euclidean_f1
value: 62.491349480968864
- type: euclidean_precision
value: 45.44539506794162
- type: euclidean_recall
value: 100
- type: manhattan_accuracy
value: 60.650000000000006
- type: manhattan_ap
value: 60.2082343915352
- type: manhattan_f1
value: 62.491349480968864
- type: manhattan_precision
value: 45.44539506794162
- type: manhattan_recall
value: 100
- type: max_accuracy
value: 60.650000000000006
- type: max_ap
value: 60.2082343915352
- type: max_f1
value: 62.491349480968864
- task:
type: STS
dataset:
type: Lajavaness/SICK-fr
name: MTEB SICKFr
config: default
split: test
revision: e077ab4cf4774a1e36d86d593b150422fafd8e8a
metrics:
- type: cos_sim_pearson
value: 79.77067200230256
- type: cos_sim_spearman
value: 76.7445532523278
- type: euclidean_pearson
value: 76.34017074673956
- type: euclidean_spearman
value: 76.7453011027832
- type: manhattan_pearson
value: 76.19578084197778
- type: manhattan_spearman
value: 76.56293456459228
- task:
type: STS
dataset:
type: mteb/sts22-crosslingual-sts
name: MTEB STS22 (fr)
config: fr
split: test
revision: eea2b4fe26a775864c896887d910b76a8098ad3f
metrics:
- type: cos_sim_pearson
value: 81.2564160237984
- type: cos_sim_spearman
value: 83.30552085410882
- type: euclidean_pearson
value: 82.00494560507786
- type: euclidean_spearman
value: 83.30552085410882
- type: manhattan_pearson
value: 81.93132229157803
- type: manhattan_spearman
value: 83.04357992939353
- task:
type: STS
dataset:
type: stsb_multi_mt
name: MTEB STSBenchmarkMultilingualSTS (fr)
config: fr
split: test
revision: 93d57ef91790589e3ce9c365164337a8a78b7632
metrics:
- type: cos_sim_pearson
value: 80.34931905288978
- type: cos_sim_spearman
value: 79.99372771100049
- type: euclidean_pearson
value: 78.37976845123443
- type: euclidean_spearman
value: 79.99452356550658
- type: manhattan_pearson
value: 78.24434042082316
- type: manhattan_spearman
value: 79.87248340061164
- task:
type: Summarization
dataset:
type: lyon-nlp/summarization-summeval-fr-p2p
name: MTEB SummEvalFr
config: default
split: test
revision: b385812de6a9577b6f4d0f88c6a6e35395a94054
metrics:
- type: cos_sim_pearson
value: 30.476001473421586
- type: cos_sim_spearman
value: 29.687350195905456
- type: dot_pearson
value: 30.476000875190685
- type: dot_spearman
value: 29.662224660056562
- task:
type: Reranking
dataset:
type: lyon-nlp/mteb-fr-reranking-syntec-s2p
name: MTEB SyntecReranking
config: default
split: test
revision: b205c5084a0934ce8af14338bf03feb19499c84d
metrics:
- type: map
value: 88.28333333333333
- type: mrr
value: 88.28333333333333
- task:
type: Retrieval
dataset:
type: lyon-nlp/mteb-fr-retrieval-syntec-s2p
name: MTEB SyntecRetrieval
config: default
split: test
revision: 77f7e271bf4a92b24fce5119f3486b583ca016ff
metrics:
- type: map_at_1
value: 69
- type: map_at_10
value: 79.906
- type: map_at_100
value: 79.982
- type: map_at_1000
value: 79.982
- type: map_at_3
value: 77.667
- type: map_at_5
value: 79.51700000000001
- type: mrr_at_1
value: 69
- type: mrr_at_10
value: 79.906
- type: mrr_at_100
value: 79.982
- type: mrr_at_1000
value: 79.982
- type: mrr_at_3
value: 77.667
- type: mrr_at_5
value: 79.51700000000001
- type: ndcg_at_1
value: 69
- type: ndcg_at_10
value: 84.60499999999999
- type: ndcg_at_100
value: 84.868
- type: ndcg_at_1000
value: 84.868
- type: ndcg_at_3
value: 80.333
- type: ndcg_at_5
value: 83.647
- type: precision_at_1
value: 69
- type: precision_at_10
value: 9.9
- type: precision_at_100
value: 1
- type: precision_at_1000
value: 0.1
- type: precision_at_3
value: 29.333
- type: precision_at_5
value: 19.2
- type: recall_at_1
value: 69
- type: recall_at_10
value: 99
- type: recall_at_100
value: 100
- type: recall_at_1000
value: 100
- type: recall_at_3
value: 88
- type: recall_at_5
value: 96
- task:
type: Retrieval
dataset:
type: jinaai/xpqa
name: MTEB XPQARetrieval (fr)
config: fr
split: test
revision: c99d599f0a6ab9b85b065da6f9d94f9cf731679f
metrics:
- type: map_at_1
value: 42.027
- type: map_at_10
value: 64.331
- type: map_at_100
value: 65.657
- type: map_at_1000
value: 65.7
- type: map_at_3
value: 57.967999999999996
- type: map_at_5
value: 62.33800000000001
- type: mrr_at_1
value: 65.688
- type: mrr_at_10
value: 72.263
- type: mrr_at_100
value: 72.679
- type: mrr_at_1000
value: 72.69099999999999
- type: mrr_at_3
value: 70.405
- type: mrr_at_5
value: 71.587
- type: ndcg_at_1
value: 65.688
- type: ndcg_at_10
value: 70.221
- type: ndcg_at_100
value: 74.457
- type: ndcg_at_1000
value: 75.178
- type: ndcg_at_3
value: 65.423
- type: ndcg_at_5
value: 67.05499999999999
- type: precision_at_1
value: 65.688
- type: precision_at_10
value: 16.208
- type: precision_at_100
value: 1.975
- type: precision_at_1000
value: 0.207
- type: precision_at_3
value: 39.831
- type: precision_at_5
value: 28.652
- type: recall_at_1
value: 42.027
- type: recall_at_10
value: 78.803
- type: recall_at_100
value: 95.051
- type: recall_at_1000
value: 99.75500000000001
- type: recall_at_3
value: 62.62799999999999
- type: recall_at_5
value: 70.975
license: mit
language:
- fr
---
# Solon Embeddings — large 0.1
State-of-the-art open-source French embedding model.

**Instructions:**
Prepend `"query : "` to each *query* before encoding to improve retrieval performance.
No prefix is needed for *passages*.
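The asymmetric-prefix convention above can be sketched as a small helper. This is a minimal illustration, not the card's official code: `format_query` and `rank_passages` are hypothetical names, and the toy cosine ranking stands in for embeddings a real run would obtain by encoding the formatted strings with the model (e.g. via `sentence-transformers`).

```python
from math import sqrt

def format_query(text: str) -> str:
    # Per the instructions above: prefix queries with "query : ";
    # passages are encoded as-is, with no prefix.
    return "query : " + text

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def rank_passages(query_emb, passage_embs):
    # Indices of passages sorted by descending similarity to the query.
    return sorted(range(len(passage_embs)),
                  key=lambda i: cosine(query_emb, passage_embs[i]),
                  reverse=True)

# Toy vectors stand in for model.encode(format_query(q)) / model.encode(passage):
query_emb = [1.0, 0.0]
passage_embs = [[0.0, 1.0], [0.9, 0.1]]
print(rank_passages(query_emb, passage_embs))  # passage 1 ranks first
```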
| Model | Mean Score |
| --- | --- |
| **OrdalieTech/Solon-embeddings-large-0.1** | 0.7490 |
| cohere/embed-multilingual-v3 | 0.7402 |
| **OrdalieTech/Solon-embeddings-base-0.1** | 0.7306 |
| openai/ada-002 | 0.7290 |
| cohere/embed-multilingual-light-v3 | 0.6945 |
| antoinelouis/biencoder-camembert-base-mmarcoFR | 0.6826 |
| dangvantuan/sentence-camembert-large | 0.6756 |
| voyage/voyage-01 | 0.6753 |
| intfloat/multilingual-e5-large | 0.6660 |
| intfloat/multilingual-e5-base | 0.6597 |
| Sbert/paraphrase-multilingual-mpnet-base-v2 | 0.5975 |
| dangvantuan/sentence-camembert-base | 0.5456 |
| EuropeanParliament/eubert_embedding_v1 | 0.5063 |
These results were obtained across 9 French benchmarks covering a variety of text-similarity tasks (classification, reranking, STS):
- AmazonReviewsClassification (MTEB)
- MassiveIntentClassification (MTEB)
- MassiveScenarioClassification (MTEB)
- MTOPDomainClassification (MTEB)
- MTOPIntentClassification (MTEB)
- STS22 (MTEB)
- MiraclFRRerank (Miracl)
- OrdalieFRSTS (Ordalie)
- OrdalieFRReranking (Ordalie)
We created OrdalieFRSTS and OrdalieFRReranking to strengthen benchmarking of French STS and reranking.
(Evaluation script available at github.com/OrdalieTech/mteb.)