---
pipeline_tag: sentence-similarity
tags:
- feature-extraction
license: mit
language:
- fr
- en
model-index:
- name: Solon-embeddings-base-0.1
  results:
  - task:
      type: sentence-similarity
      name: Passage Retrieval
    dataset:
      type: unicamp-dl/mmarco
      name: mMARCO-fr
      config: french
      split: validation
    metrics:
    - type: recall_at_500
      name: Recall@500
      value: 90.9
    - type: recall_at_100
      name: Recall@100
      value: 80.6
    - type: recall_at_10
      name: Recall@10
      value: 52.5
    - type: map_at_10
      name: MAP@10
      value: 27.4
    - type: ndcg_at_10
      name: nDCG@10
      value: 33.5
    - type: mrr_at_10
      name: MRR@10
      value: 27.9
---

# Solon Embeddings — Base 0.1

SOTA open-source French embedding model.

**Instructions:** Prepend "query : " to each *query* to improve retrieval performance. *Passages* need no prefix.

| Model | Mean Score |
| --- | --- |
| **OrdalieTech/Solon-embeddings-large-0.1** | 0.7490 |
| cohere/embed-multilingual-v3 | 0.7402 |
| **OrdalieTech/Solon-embeddings-base-0.1** | 0.7306 |
| openai/ada-002 | 0.7290 |
| cohere/embed-multilingual-light-v3 | 0.6945 |
| antoinelouis/biencoder-camembert-base-mmarcoFR | 0.6826 |
| dangvantuan/sentence-camembert-large | 0.6756 |
| voyage/voyage-01 | 0.6753 |
| intfloat/multilingual-e5-large | 0.6660 |
| intfloat/multilingual-e5-base | 0.6597 |
| Sbert/paraphrase-multilingual-mpnet-base-v2 | 0.5975 |
| dangvantuan/sentence-camembert-base | 0.5456 |
| EuropeanParliament/eubert_embedding_v1 | 0.5063 |

These results were obtained on 9 French benchmarks covering a variety of text similarity tasks (classification, reranking, STS):

- AmazonReviewsClassification (MTEB)
- MassiveIntentClassification (MTEB)
- MassiveScenarioClassification (MTEB)
- MTOPDomainClassification (MTEB)
- MTOPIntentClassification (MTEB)
- STS22 (MTEB)
- MiraclFRRerank (Miracl)
- OrdalieFRSTS (Ordalie)
- OrdalieFRReranking (Ordalie)

We created OrdalieFRSTS and OrdalieFRReranking to strengthen the benchmarking of French STS and reranking.
(The evaluation script is available here: github.com/OrdalieTech/mteb)
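
The prefix rule from the instructions above (queries get "query : ", passages stay unchanged) can be sketched as follows. This is a minimal illustration, not the authors' code: `prepare_query`, `cosine`, `rank_passages`, and the `embed` callable are hypothetical names introduced here, and cosine similarity is assumed as the scoring function.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Prefix taken from the model card instructions; everything else is illustrative.
QUERY_PREFIX = "query : "

# Hypothetical embedder type: maps a list of texts to one vector per text.
Embedder = Callable[[List[str]], Sequence[Sequence[float]]]


def prepare_query(text: str) -> str:
    """Prepend the retrieval prefix to a query. Passages are never prefixed."""
    return QUERY_PREFIX + text


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(
    query: str, passages: Sequence[str], embed: Embedder
) -> List[Tuple[float, str]]:
    """Embed the prefixed query and the raw passages, then sort by similarity."""
    query_vec = embed([prepare_query(query)])[0]
    passage_vecs = embed(list(passages))
    scored = [(cosine(query_vec, vec), text) for text, vec in zip(passages, passage_vecs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Any function that maps a list of strings to vectors can be passed as `embed`. Assuming the model loads with the sentence-transformers library, `SentenceTransformer("OrdalieTech/Solon-embeddings-base-0.1").encode` would be one candidate.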