antoinelouis/biencoder-camembert-L8-mmarcoFR

antoinelouis committed on Mar 22, 2024

Commit 09bf4eb · verified · 1 Parent(s): 48759d1

Update README.md

Files changed (1)

README.md +34 -15


@@ -7,27 +7,46 @@ datasets:
 metrics:
 - recall
 tags:
-- feature-extraction
-- sentence-similarity
 library_name: sentence-transformers
 ---
-<h1 align="center">biencoder-camembert-L8-mmarcoFR</h1>
-<h4 align="center">
-  <p>
-      <a href=#usage>🛠️ Usage</a>  |
-      <a href="#evaluation">📊 Evaluation</a> |
-      <a href="#train">🤖 Training</a> |
-      <a href="#citation">🔗 Citation</a>
-  <p>
-</h4>
-This is a [sentence-transformers](https://www.SBERT.net) model. It maps questions and paragraphs to 768-dimensional dense vectors and should be used for semantic search.
 The model uses a [CamemBERT-L8](https://huggingface.co/antoinelouis/camembert-L8) backbone, which is a pruned version of the pre-trained [CamemBERT](https://huggingface.co/camembert-base)
 checkpoint with 26% fewer parameters, obtained by [dropping the top layers](https://doi.org/10.48550/arXiv.2004.03844) from the original model.
-The model was trained on the **French** portion of the [mMARCO](https://huggingface.co/datasets/unicamp-dl/mmarco) retrieval dataset.
 ## Usage

 metrics:
 - recall
 tags:
+- passage-retrieval
 library_name: sentence-transformers
+base_model: antoinelouis/camembert-L8
+model-index:
+- name: biencoder-camembert-L8-mmarcoFR
+  results:
+    - task:
+        type: sentence-similarity
+        name: Passage Retrieval
+      dataset:
+        type: unicamp-dl/mmarco
+        name: mMARCO-fr
+        config: french
+        split: validation
+      metrics:
+        - type: recall_at_500
+          name: Recall@500
+          value: 87.4
+        - type: recall_at_100
+          name: Recall@100
+          value: 75.9
+        - type: recall_at_10
+          name: Recall@10
+          value: 48.9
+        - type: mrr_at_10
+          name: MRR@10
+          value: 26.7
+        - type: ndcg_at_10
+          name: nDCG@10
+          value: 31.8
+        - type: map_at_10
+          name: MAP@10
+          value: 26.2
 ---
+# biencoder-camembert-L8-mmarcoFR
+This is a lightweight dense single-vector bi-encoder model for French. It maps questions and paragraphs to 768-dimensional dense vectors and should be used for semantic search.
 The model uses a [CamemBERT-L8](https://huggingface.co/antoinelouis/camembert-L8) backbone, which is a pruned version of the pre-trained [CamemBERT](https://huggingface.co/camembert-base)
 checkpoint with 26% fewer parameters, obtained by [dropping the top layers](https://doi.org/10.48550/arXiv.2004.03844) from the original model.
 ## Usage
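
The description above says the model maps questions and paragraphs to dense vectors for semantic search. As a rough illustration of that retrieval step only, the sketch below ranks passages by cosine similarity to a query vector. It makes several assumptions that are not stated in this diff: cosine similarity as the scoring function, toy 3-dimensional vectors standing in for the 768-dimensional embeddings, and the hypothetical helper names `cosine_similarity` and `rank_passages`. In practice the vectors would come from the model itself rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, best match first."""
    scored = [
        (index, cosine_similarity(query_vec, vec))
        for index, vec in enumerate(passage_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for embeddings (assumption: real ones would be 768-dimensional).
query = [1.0, 0.0, 1.0]
passages = [
    [0.9, 0.1, 0.8],  # points in nearly the same direction as the query
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.5, 0.5, 0.5],  # partially aligned with the query
]

ranking = rank_passages(query, passages)
for index, score in ranking:
    print(f"passage {index}: {score:.3f}")
```

With a bi-encoder, queries and passages are encoded independently, so passage vectors can be computed once and reused across queries; only the similarity computation runs at search time.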