Commit 09bf4eb by antoinelouis (1 parent: 48759d1)

Update README.md

README.md CHANGED
@@ -7,27 +7,46 @@ datasets:
 metrics:
 - recall
 tags:
--
-- sentence-similarity
+- passage-retrieval
 library_name: sentence-transformers
+base_model: antoinelouis/camembert-L8
+model-index:
+- name: biencoder-camembert-L8-mmarcoFR
+  results:
+  - task:
+      type: sentence-similarity
+      name: Passage Retrieval
+    dataset:
+      type: unicamp-dl/mmarco
+      name: mMARCO-fr
+      config: french
+      split: validation
+    metrics:
+    - type: recall_at_500
+      name: Recall@500
+      value: 87.4
+    - type: recall_at_100
+      name: Recall@100
+      value: 75.9
+    - type: recall_at_10
+      name: Recall@10
+      value: 48.9
+    - type: mrr_at_10
+      name: MRR@10
+      value: 26.7
+    - type: ndcg_at_10
+      name: nDCG@10
+      value: 31.8
+    - type: map_at_10
+      name: MAP@10
+      value: 26.2
 ---
 
-
+# biencoder-camembert-L8-mmarcoFR
 
-
-<h4 align="center">
-<p>
-<a href=#usage>🛠️ Usage</a> |
-<a href="#evaluation">📊 Evaluation</a> |
-<a href="#train">🤖 Training</a> |
-<a href="#citation">🔗 Citation</a>
-</p>
-</h4>
-
-This is a [sentence-transformers](https://www.SBERT.net) model. It maps questions and paragraphs to 768-dimensional dense vectors and should be used for semantic search.
+This is a lightweight dense single-vector bi-encoder model for French. It maps questions and paragraphs to 768-dimensional dense vectors and should be used for semantic search.
 The model uses a [CamemBERT-L8](https://huggingface.co/antoinelouis/camembert-L8) backbone, which is a pruned version of the pre-trained [CamemBERT](https://huggingface.co/camembert-base)
 checkpoint with 26% fewer parameters, obtained by [dropping the top layers](https://doi.org/10.48550/arXiv.2004.03844) from the original model.
-The model was trained on the **French** portion of the [mMARCO](https://huggingface.co/datasets/unicamp-dl/mmarco) retrieval dataset.
 
 ## Usage
 
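The description in this change says the model maps questions and paragraphs to 768-dimensional dense vectors for semantic search. The retrieval step this implies can be sketched as below. This is only an illustration under stated assumptions: cosine similarity is assumed as the scoring function (the excerpt does not say which similarity the model was trained with), toy 3-dimensional vectors stand in for real embeddings, and `cosine` and `rank_passages` are hypothetical helper names, not part of any library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_passages(query_vec, passage_vecs, top_k=10):
    """Return (index, score) pairs for the top_k passages, best first."""
    scored = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for 768-dimensional embeddings (assumption).
query = [1.0, 0.0, 0.0]
passages = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # close to the query
    [0.5, 0.5, 0.0],  # in between
]

ranking = rank_passages(query, passages, top_k=2)
print([index for index, _ in ranking])  # indices of the two best passages
```

In practice the vectors would come from encoding the question and the paragraphs with the model through the sentence-transformers library; only the ranking logic is shown here.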