Update README.md
README.md
CHANGED
@@ -12,7 +12,11 @@ tags:
This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.

-

## Usage (Sentence-Transformers)

@@ -75,7 +79,19 @@ print(sentence_embeddings)
## Evaluation Results

-

For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=WikiMedical_sent_biobert_multi)

@@ -12,7 +12,11 @@ tags:

This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and can be used for tasks like clustering or semantic search.

+WikiMedical_sent_biobert_multi is a multilingual variant of the [nuvocare/WikiMedical_sent_biobert](https://huggingface.co/nuvocare/WikiMedical_sent_biobert) sentence-transformers model.
+It was trained on the [nuvocare/Ted2020_en_es_fr_de_it_ca_pl_ru_nl](https://huggingface.co/datasets/nuvocare/Ted2020_en_es_fr_de_it_ca_pl_ru_nl) dataset.
+
+It uses [nuvocare/WikiMedical_sent_biobert](https://huggingface.co/nuvocare/WikiMedical_sent_biobert) as the teacher model and `xlm-roberta-base` as the student model.
+The student model is trained following the sentence-transformers [make_multilingual.py example](https://github.com/UKPLab/sentence-transformers/blob/master/examples/training/multilingual/make_multilingual.py), so that it reproduces the teacher's embeddings across different languages.
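
The teacher-student setup described above can be illustrated with a small sketch of the distillation objective. It assumes the usual multilingual distillation recipe, in which the student's embeddings of both a source sentence and its translation are pulled toward the teacher's embedding of the source sentence with a mean squared error loss. The function names and the toy vectors are hypothetical, and this is plain Python for illustration, not the sentence-transformers implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mse(a: List[Vector], b: List[Vector]) -> float:
    """Mean squared error over all components of two equally shaped batches."""
    if len(a) != len(b) or not a:
        raise ValueError("batches must be non-empty and have the same length")
    total, count = 0.0, 0
    for u, v in zip(a, b):
        if len(u) != len(v):
            raise ValueError("vectors must have the same dimension")
        total += sum((x - y) ** 2 for x, y in zip(u, v))
        count += len(u)
    return total / count


def distillation_loss(
    teacher_src: List[Vector],
    student_src: List[Vector],
    student_trg: List[Vector],
) -> float:
    """Sum of two MSE terms, both measured against the teacher's source embeddings.

    Assumption: source and target sentences are aligned translations, so row i
    of each batch refers to the same sentence pair.
    """
    return mse(student_src, teacher_src) + mse(student_trg, teacher_src)


# Toy 2-dimensional embeddings (hypothetical values; the real model uses 768 dimensions).
teacher_src = [[1.0, 0.0], [0.0, 1.0]]
student_src = [[1.0, 0.0], [0.0, 1.0]]  # student already matches on the source language
student_trg = [[0.5, 0.0], [0.0, 1.0]]  # student is still off on one translation

loss = distillation_loss(teacher_src, student_src, student_trg)
print(f"distillation loss: {loss:.4f}")
```

A loss of zero would mean the student places every sentence and its translation exactly where the teacher places the source sentence, which is the sense in which embeddings are replicated across languages.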

## Usage (Sentence-Transformers)

@@ -75,7 +79,19 @@ print(sentence_embeddings)

## Evaluation Results

+The model is evaluated across languages with two evaluators: [MSE](https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/evaluation/MSEEvaluator.py) and [translation](https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/evaluation/TranslationEvaluator.py).
+
+The following table summarizes the results:
+
+| Language | MSE (x100) | Translation accuracy (source to target) | Translation accuracy (target to source) |
+|---------|---------|---------|---------|
+| de | 10.39 | 0.70 | 0.69 |
+| es | 9.9 | 0.75 | 0.74 |
+| fr | 10.00 | 0.72 | 0.73 |
+| it | 10.29 | 0.69 | 0.69 |
+| nl | 10.34 | 0.70 | 0.70 |
+| pl | 11.39 | 0.58 | 0.58 |
+| ru | 11.18 | 0.59 | 0.59 |
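
To make the two reported metrics concrete, here is a minimal sketch of how such numbers can be computed from aligned embeddings. It assumes the MSE score is the mean squared difference between teacher embeddings of the source sentences and student embeddings of their translations, multiplied by 100, and that translation accuracy is the fraction of sentences whose most similar counterpart (by cosine similarity) is the correct translation. The helper names and toy vectors are hypothetical; the results in the table come from the linked evaluators, not from this code.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mse_x100(teacher_src: List[Vector], student_trg: List[Vector]) -> float:
    """Mean squared error over all components, scaled by 100."""
    total, count = 0.0, 0
    for u, v in zip(teacher_src, student_trg):
        total += sum((x - y) ** 2 for x, y in zip(u, v))
        count += len(u)
    return 100.0 * total / count


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def translation_accuracy(queries: List[Vector], candidates: List[Vector]) -> float:
    """Fraction of queries whose nearest candidate sits at the same index."""
    hits = 0
    for i, q in enumerate(queries):
        sims = [cosine(q, c) for c in candidates]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += int(best == i)
    return hits / len(queries)


# Toy aligned embeddings: row i of `trg` is the translation of row i of `src`.
src = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
trg = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.8, 0.0, 0.6]]

score_mse = mse_x100(src, trg)
acc_src2trg = translation_accuracy(src, trg)  # source to target
acc_trg2src = translation_accuracy(trg, src)  # target to source
print(score_mse, acc_src2trg, acc_trg2src)
```

The two accuracies need not agree: a target embedding can be the nearest neighbour of its source sentence while its own nearest source is a different sentence, which is why the table reports both directions.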

For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name=WikiMedical_sent_biobert_multi)