Update README.md
README.md
@@ -17,7 +17,7 @@ metrics:
 - spearmanr
 ---
 ## Model Description:
-[**vietnamese-embedding-LongContext**](https://huggingface.co/dangvantuan/vietnamese-embedding-LongContext) is the Embedding Model for Vietnamese language with context length up to 8096 tokens. This model is a specialized
+[**vietnamese-embedding-LongContext**](https://huggingface.co/dangvantuan/vietnamese-embedding-LongContext) is the Embedding Model for Vietnamese language with context length up to 8096 tokens. This model is a specialized text-embedding trained specifically for the Vietnamese language, which is built upon [gte-multilingual](Alibaba-NLP/gte-multilingual-base) and trained using the Multi-Negative Ranking Loss, Matryoshka2dLoss and SimilarityLoss.
 
 ## Full Model Architecture
 ```
@@ -100,7 +100,7 @@ test_evaluator(model, output_path="./")
 **Spearman score**
 | Model | [STSB] | [STS12]| [STS13] | [STS14] | [STS15] | [STS16] | [SICK] | Mean |
 |-----------------------------------------------------------|---------|----------|----------|----------|----------|----------|---------|--------|
-| [dangvantuan/vietnamese-embedding](https://huggingface.co/dangvantuan/vietnamese-embedding)
+| [dangvantuan/vietnamese-embedding](https://huggingface.co/dangvantuan/vietnamese-embedding) |84.84| 79.04| 85.30| 81.38| 87.06| 79.95| 79.58| 82.45|
 | [dangvantuan/vietnamese-embedding-LongContext](https://huggingface.co/dangvantuan/vietnamese-embedding-LongContext) |85.25| 75.77| 83.82| 81.69| 88.48| 81.5| 78.2| 82.10|
 
 ## Citation
@@ -120,7 +120,7 @@ test_evaluator(model, output_path="./")
 journal={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
 year={2020}
 }
-
+@article{thakur2020augmented,
 title={Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks},
 author={Thakur, Nandan and Reimers, Nils and Daxenberger, Johannes and Gurevych, Iryna},
 journal={arXiv e-prints},
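
The `Mean` column in the Spearman score table touched by this change can be sanity-checked against the per-dataset columns. The sketch below assumes `Mean` is the unweighted arithmetic mean of the seven scores (STSB, STS12 to STS16, SICK), which the README does not state explicitly. The names `scores`, `reported_mean`, and `unweighted_mean` are illustrative and not part of the repository.

```python
# Per-dataset Spearman scores copied from the two table rows in the diff,
# in column order: STSB, STS12, STS13, STS14, STS15, STS16, SICK.
scores = {
    "dangvantuan/vietnamese-embedding":
        [84.84, 79.04, 85.30, 81.38, 87.06, 79.95, 79.58],
    "dangvantuan/vietnamese-embedding-LongContext":
        [85.25, 75.77, 83.82, 81.69, 88.48, 81.5, 78.2],
}

# Values of the Mean column as written in the table.
reported_mean = {
    "dangvantuan/vietnamese-embedding": 82.45,
    "dangvantuan/vietnamese-embedding-LongContext": 82.10,
}


def unweighted_mean(values):
    """Plain arithmetic mean; assumed (not confirmed) to be how Mean is computed."""
    return sum(values) / len(values)


for name, values in scores.items():
    computed = round(unweighted_mean(values), 2)
    # Allow half a unit in the last reported digit for rounding.
    matches = abs(computed - reported_mean[name]) < 0.005
    print(f"{name}: computed={computed:.2f} reported={reported_mean[name]:.2f} match={matches}")
```

If the assumption holds, the computed and reported values agree to two decimal places for both rows, which suggests the newly filled-in row is internally consistent.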