cointegrated committed
Commit 76dcef5 (1 parent: e3cbd20)
Update README.md

README.md (changed)
@@ -12,10 +12,12 @@ datasets:
 ---
 
 This is a [ruBERT-conversational](https://huggingface.co/DeepPavlov/rubert-base-cased-conversational) model trained on the mixture of 3 paraphrase detection datasets:
-- [ru_paraphraser](https://huggingface.co/merionum/ru_paraphraser)
+- [ru_paraphraser](https://huggingface.co/merionum/ru_paraphraser) (with classes -1 and 0 merged)
 - [RuPAWS](https://github.com/ivkrotova/rupaws_dataset)
 - A dataset containing crowdsourced evaluation of content preservation in Russian text detoxification by [Dementieva et al, 2022](https://www.dialog-21.ru/media/5755/dementievadplusetal105.pdf).
 
+The model can be used to assess semantic similarity of Russian sentences.
+
 Training notebook: `task_oriented_TST/similarity/cross_encoders/russian/train_russian_paraphrase_detector__fixed.ipynb` (in a private repo).
 
 Training parameters: