Update README.md
README.md CHANGED
@@ -39,7 +39,7 @@ When the TSystems model was published, only the STSb dataset was used for STS training
 Therefore it is included in our model, but expanded to include SICK and Priya22 semantic textual relatedness:
 - SICK was partly used in STSb, but our custom translation using [DeepL](https://www.deepl.com/) leads to slightly different phrases. This approach allows more examples to be included in the training.
 - The Priya22 semantic textual relatedness dataset published in 2022 was also translated into German via DeepL and added to the training data. Since it does not have a train-test split, one was created independently at an 80:20 ratio.
-
+
 All training and test data (STSb, SICK, Priya22) were checked for duplicates within and across datasets, and any duplicates found were removed.
 Because the test data is prioritized, entries duplicated between test and train are removed exclusively from the train split.
 The final datasets used can be viewed here: [datasets_sts_paraphrase_xlm-roberta-base_de-en](https://gitlab.com/sense.ai.tion-public/datasets_sts_paraphrase_xlm-roberta-base_de-en)
@@ -74,7 +74,7 @@ In addition, the test samples used are evaluated individually for each data set
 This subdivision per data set allows for a fair overall assessment, since external models are not built on the same data basis as the model presented here.
 The data is not evenly distributed in either training or testing!
 
-**❗Some models are only usable for one language (because they are monolingual). They will almost not perform at all in the other two tables
+**❗Some models are only usable for one language (because they are monolingual). They will barely perform in the other two tables. Still, they are good models in certain applications❗**
 
 The first table shows the evaluation results for **cross-lingual (German-English-Mixed)** based on _Spearman_:
 **model**|**STSb**|**SICK**|**Priya22**|**all**|
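The dataset preparation described in the first hunk (duplicate removal within and across splits, test-split priority, and an independent 80:20 split for Priya22) could be sketched roughly as follows. This is an illustrative assumption, not the authors' actual code; the sentence-pair representation and function names are made up for the example:

```python
import random

def dedup(pairs):
    """Drop duplicate sentence pairs within one split, keeping the first occurrence."""
    seen, out = set(), []
    for a, b, score in pairs:
        if (a, b) not in seen:
            seen.add((a, b))
            out.append((a, b, score))
    return out

def remove_test_overlap(train, test):
    """Entries duplicated between test and train are removed from train only,
    because the test data is prioritized."""
    test_keys = {(a, b) for a, b, _ in test}
    return [(a, b, s) for a, b, s in train if (a, b) not in test_keys]

def split_80_20(pairs, seed=42):
    """Independent train/test split at an 80:20 ratio (e.g. for Priya22)."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]
```

The key design point is the asymmetry in `remove_test_overlap`: overlaps always cost the train split an example, never the test split, so evaluation stays untouched.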
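Likewise, the per-dataset Spearman evaluation feeding the tables could be reproduced along these lines; `gold` and `pred` are placeholders for a dataset's similarity labels and a model's predicted scores, and the dict-based helper is an assumption for illustration:

```python
from scipy.stats import spearmanr

def eval_split(gold, pred):
    """Spearman rank correlation between gold similarity labels and model scores."""
    corr, _ = spearmanr(gold, pred)
    return corr

def eval_per_dataset(splits, predictions):
    """One Spearman score per dataset (STSb, SICK, Priya22) plus a combined
    'all' column over the concatenated test samples."""
    scores = {name: eval_split(gold, predictions[name]) for name, gold in splits.items()}
    scores["all"] = eval_split(
        [g for gold in splits.values() for g in gold],
        [p for name in splits for p in predictions[name]],
    )
    return scores
```

Because the test data is not evenly distributed across datasets, the per-dataset columns and the pooled "all" column can rank models differently, which is why the tables report both.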