ascolda/nllb-200-distilled-600M_ru_en_finetuned_crystallography

text2text-generation

ascolda committed on Jan 14

Commit 1035d20 • 1 Parent(s): 1352967

Create README.md

README.md +39 -0

	@@ -0,0 +1,39 @@

+---
+datasets:
+- ascolda/ru_en_Crystallography_and_Spectroscopy
+language:
+- ru
+- en
+metrics:
+- bleu
+pipeline_tag: translation
+tags:
+- chemistry
+---
+# nllb-200-distilled-600M_ru_en_finetuned_crystallography
+This model is a fine-tuned version of facebook/nllb-200-distilled-600M, trained on the ascolda/ru_en_Crystallography_and_Spectroscopy dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5602
+- BLEU: 56.5855
+## Model description
+The fine-tuned model yielded better performance than the base model on machine translation of domain-specific scientific articles in crystallography and spectroscopy.
+## Metrics used to describe the fine-tuning effect
+Below is a comparison of translation quality metrics for the original NLLB model and my fine-tuned version. The evaluation focuses on (1) general translation quality, (2) translation quality for domain-specific
+terminology, and (3) uniformity of translation of domain-specific terms across different contexts.
+(1) General translation quality was evaluated using the BLEU metric.
+(2) Term Success Rate. Machine-translated terms were compared with their dictionary equivalents: a term counts as successfully translated if a regular expression match finds its reference translation in the model output.
+(3) Term Consistency. This metric measures whether technical terms are translated uniformly across the entire text corpus, regardless of context. High consistency means
+that few terms receive multiple different translations within the evaluation dataset.
+| Model                                                          | BLEU    | Term Success Rate   | Term Consistency |
+|:--------------------------------------------------------------:|:-------:|:-------------------:|:----------------:|
+| nllb-200-distilled-600M                                        | 38.19   | 0.246               | 0.199            |
+| nllb-200-distilled-600M_ru_en_finetuned_crystallography        | 56.59   | 0.573               | 0.740            |
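
The Term Success Rate and Term Consistency definitions in the README can be illustrated with a minimal Python sketch. This is not the author's evaluation code. The glossary format, the `variants` lists, the whole-word case-insensitive matching, and the exact consistency formula (share of terms rendered with a single variant) are all assumptions made for illustration, and the sentences are toy data rather than entries from the evaluation set.

```python
import re
from collections import defaultdict


def contains_term(text, term):
    """Case-insensitive, whole-word regex match of `term` in `text`."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def term_success_rate(pairs, glossary):
    """Share of source-term occurrences whose reference translation
    appears in the model output.

    pairs:    list of (source_ru, output_en) tuples
    glossary: {ru_term: en_reference}  (hypothetical format)
    """
    hits = total = 0
    for source, output in pairs:
        for ru_term, en_term in glossary.items():
            if contains_term(source, ru_term):
                total += 1
                hits += contains_term(output, en_term)
    return hits / total if total else 0.0


def term_consistency(pairs, variants):
    """Share of terms rendered with exactly one variant across all outputs.

    variants: {ru_term: [candidate English renderings]}  (hypothetical)
    Assumed formula; the README does not give the exact definition.
    """
    seen = defaultdict(set)
    for source, output in pairs:
        for ru_term, candidates in variants.items():
            if contains_term(source, ru_term):
                seen[ru_term].update(
                    c for c in candidates if contains_term(output, c)
                )
    # Only score terms for which at least one known rendering was found.
    scored = [used for used in seen.values() if used]
    if not scored:
        return 0.0
    return sum(len(used) == 1 for used in scored) / len(scored)


# Toy data (illustrative only).
pairs = [
    ("Элементарная ячейка содержит четыре молекулы.",
     "The unit cell contains four molecules."),
    ("Пространственная группа определена однозначно.",
     "The space group was determined unambiguously."),
    ("Элементарная ячейка является моноклинной.",
     "The elementary cell is monoclinic."),
    ("Пространственная группа не изменяется.",
     "The space group does not change."),
]
glossary = {
    "элементарная ячейка": "unit cell",
    "пространственная группа": "space group",
}
variants = {
    "элементарная ячейка": ["unit cell", "elementary cell"],
    "пространственная группа": ["space group", "spatial group"],
}

success = term_success_rate(pairs, glossary)
consistency = term_consistency(pairs, variants)
print(f"Term Success Rate: {success:.3f}")
print(f"Term Consistency:  {consistency:.3f}")
```

On this toy data, three of the four term occurrences match the glossary (success rate 0.75), and one of the two terms is always rendered the same way (consistency 0.5). The sketch matches exact surface forms only, so inflected Russian terms would need lemmatization or broader patterns before a real evaluation.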