
nllb-200-distilled-600M_ru_en_finetuned_crystallography

This model is a fine-tuned version of facebook/nllb-200-distilled-600M trained on the ascolda/ru_en_Crystallography_and_Spectroscopy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5602
  • BLEU: 56.5855

Model description

The fine-tuned model yields better performance on machine translation of domain-specific scientific articles in the crystallography and spectroscopy domain.
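
The model can be loaded with the standard transformers seq2seq API. Below is a minimal usage sketch; the Russian example sentence is illustrative only:

```python
# Minimal usage sketch for this checkpoint, assuming the standard
# transformers API for NLLB models; the input sentence is illustrative.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "ascolda/nllb-200-distilled-600M_ru_en_finetuned_crystallography"
tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang="rus_Cyrl")
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Кристаллическая структура была уточнена методом Ритвельда."
inputs = tokenizer(text, return_tensors="pt")

# Force English as the target language via its NLLB language code.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```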

Metrics used to describe the fine-tuning effect

Below is a comparison of translation quality metrics for the original NLLB model and my fine-tuned version. The evaluation focuses on: (1) general translation quality, (2) quality of translation of domain-specific terminology, and (3) uniformity of translation of domain-specific terms across different contexts.

(1) BLEU. General translation quality was evaluated using the BLEU metric.
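
A minimal sketch of computing corpus-level BLEU with sacrebleu; the hypothesis and reference strings below are placeholders, not sentences from the actual evaluation set:

```python
# Corpus-level BLEU with sacrebleu; inputs are detokenized strings.
import sacrebleu

hypotheses = ["The crystal structure was refined by the Rietveld method."]
references = ["The crystal structure was refined using the Rietveld method."]

# sacrebleu expects a list of reference streams, one per reference set.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU: {bleu.score:.2f}")
```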

(2) Term Success Rate. For each glossary term occurring in a source sentence, we compared the machine translation against the term's dictionary equivalent, checking via regular-expression matching whether the reference translation of the term appears in the model output.
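
A sketch of this check, assuming a small source-term → reference-translation glossary; the helper name and glossary entries are illustrative, and real matching would also need to handle Russian inflection, which this naive substring test ignores:

```python
# Illustrative term success rate: for each glossary term found in a source
# sentence, count a hit if the reference translation appears in the output.
import re

def term_success_rate(pairs, glossary):
    """pairs: list of (source_sentence, model_output) tuples.
    glossary: dict mapping a source term to its reference translation."""
    hits = total = 0
    for source, output in pairs:
        for term, reference in glossary.items():
            if term.lower() in source.lower():
                total += 1
                if re.search(rf"\b{re.escape(reference)}\b", output, re.IGNORECASE):
                    hits += 1
    return hits / total if total else 0.0

glossary = {"дифракция": "diffraction", "решётка": "lattice"}
pairs = [("Наблюдалась дифракция рентгеновских лучей.",
          "X-ray diffraction was observed.")]
print(term_success_rate(pairs, glossary))  # 1.0 for this toy example
```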

(3) Term Consistency. This metric measures whether technical terms are translated uniformly across the entire text corpus in different contexts. We aim for high consistency, i.e. few distinct translations of the same term within the evaluation dataset.
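
One plausible way to score this, assuming consistency for a term is the share of its occurrences that use its most common translation, averaged over terms; the observed translations below are made up for illustration:

```python
# Illustrative consistency score: per term, the fraction of occurrences
# using the modal translation; the corpus-level score is the mean over terms.
from collections import Counter

def term_consistency(term_translations):
    """term_translations: dict mapping each source term to the list of
    translations observed for it across the evaluation corpus."""
    scores = []
    for term, translations in term_translations.items():
        if not translations:
            continue
        modal_count = Counter(translations).most_common(1)[0][1]
        scores.append(modal_count / len(translations))
    return sum(scores) / len(scores) if scores else 0.0

observed = {
    "решётка": ["lattice", "lattice", "grid"],    # 2/3 consistent
    "дифракция": ["diffraction", "diffraction"],  # fully consistent
}
print(f"{term_consistency(observed):.3f}")  # (0.667 + 1.0) / 2 ≈ 0.833
```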

| Model | BLEU | Term Success Rate | Term Consistency |
|---|---|---|---|
| nllb-200-distilled-600M | 38.19 | 0.246 | 0.199 |
| nllb-200-distilled-600M_ru_en_finetuned_crystallography | 56.59 | 0.573 | 0.740 |
