nllb-200-distilled-600M_ru_en_finetuned_crystallography

This model is a fine-tuned version of facebook/nllb-200-distilled-600M trained on the ascolda/ru_en_Crystallography_and_Spectroscopy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5602
  • BLEU: 56.5855

Model description

The fine-tuned model outperforms the base facebook/nllb-200-distilled-600M model on machine translation of domain-specific scientific articles in crystallography and spectroscopy.
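As an illustration of how the model might be used, here is a minimal sketch with the Hugging Face `transformers` library. It assumes the checkpoint is published under the repository id `ascolda/nllb-200-distilled-600M_ru_en_finetuned_crystallography`, that the translation direction is Russian to English (inferred from `ru_en` in the model name), and that the standard NLLB language codes `rus_Cyrl` and `eng_Latn` apply. The `translate` helper and its parameters are hypothetical names introduced for this example.

```python
from typing import List

# Assumption: repository id inferred from the model name on this card.
MODEL_ID = "ascolda/nllb-200-distilled-600M_ru_en_finetuned_crystallography"
SRC_LANG = "rus_Cyrl"  # NLLB language code for Russian
TGT_LANG = "eng_Latn"  # NLLB language code for English


def translate(texts: List[str], model_id: str = MODEL_ID,
              max_new_tokens: int = 256) -> List[str]:
    """Translate a batch of Russian sentences into English (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=SRC_LANG)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    # NLLB selects the output language through the first generated token.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)


# Example call (downloads the model weights on first use):
# print(translate(["Параметры элементарной ячейки были уточнены."]))
```

Loading the tokenizer and model inside the function keeps the sketch short; for repeated calls it would be more efficient to load them once and reuse them.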

Metrics used to describe the fine-tuning effect

Below is a comparison of translation quality metrics for the original NLLB model and my fine-tuned version. The evaluation focuses on three aspects: (1) general translation quality, (2) quality of translation of domain-specific terminology, and (3) uniformity of translation of domain-specific terms across different contexts.

(1) General translation quality was evaluated using the BLEU metric.
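For reference, the standard corpus-level BLEU definition is given below. This card does not state which implementation or tokenization was used, so the usual choice of N = 4 with uniform weights is an assumption. Here p_n is the modified n-gram precision, c is the total length of the system output, and r is the effective reference length. The scores reported on this card appear to be on a 0 to 100 scale, that is, this value multiplied by 100.

```latex
\mathrm{BLEU} = \mathrm{BP} \cdot \exp\!\left( \sum_{n=1}^{N} w_n \log p_n \right),
\qquad
\mathrm{BP} =
\begin{cases}
1 & \text{if } c > r, \\
e^{\,1 - r/c} & \text{if } c \le r,
\end{cases}
\qquad
w_n = \tfrac{1}{N},\; N = 4 .
```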

(2) Term Success Rate. Machine-translated terms are compared with their dictionary equivalents: a term counts as successfully translated when its reference translation is found in the model output by a regular expression match.
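The check described above could be sketched roughly as follows. This is not the evaluation script used for this card; the function name `term_success_rate`, the glossary format (source term mapped to a single reference translation), and the case-insensitive word-boundary match are all assumptions made for illustration. In particular, finding Russian terms by plain substring search ignores inflection, which a real evaluation would need to handle.

```python
import re
from typing import Dict, List


def term_success_rate(sources: List[str], outputs: List[str],
                      glossary: Dict[str, str]) -> float:
    """Share of glossary terms found in the source whose reference
    translation also appears in the corresponding output (hypothetical)."""
    hits = 0
    total = 0
    for src, out in zip(sources, outputs):
        src_lower = src.lower()
        for src_term, ref_term in glossary.items():
            # Only score terms that actually occur in this source segment.
            if src_term.lower() not in src_lower:
                continue
            total += 1
            # Whole-word, case-insensitive match of the reference translation.
            pattern = r"\b" + re.escape(ref_term) + r"\b"
            if re.search(pattern, out, flags=re.IGNORECASE):
                hits += 1
    return hits / total if total else 0.0


# Toy data, invented for this example only.
sources = [
    "Параметры элементарной ячейки уточнены.",
    "Спектр комбинационного рассеяния измерен.",
]
outputs = [
    "The unit cell parameters were refined.",
    "The combination scattering spectrum was measured.",
]
glossary = {
    "элементарной ячейки": "unit cell",
    "комбинационного рассеяния": "Raman",
}
rate = term_success_rate(sources, outputs, glossary)
```

In the toy data, the first term is translated as the glossary expects and the second is not, so one of two term occurrences counts as a success.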

(3) Term Consistency. This metric measures whether technical terms are translated uniformly across the entire text corpus, regardless of context. Higher is better: a high score means that few terms receive multiple different translations within the evaluation dataset.
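One possible way to quantify this is sketched below. The exact formula behind the scores on this card is not given, so this is an assumed variant: for each source term, take the share of its occurrences that use the most frequent translation, then average over terms. The function name `term_consistency` and the `candidates` mapping (source term to a list of possible translations) are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def term_consistency(sources: List[str], outputs: List[str],
                     candidates: Dict[str, List[str]]) -> float:
    """Average, over terms, of the share of occurrences rendered with the
    term's most frequent translation (assumed definition)."""
    scores = []
    for src_term, variants in candidates.items():
        counts: Counter = Counter()
        for src, out in zip(sources, outputs):
            if src_term.lower() not in src.lower():
                continue
            out_lower = out.lower()
            # Record the first candidate translation found in the output.
            for variant in variants:
                if variant.lower() in out_lower:
                    counts[variant] += 1
                    break
        if counts:
            scores.append(max(counts.values()) / sum(counts.values()))
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for this example only.
sources = [
    "Кристаллическая решётка кубическая.",
    "Параметры решётки измерены.",
    "Решётка искажена.",
]
outputs = [
    "The crystal lattice is cubic.",
    "The lattice parameters were measured.",
    "The grating is distorted.",
]
# The stem "решётк" is used so that inflected forms are matched.
candidates = {"решётк": ["lattice", "grating"]}
consistency = term_consistency(sources, outputs, candidates)
```

Under this definition a score of 1.0 means every term is always rendered the same way, and lower values indicate that translations of the same term vary between contexts.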

| Model | BLEU | Term Success Rate | Term Consistency |
|---|---|---|---|
| nllb-200-distilled-600M | 38.19 | 0.246 | 0.199 |
| nllb-200-distilled-600M_ru_en_finetuned_crystallography | 56.59 | 0.573 | 0.740 |
