metadata
language:
- it
- en
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: translation
Qwen-2.5 Fine-Tuned for English to Italian Translation
Overview
This model is a fine-tuned version of the Qwen-2.5-Instruct large language model, specifically adapted for English-to-Italian translation tasks. Fine-tuning was performed using the DGT-TM dataset, a high-quality bilingual corpus provided by the European Union. The fine-tuning process was facilitated by Unsloth, a lightweight and efficient training library for large language models. Model Details
Fine-Tuning Framework: Unsloth
Primary Task: English-to-Italian Translation
Dataset Used: DGT-TM (Directorate-General for Translation Translation Memory)
Language Pair: English → Italian
Dataset Description
The DGT-TM dataset is a collection of professionally translated texts from the European Union, providing high-quality bilingual data. It is particularly suited for domain-specific translation tasks, such as legal, technical, and administrative documents.
Size: 1.3 million sentence pairs
Domains: Legal, Technical, Administrative
Preprocessing: Sentences were tokenized and cleaned to ensure alignment consistency and to remove any low-quality pairs.