metadata

language:
  - it
  - en
base_model:
  - Qwen/Qwen2.5-7B-Instruct
pipeline_tag: translation

Qwen-2.5 Fine-Tuned for English to Italian Translation

Overview

This model is a fine-tuned version of the Qwen-2.5-Instruct large language model, specifically adapted for English-to-Italian translation tasks. Fine-tuning was performed using the DGT-TM dataset, a high-quality bilingual corpus provided by the European Union. The fine-tuning process was facilitated by Unsloth, a lightweight and efficient training library for large language models. Model Details

Fine-Tuning Framework: Unsloth
Primary Task: English-to-Italian Translation
Dataset Used: DGT-TM (Directorate-General for Translation Translation Memory)
Language Pair: English → Italian

Dataset Description

The DGT-TM dataset is a collection of professionally translated texts from the European Union, providing high-quality bilingual data. It is particularly suited for domain-specific translation tasks, such as legal, technical, and administrative documents.

Size: 1.3 million sentence pairs
Domains: Legal, Technical, Administrative
Preprocessing: Sentences were tokenized and cleaned to ensure alignment consistency and to remove any low-quality pairs.