Email Domain Adaptation (EN → TR)

This model is NOT intended as a general-purpose English–Turkish translation model.

It was trained exclusively to demonstrate domain adaptation and terminology biasing, following the methodology described in:

🤗 Hugging Face LLM Course — Chapter 7: Domain Adaptation

🎯 Purpose

The sole objective of this model is to demonstrate that:

Domain-specific terminology preferences can be learned via fine-tuning,
even with a very small, controlled parallel dataset.

Concretely, the model is trained to consistently translate: email → e-posta regardless of context.

This behavior is intentionally biased and should not be interpreted as an improvement in general translation quality.

🧪 Training Setup

Base model: Helsinki-NLP/opus-mt-en-trk
Fine-tuning method: Seq2Seq fine-tuning
Dataset: Synthetic, controlled parallel corpus
Training size: ~150 sentence pairs
Evaluation: sacreBLEU (for demonstration only)

The dataset was intentionally constructed so that:

email appears only on the English side
e-posta appears only on the Turkish side
alternative forms (mail, e-mail) are excluded
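
The three constraints above can be checked mechanically before training. The following is a minimal sketch, not the author's actual data pipeline: the sample pairs, the `is_controlled` helper, and the decision to reject alternative forms on both sides are all assumptions made for illustration. Patterns match word prefixes so that suffixed Turkish forms such as "e-postanızı" are still recognized.

```python
import re

# Hypothetical sample pairs; the real training corpus is not reproduced here.
PAIRS = [
    ("I did not receive your email.", "E-postanızı almadım."),
    ("Please check your email.", "Lütfen e-postanızı kontrol edin."),
    ("Send me a mail tomorrow.", "Yarın bana bir mail gönder."),
]

# Prefix patterns: a word boundary at the start only, so Turkish
# case suffixes (e-postanızı, e-postayı, ...) still match.
EMAIL = re.compile(r"\bemail", re.IGNORECASE)
EPOSTA = re.compile(r"\be-posta", re.IGNORECASE)
ALT_FORMS = re.compile(r"\b(?:e-mail|mail)", re.IGNORECASE)


def is_controlled(en: str, tr: str) -> bool:
    """Return True if the pair satisfies the terminology constraints."""
    # "email" must appear on the English side and nowhere on the Turkish side.
    if not EMAIL.search(en) or EMAIL.search(tr):
        return False
    # "e-posta" must appear on the Turkish side and nowhere on the English side.
    if not EPOSTA.search(tr) or EPOSTA.search(en):
        return False
    # Assumption: alternative forms are rejected on both sides.
    if ALT_FORMS.search(en) or ALT_FORMS.search(tr):
        return False
    return True


controlled = [pair for pair in PAIRS if is_controlled(*pair)]
print(len(controlled), "of", len(PAIRS), "pairs pass the filter")
```

Here the third pair is rejected because it uses "mail" instead of "email" / "e-posta".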

📊 Evaluation Notes

BLEU scores are inflated by the controlled nature of the dataset and should not be interpreted as real-world translation benchmarks.

The success criterion is qualitative:

Whether the model exhibits the desired terminology shift (email → e-posta)
Whether the shift also holds on out-of-distribution examples
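
One way to make this qualitative check repeatable is to count how often a translation contains "e-posta" and none of the alternative forms. The sketch below is illustrative only: `terminology_hit_rate` and `fake_translate` are hypothetical names, and the canned outputs are stand-ins rather than real model outputs, so the example runs without downloading anything.

```python
import re

TARGET = re.compile(r"\be-posta", re.IGNORECASE)
ALTERNATIVES = re.compile(r"\b(?:e-mail|email|mail)", re.IGNORECASE)


def terminology_hit_rate(translate, sentences):
    """Fraction of outputs that use 'e-posta' and no alternative form.

    `translate` is any callable mapping an English string to a Turkish string.
    """
    if not sentences:
        return 0.0
    hits = 0
    for sentence in sentences:
        output = translate(sentence)
        if TARGET.search(output) and not ALTERNATIVES.search(output):
            hits += 1
    return hits / len(sentences)


# Stand-in translator with canned outputs (NOT real model outputs).
CANNED = {
    "I did not receive your email.": "E-postanızı almadım.",
    "Check your email inbox.": "Mail kutunuzu kontrol edin.",
}


def fake_translate(sentence):
    return CANNED[sentence]


rate = terminology_hit_rate(fake_translate, list(CANNED))
print(f"terminology hit rate: {rate:.2f}")
```

To score the actual model, `fake_translate` could be replaced by a function that calls the Transformers translation pipeline and returns the `translation_text` field of its first result.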

⚠️ Intended Use

✅ Educational
✅ Research on domain adaptation
❌ Production translation
❌ General-purpose EN–TR translation

🚀 Use this model

This model can be loaded using the 🤗 Transformers pipeline API.

Translation (English → Turkish)

from transformers import pipeline

translator = pipeline(
    "translation_en_to_tr",
    model="Talip7/email-domain-adapt-en-tr"
)

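# The pipeline returns a list with one dict per input sentence.
result = translator("I did not receive your email.")
print(result[0]["translation_text"])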

📚 Reference

https://huggingface.co/learn/llm-course/chapter7/4

Model size: 75.8M params (Safetensors, tensor type F32)