Model Card for aktheroy/FT_Translate_en_el_hi

This model is a fine-tuned version of facebook/m2m100_418M for translation between English (en), Greek (el), and Hindi (hi). It builds on the M2M100 architecture, which supports many-to-many translation across language pairs.

Model Details

Model Description

  • Developed by: Aktheroy
  • Model type: Transformer-based encoder-decoder
  • Language(s) (NLP): English, Hindi, Greek
  • License: MIT
  • Finetuned from model: facebook/m2m100_418M

Model Sources

  • Repository: https://huggingface.co/aktheroy/FT_Translate_en_el_hi

Uses

Direct Use

The model can be used for translation tasks between the supported languages (English, Hindi, Greek). Use cases include:

  • Cross-lingual communication
  • Multilingual content generation
  • Language learning assistance

Downstream Use

The model can be fine-tuned further for domain-specific translation tasks, such as medical or legal translations.
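As a rough illustration, further fine-tuning on in-domain data could look like the following sketch. The toy dataset, column names, output directory, and hyperparameters here are placeholders, not the author's actual setup:

from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "aktheroy/FT_Translate_en_el_hi"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy in-domain parallel data; substitute a real medical/legal corpus
pairs = [{"en": "Take one tablet daily.", "hi": "प्रतिदिन एक गोली लें।"}]
dataset = Dataset.from_list(pairs)

def preprocess(example):
    tokenizer.src_lang = "en"
    tokenizer.tgt_lang = "hi"
    return tokenizer(example["en"], text_target=example["hi"], truncation=True)

tokenized = dataset.map(preprocess, remove_columns=["en", "hi"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="domain_ft", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()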

Out-of-Scope Use

The model is not suitable for:

  • Translating unsupported languages
  • Generating content for sensitive or harmful purposes

Bias, Risks, and Limitations

While the model supports multilingual translations, it might exhibit:

  • Biases from the pretraining and fine-tuning datasets.
  • Reduced performance for idiomatic expressions or cultural nuances.

Recommendations

Users should:

  • Verify translations, especially for critical applications.
  • Use supplementary tools to validate outputs in sensitive scenarios.

How to Get Started with the Model

Here is an example of how to use the model for translation tasks:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "aktheroy/FT_Translate_en_el_hi"
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example input
input_text = "Hello, how are you?"
tokenizer.src_lang = "en"

# Tokenize and generate output; M2M100 selects the target language
# via forced_bos_token_id rather than a tgt_lang attribute
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("hi"))
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(translation)
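Because M2M100 is many-to-many, any of the three directions works the same way. Continuing from the snippet above, a quick Hindi → Greek example (the Hindi sentence is just an illustration):

# Hindi -> Greek with the same model and tokenizer
tokenizer.src_lang = "hi"
inputs = tokenizer("नमस्ते, आप कैसे हैं?", return_tensors="pt")
outputs = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("el"))
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))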

Training Details

Training Data

The model was fine-tuned on a custom dataset containing parallel translations between English, Hindi, and Greek.

Training Procedure

Preprocessing

The dataset was preprocessed to:

  • Normalize text.
  • Tokenize using the M2M100 tokenizer (see the sketch below).
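A minimal sketch of those two steps, assuming Unicode NFC normalization and whitespace cleanup (the exact normalization rules are not documented here):

import unicodedata

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/m2m100_418M")

def normalize(text: str) -> str:
    # Assumed normalization: Unicode NFC plus whitespace collapsing
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.split())

tokenizer.src_lang = "en"
tokenizer.tgt_lang = "hi"
example = tokenizer(
    normalize("Hello,   how are you?"),
    text_target=normalize("नमस्ते, आप कैसे हैं?"),
    truncation=True,
)
print(example["input_ids"])
print(example["labels"])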

Training Hyperparameters

  • Epochs: 10
  • Batch size: 16
  • Learning rate: 5e-5
  • Mixed Precision: Disabled (FP32 used; see the configuration sketch below)
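Expressed as Hugging Face Seq2SeqTrainingArguments, the listed settings map roughly as follows (a sketch; the output directory is an assumption, and all other arguments keep their defaults):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="ft_translate_en_el_hi",  # assumed output directory
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    fp16=False,  # mixed precision disabled; FP32 throughout
)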

Speeds, Sizes, Times

  • Training runtime: 20.3 hours
  • Training samples per second: 17.508
  • Training steps per second: 0.137
  • Final training loss: 0.873

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a held-out test set from the same domains as the training data.

Metrics

  • BLEU score (to be computed during final evaluation; see the example below).
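Once predictions are available, BLEU can be computed with the evaluate library's sacrebleu wrapper (a sketch with toy strings, not the actual evaluation data):

import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["नमस्ते, आप कैसे हैं?"]    # model outputs
references = [["नमस्ते, आप कैसे हैं?"]]   # one list of references per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])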

Results

  • Training Loss: 0.873
  • Detailed BLEU score results will be provided in subsequent updates.

Environmental Impact

  • Hardware Type: Apple MacBook Pro (M3 Pro)
  • Hours used: 20.3 hours
  • Cloud Provider: None (trained on local hardware)
  • Carbon Emitted: Not formally estimated; training ran for 20.3 hours on a single local machine

Technical Specifications

Model Architecture and Objective

The model is based on the M2M100 architecture, a transformer-based encoder-decoder model designed for multilingual translation without relying on English as an intermediary language.
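The dimensions of the base checkpoint can be inspected directly from its configuration; a small sketch using the public facebook/m2m100_418M config:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/m2m100_418M")
# Print the encoder-decoder dimensions of the base checkpoint
print(config.model_type, config.encoder_layers, config.decoder_layers, config.d_model)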

Compute Infrastructure

Hardware

  • Device: Apple MacBook Pro (M3 Pro)

Software

  • Transformers library from Hugging Face
  • Python 3.12

Citation

If you use this model, please cite it as:

APA: Aktheroy (2025). Fine-Tuned M2M100 Translation Model. Hugging Face. Retrieved from https://huggingface.co/aktheroy/FT_Translate_en_el_hi

Model Card Authors

  • Aktheroy

Model Card Contact

For questions or feedback, contact the author via Hugging Face.
