M2M100-1.2B for Ancient Greek to Modern Greek (QLoRA)
This model is a fine-tuned version of facebook/m2m100_1.2B for translating Ancient Greek to Modern Greek.
It was fine-tuned using QLoRA (4-bit Quantization + LoRA) on the sentence-level AG-MG Parallel Corpus.
The tokenizer has been expanded with 122 polytonic Ancient Greek characters that were missing from the original M2M100 vocabulary and are needed to handle the source text correctly.
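To illustrate what a vocabulary gap of this kind looks like, here is a minimal sketch that lists polytonic characters absent from a vocabulary. It assumes the candidates come from the Unicode "Greek Extended" block (U+1F00 to U+1FFF) and uses a hypothetical toy vocabulary; the model card does not say how the 122 added characters were selected, so this is not the authors' procedure.

```python
import unicodedata


def polytonic_greek_chars():
    """Return the assigned characters of the Unicode 'Greek Extended' block."""
    chars = []
    for cp in range(0x1F00, 0x2000):
        ch = chr(cp)
        # Unassigned code points have no Unicode name, so skip them.
        if unicodedata.name(ch, ""):
            chars.append(ch)
    return chars


def missing_from_vocab(candidates, vocab_entries):
    """Return the candidates that never occur inside any vocabulary entry."""
    covered = set("".join(vocab_entries))
    return [ch for ch in candidates if ch not in covered]


# Hypothetical toy vocabulary for illustration only. With a real tokenizer,
# the entries could be taken from the keys of tokenizer.get_vocab().
toy_vocab = ["\u2581\u03ba\u03b1\u03b9", "\u03ac", "\u1fc6"]  # includes U+1FC6 (eta with perispomeni)

missing = missing_from_vocab(polytonic_greek_chars(), toy_vocab)
print(f"{len(missing)} polytonic characters are not covered by the toy vocabulary")
```

With a real tokenizer, characters found this way could be registered through `tokenizer.add_tokens(...)`, followed by `model.resize_token_embeddings(len(tokenizer))`. Whether the authors took exactly this route is not stated here.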
This model was trained by Spyridon Mavromatis at the Institute for Language and Speech Processing (ILSP), "Athena" RC, and the National and Kapodistrian University of Athens (NKUA) as part of an M.Sc. thesis.
Model Details
Base Model: facebook/m2m100_1.2B
Method: QLoRA (Rank=16, Alpha=32, 4-bit NF4)
Vocabulary: Expanded with 122 Polytonic Greek characters.
Training Data: ~130k sentence pairs from the AG-MG Corpus.
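As a rough illustration of what Rank=16 and Alpha=32 imply, the sketch below works through the standard LoRA arithmetic: the update is scaled by alpha / rank, and an adapted weight of shape d_out x d_in gains rank * (d_in + d_out) trainable parameters. The hidden size of 1024 is an assumption about the base model, and the card does not state which modules were adapted, so the figures cover a single hypothetical square projection.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32  # values listed in Model Details
D_MODEL = 1024        # assumption: hidden size of the base model

print(lora_scaling(RANK, ALPHA))                 # 2.0
print(lora_param_count(D_MODEL, D_MODEL, RANK))  # 32768 adapter parameters
print(D_MODEL * D_MODEL)                         # 1048576 frozen weights in the same projection
```

Under these assumptions, each adapted projection trains about 3% as many parameters as the frozen matrix it modifies.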
Usage
Load the base model, resize its embeddings to match the expanded tokenizer, and then load the PEFT adapter. Loading the base model in 4-bit requires bitsandbytes to be installed.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, BitsAndBytesConfig
from peft import PeftModel
# 1. Configuration
adapter_repo = "ilsp/m2m100-1.2B-ag-mg-qlora"
base_model_id = "facebook/m2m100_1.2B"
# 2. Load Tokenizer (from adapter repo for added tokens)
tokenizer = AutoTokenizer.from_pretrained(adapter_repo, src_lang="el") # We treat AG as 'el' script-wise
# 3. Load Base Model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
# 4. Resize Embeddings (Critical)
model.resize_token_embeddings(len(tokenizer))
# 5. Load Adapter
model = PeftModel.from_pretrained(model, adapter_repo)
model.eval()
# 6. Inference
text = "Ὦ ξεῖν', ἀγγέλλειν Λακεδαιμονίοις ὅτι τῇδε κείμεθα."
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Force target language to Modern Greek ('el')
forced_bos_token_id = tokenizer.get_lang_id("el")
translated_tokens = model.generate(
    **inputs,
    forced_bos_token_id=forced_bos_token_id,
    max_length=128,
)
print(tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0])
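
The single-sentence call above could be extended to several sentences at once. The following is a minimal sketch, assuming the `model` and `tokenizer` objects loaded as in the steps above; `batched` and `translate_batch` are hypothetical helper names, and the batch size of 8 is an arbitrary choice rather than a recommended setting.

```python
from typing import Iterator, List


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_batch(model, tokenizer, sentences: List[str],
                    batch_size: int = 8, max_length: int = 128) -> List[str]:
    """Translate sentences in padded batches, forcing Modern Greek ('el') as target."""
    forced_bos_token_id = tokenizer.get_lang_id("el")
    translations: List[str] = []
    for batch in batched(sentences, batch_size):
        # Pad to the longest sentence in the batch; truncate overly long inputs.
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        ).to(model.device)
        generated = model.generate(
            **inputs,
            forced_bos_token_id=forced_bos_token_id,
            max_length=max_length,
        )
        translations.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return translations
```

A call such as `translate_batch(model, tokenizer, list_of_sentences)` would return one translation per input sentence, in the same order.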
Performance
Main Test Set Results
Evaluated on the test set of 2,000 sentence pairs (Attic and Koine Hellenistic dialects).
| Model | Method | BLEU ↑ | chrF++ ↑ | TER ↓ | BERTScore F1 ↑ | COMET ↑ | ΔBLEU |
|---|---|---|---|---|---|---|---|
| NLLB-600M | Base | 1.55 | 16.86 | 106.80 | 0.880 | 0.539 | - |
| | LoRA | 7.43 | 29.31 | 88.32 | 0.903 | 0.667 | +5.88 |
| NLLB-1.3B | Base | 2.15 | 17.78 | 106.41 | 0.885 | 0.573 | - |
| | LoRA | 8.01 | 30.02 | 87.74 | 0.905 | 0.687 | +5.86 |
| M2M100-1.2B | Base | 0.62 | 10.70 | 100.50 | 0.858 | 0.475 | - |
| 👉 | QLoRA | 10.96 | 33.09 | 82.99 | 0.911 | 0.710 | +10.34 |
| | Full FT | 9.60 | 31.16 | 83.43 | 0.908 | 0.692 | +8.98 |
| Krikri-8B-Instruct | Base | 8.29 | 29.87 | 88.13 | 0.895 | 0.695 | - |
| | QLoRA | 11.90 | 34.07 | 84.16 | 0.906 | 0.713 | +3.60 |
| | Full FT | 13.16 | 34.71 | 83.68 | 0.848 | 0.702 | +4.45 |
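
The ΔBLEU column can be read as the gain over the same model's Base row. The short sketch below recomputes it for the M2M100-1.2B rows of the table above; treating ΔBLEU as a plain difference of the rounded scores is an assumption that happens to match these rows, not a documented definition.

```python
def delta_bleu(finetuned: float, base: float) -> float:
    """BLEU gain of a fine-tuned model over its base model, rounded to 2 decimals."""
    return round(finetuned - base, 2)


# BLEU scores of M2M100-1.2B on the main test set, taken from the table above.
base_bleu, qlora_bleu, full_ft_bleu = 0.62, 10.96, 9.60

print(delta_bleu(qlora_bleu, base_bleu))    # 10.34
print(delta_bleu(full_ft_bleu, base_bleu))  # 8.98
```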
Stress Set Results (Rare Dialects)
Evaluated on the stress set of 250 sentence pairs (Ionic, Doric, and Homeric dialects).
| Model | Method | BLEU ↑ | chrF++ ↑ | TER ↓ | BERTScore F1 ↑ | COMET ↑ | ΔBLEU |
|---|---|---|---|---|---|---|---|
| NLLB-600M | Base | 0.77 | 14.40 | 118.13 | 0.866 | 0.484 | - |
| | LoRA | 5.65 | 28.74 | 88.01 | 0.900 | 0.638 | +4.89 |
| NLLB-1.3B | Base | 1.25 | 16.15 | 107.03 | 0.873 | 0.525 | - |
| | LoRA | 5.68 | 28.94 | 88.24 | 0.900 | 0.656 | +4.43 |
| M2M100-1.2B | Base | 0.07 | 9.37 | 100.34 | 0.840 | 0.427 | - |
| 👉 | QLoRA | 9.52 | 33.30 | 81.95 | 0.911 | 0.691 | +9.45 |
| | Full FT | 8.16 | 31.12 | 83.11 | 0.907 | 0.664 | +8.09 |
| Krikri-8B-Instruct | Base | 6.55 | 28.98 | 87.38 | 0.900 | 0.675 | - |
| | QLoRA | 10.37 | 34.09 | 82.28 | 0.911 | 0.717 | +3.82 |
| | Full FT | 12.80 | 35.90 | 81.40 | 0.884 | 0.716 | +6.11 |
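
One way to read the two tables together is to look at how this model's scores shift from the main test set to the stress set. The sketch below does that for the M2M100-1.2B QLoRA rows, using only figures reported above; `metric_shift` is a hypothetical helper, and the comparison is illustrative since the two sets differ in size and dialect mix.

```python
# M2M100-1.2B QLoRA scores copied from the two tables above.
main_set = {"BLEU": 10.96, "chrF++": 33.09, "TER": 82.99, "COMET": 0.710}
stress_set = {"BLEU": 9.52, "chrF++": 33.30, "TER": 81.95, "COMET": 0.691}


def metric_shift(main: dict, stress: dict) -> dict:
    """Return stress-set score minus main-set score for each metric."""
    return {name: round(stress[name] - main[name], 3) for name in main}


shift = metric_shift(main_set, stress_set)
for name, value in shift.items():
    # For TER lower is better, so a negative shift there is an improvement.
    print(f"{name}: {value:+.3f}")
```

Under this reading, BLEU and COMET drop on the rare dialects while chrF++ and TER are slightly better.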
Citation
If you use this model, please cite our LREC 2026 paper:
Mavromatis, S., Sofianopoulos, S., Prokopidis, P., & Giagkou, M. (2026). Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models. In Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026) (pp. 8685–8698). European Language Resources Association (ELRA). https://doi.org/10.63317/4cdk64dgm2w9
@inproceedings{mavromatis-etal-2026-ancient,
title = {Ancient Greek to Modern Greek Machine Translation: A Novel Benchmark and Fine-Tuning Experiments on LLMs and NMT Models},
author = {Mavromatis, Spyridon and Sofianopoulos, Sokratis and Prokopidis, Prokopis and Giagkou, Maria},
booktitle = {Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026)},
month = {May},
year = {2026},
pages = {8685--8698},
address = {Palma, Mallorca, Spain},
publisher = {European Language Resources Association (ELRA)},
editor = {Piperidis, Stelios and Bel, Núria and van den Heuvel, Henk and Ide, Nancy and Krek, Simon and Toral, Antonio},
doi = {10.63317/4cdk64dgm2w9}
}
Note on resources: The fine-tuned models are publicly released. The accompanying AG-MG Parallel Corpus is not publicly distributed due to the complex and uncertain copyright status of the source materials.