# Bidirectional Runyoro-Rutooro <-> English NMT
Fine-tuned from facebook/nllb-200-distilled-1.3B on
kathay/runyoro-rutooro-en-parallel.
| Metric | Score |
|---|---|
| BLEU | 1.1 |
| chrF++ | 11.0 |
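
For readers unfamiliar with chrF-style metrics, the following is a minimal sketch of a character n-gram F-score. It is an illustration only: the function name `char_ngram_fscore` and its defaults are assumptions, it omits the word n-grams that chrF++ adds, and it is not the tool that produced the scores above, so it will not reproduce them.

```python
from collections import Counter


def char_ngram_fscore(hypothesis: str, reference: str,
                      max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 100] (illustrative only)."""
    # Assumption: whitespace is dropped before extracting character n-grams.
    hyp = "".join(hypothesis.split())
    ref = "".join(reference.split())

    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp_ngrams = Counter(hyp[i:i + n] for i in range(len(hyp) - n + 1))
        ref_ngrams = Counter(ref[i:i + n] for i in range(len(ref) - n + 1))
        if not hyp_ngrams or not ref_ngrams:
            continue  # one side is too short for this n-gram order
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp_ngrams & ref_ngrams).values())
        precisions.append(overlap / sum(hyp_ngrams.values()))
        recalls.append(overlap / sum(ref_ngrams.values()))

    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta with beta=2 weights recall more heavily than precision.
    return 100.0 * (1 + beta**2) * p * r / (beta**2 * p + r)
```

A score near 100 means the hypothesis shares almost all character n-grams with the reference; a score near 0 means almost none are shared.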
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("kathay/runyoro-nmt-v1")
tokenizer = AutoTokenizer.from_pretrained("kathay/runyoro-nmt-v1")

# Runyoro-Rutooro -> English
tokenizer.src_lang = "nyk_Latn"
inputs = tokenizer("Oraire ota?", return_tensors="pt")
# Use eng_Latn BOS (id=256047) for English output
eng_bos = tokenizer.convert_tokens_to_ids("eng_Latn")
out = model.generate(**inputs, forced_bos_token_id=eng_bos, num_beams=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))

# English -> Runyoro-Rutooro
tokenizer.src_lang = "eng_Latn"
inputs = tokenizer("How are you?", return_tensors="pt")
# Use lug_Latn BOS (id=256110), the closest NLLB-supported Ugandan Bantu language.
# nyk_Latn maps to <unk> in the NLLB vocab, so lug_Latn produces Runyoro-like output.
lug_bos = tokenizer.convert_tokens_to_ids("lug_Latn")
out = model.generate(**inputs, forced_bos_token_id=lug_bos, num_beams=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
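
The two directions in the usage snippet above differ only in the source language code and the forced BOS token, so they can be folded into one helper. This is a minimal sketch, not part of the released model: the `DIRECTIONS` keys and the `translate` function are hypothetical names, and the helper expects an already-loaded `model` and `tokenizer`.

```python
# Language-code pairs copied from the usage snippet above.
# Assumption: the keys "run-en" and "en-run" are hypothetical labels.
DIRECTIONS = {
    "run-en": ("nyk_Latn", "eng_Latn"),  # Runyoro-Rutooro -> English
    "en-run": ("eng_Latn", "lug_Latn"),  # English -> Runyoro-Rutooro (lug_Latn BOS)
}


def translate(text, direction, model, tokenizer, num_beams=4):
    """Translate `text` in the given direction with an already-loaded model."""
    if direction not in DIRECTIONS:
        raise ValueError(
            f"unknown direction {direction!r}; expected one of {sorted(DIRECTIONS)}"
        )
    src_lang, bos_lang = DIRECTIONS[direction]
    tokenizer.src_lang = src_lang  # source language code used when tokenizing
    inputs = tokenizer(text, return_tensors="pt")
    bos_id = tokenizer.convert_tokens_to_ids(bos_lang)  # forces the output language
    out = model.generate(**inputs, forced_bos_token_id=bos_id, num_beams=num_beams)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

With the model and tokenizer loaded as shown earlier, `translate("Oraire ota?", "run-en", model, tokenizer)` would cover the first direction and `translate("How are you?", "en-run", model, tokenizer)` the second.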
## Training Details
| Parameter | Value |
|-----------|-------|
| Base model | facebook/nllb-200-distilled-1.3B |
| Epochs | 15 |
| Batch size | 16 × 4 gradient accumulation steps |
| Learning rate | 5e-05 |
| BF16 | True |
| Curriculum | False |
| Hardware | 2x NVIDIA RTX 4090 (DataParallel) |
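
The batch size and gradient accumulation entries in the table combine into an effective batch size, which in turn gives a rough count of optimizer updates. The sketch below works through that arithmetic. It assumes that 16 is the total batch per forward/backward pass across both GPUs (the table does not say whether it is per device), and `optimizer_steps` is a hypothetical helper, not part of the training code.

```python
import math

per_step_batch = 16   # "Batch size" from the table
grad_accum_steps = 4  # gradient accumulation steps from the table
epochs = 15           # "Epochs" from the table

# Assumption: 16 is the total batch per pass, not a per-GPU figure.
effective_batch = per_step_batch * grad_accum_steps


def optimizer_steps(num_examples: int) -> int:
    """Rough estimate of total optimizer updates for `num_examples` training pairs."""
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * epochs


print(effective_batch)  # 64
```

For a hypothetical training set of 4,000 sentence pairs, `optimizer_steps(4000)` gives 63 updates per epoch, or 945 over 15 epochs. If 16 is instead a per-GPU figure, the effective batch doubles and the step count drops accordingly.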