Opus MT English <-> Lojban

This is, as far as we know, the first neural machine translation model that can translate from English to Lojban (and vice-versa!). It is fine-tuned from the Helsinki-NLP/opus-mt-en-ROMANCE Opus MT model and is small enough to run translation tasks on a wide range of devices.

Features

The First Lojban Translation Model: The first neural machine translation model (that we know of) that supports Lojban! The long-term goal is to support every single language on Earth.
Tiny Size: At 77.5M parameters, it should need far less memory and compute than large general-purpose models, which keeps it fast on modest hardware.

Notes

BLEU on the validation split (200 sentences per pair) is good, but not perfect!
As with the base model, put ">>jbo<<" at the start of your text to translate English to Lojban. To translate in the opposite direction, put ">>en<<" at the start instead.
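
The tagging rule above can be sketched as a small helper. This is only an illustration: the function name `with_target_tag` and the `TARGET_TAGS` table are hypothetical and not part of the model or of `transformers`; only the two tags themselves come from the note above.

```python
# Target-language tags as described in the note above.
# TARGET_TAGS and with_target_tag are hypothetical names used for illustration.
TARGET_TAGS = {"jbo": ">>jbo<<", "en": ">>en<<"}


def with_target_tag(text: str, target: str) -> str:
    """Return `text` prefixed with the tag of the language to translate into."""
    if target not in TARGET_TAGS:
        raise ValueError(f"unsupported target {target!r}; use 'jbo' or 'en'")
    # Strip stray whitespace so the result is always "<tag> <text>".
    return f"{TARGET_TAGS[target]} {text.strip()}"


print(with_target_tag('The password is "Mihai Popa"', "jbo"))
# prints: >>jbo<< The password is "Mihai Popa"
```

The returned string can be passed to the tokenizer unchanged.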

Evaluation Results

Direction	BLEU (validation split, 200 sentences per pair)
English -> Lojban	40.33
Lojban -> English	45.18

Usage

The code below was generated by Colab's auto-completion, with a few small modifications of my own:

# Translate with an Opus MT model!
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("MihaiPopa-1/opus-mt-en-jbo")
model = AutoModelForSeq2SeqLM.from_pretrained("MihaiPopa-1/opus-mt-en-jbo")

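# Prefix the text with the target-language tag (>>jbo<< for English -> Lojban)
text = ">>jbo<< The password is \"Mihai Popa\""
# For the opposite direction (Lojban -> English), use the >>en<< tag instead:
# text = ">>en<< lo pamoi cu zo'e \"Mihai Popa\""

# Tokenize the input into a batch containing one sequence of token IDs
input_ids = tokenizer.encode(text, return_tensors="pt")

# Generate the translation with beam search (5 beams)
outputs = model.generate(
    input_ids,
    num_beams=5
)
# Convert the generated token IDs back into text
decoded_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded_text)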
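
For more than one sentence, the same steps can be run in padded batches. The following is a minimal sketch, assuming the `tokenizer` and `model` objects loaded in the snippet above and inputs that already carry their ">>jbo<<" or ">>en<<" tag. The helper name `translate_batch` and its `batch_size` default are hypothetical and not something the model card specifies.

```python
from typing import List


def translate_batch(texts: List[str], tokenizer, model,
                    batch_size: int = 8, num_beams: int = 5) -> List[str]:
    """Translate already-tagged texts in chunks of `batch_size` (illustrative helper)."""
    results: List[str] = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        # padding=True pads every sequence in the chunk to the same length
        batch = tokenizer(chunk, return_tensors="pt", padding=True)
        # Beam search over the whole chunk at once
        outputs = model.generate(**batch, num_beams=num_beams)
        # Decode every generated sequence back into text
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results


# Example usage, with tokenizer and model loaded as in the snippet above:
# print(translate_batch([">>jbo<< Hello!", ">>jbo<< Thank you."], tokenizer, model))
```

Passing the tokenizer and model in as arguments keeps the helper independent of how they were loaded.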

Data Used

I used Tatoeba's latest snapshot at the time of training!

Safetensors

Model size: 77.5M params
Tensor type: F32

Model tree for MihaiPopa-1/opus-mt-en-jbo

Base model: Helsinki-NLP/opus-mt-en-ROMANCE
Finetuned: this model