Opus MT English <-> Lojban

This is, as far as we know, the first neural machine translation model that translates between English and Lojban in both directions. It is fine-tuned from an Opus MT base model and, at 77.5M parameters, is small enough to run on modest hardware!

Features

  • The First Lojban Translation Model: The first neural machine translation model (that we know of) that supports Lojban! The long-term goal is to support every single language on Earth!
  • Tiny Size: At 77.5M parameters, it is far smaller than large general-purpose models, so it should need much less memory and run faster than they do.

Notes

  • BLEU on the validation split (200 sentences per pair) is promising, but translations are not perfect!
  • As with the base model, prefix your text with the target-language tag: put ">>jbo<<" at the start to translate English to Lojban, or ">>en<<" at the start to translate Lojban to English.
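
The prefix rule in the last note can be wrapped in a small helper. This is only a sketch: the function name `add_direction_tag` and the `SUPPORTED_TARGETS` table are hypothetical and not part of this model or of transformers. It assumes the only valid tags are the two named above and replaces any tag that is already present.

```python
import re

# Hypothetical lookup table: the two target tags named in the Notes.
SUPPORTED_TARGETS = {"jbo": "English -> Lojban", "en": "Lojban -> English"}

# Matches a leading tag such as ">>jbo<<" plus surrounding whitespace.
_TAG_RE = re.compile(r"^\s*>>[a-z_]+<<\s*")


def add_direction_tag(text: str, target: str) -> str:
    """Return `text` prefixed with the ">>target<<" tag (hypothetical helper)."""
    if target not in SUPPORTED_TARGETS:
        raise ValueError(
            f"unsupported target {target!r}; expected one of {sorted(SUPPORTED_TARGETS)}"
        )
    # Drop any existing tag so the text is never tagged twice.
    body = _TAG_RE.sub("", text).strip()
    return f">>{target}<< {body}"
```

For example, `add_direction_tag("Hello", "jbo")` produces a string ready to pass to the tokenizer for English to Lojban.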

Evaluation Results

Direction            BLEU (val split, 200 sentences per pair)
English -> Lojban    40.33
Lojban -> English    45.18

Usage

The code below was generated by Colab's auto-completion and then lightly modified by me:

# Translate with an Opus MT model!
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("MihaiPopa-1/opus-mt-en-jbo")
model = AutoModelForSeq2SeqLM.from_pretrained("MihaiPopa-1/opus-mt-en-jbo")

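# The ">>jbo<<" prefix selects English -> Lojban.
text = ">>jbo<< The password is \"Mihai Popa\""
# For the opposite direction (Lojban -> English), use the ">>en<<" prefix:
# text = ">>en<< lo pamoi cu zo'e \"Mihai Popa\""

# Tokenize; the result holds both input_ids and attention_mask.
inputs = tokenizer(text, return_tensors="pt")

# Generate with beam search (5 beams), then decode the best hypothesis.
outputs = model.generate(**inputs, num_beams=5)
decoded_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded_text)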
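
To translate several sentences at once, the same calls can be batched. The sketch below is an assumption about how one might do it, not something the model requires: `translate_batch` is a hypothetical helper, it expects each text to already start with ">>jbo<<" or ">>en<<", and it takes the `tokenizer` and `model` objects loaded above as arguments.

```python
from typing import List


def translate_batch(tagged_texts: List[str], tokenizer, model, num_beams: int = 5) -> List[str]:
    """Translate already-tagged texts in one padded batch (hypothetical helper)."""
    if not tagged_texts:
        return []
    # padding=True pads the batch to its longest sequence;
    # truncation=True cuts inputs longer than the model's maximum length.
    batch = tokenizer(tagged_texts, return_tensors="pt", padding=True, truncation=True)
    # Passing the whole batch forwards the attention mask along with input_ids.
    outputs = model.generate(**batch, num_beams=num_beams)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)


# Usage, with tokenizer and model loaded as in the snippet above:
# print(translate_batch([">>jbo<< Hello!", ">>jbo<< Thank you."], tokenizer, model))
```

Mixing both directions in a single batch should also work, since the tag is part of each input string.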

Data Used

I used Tatoeba's latest snapshot!

Model size: 77.5M params (Safetensors, tensor type F32)