maanka2/english-somali-parallel-corpus
This model is a fine-tuned version of the English-to-Somali sequence-to-sequence (Seq2Seq) translation model opus-mt-en-so, trained on parallel English-Somali data. It was optimized in bfloat16 precision and is designed to translate English text into natural, grammatical Somali.
The model was trained on a parallel translation corpus extracted from text files aligned line by line, so each line of the English file corresponds to the same-numbered line of the Somali file.
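To illustrate what line-by-line alignment means in practice, here is a minimal sketch of reading two aligned text files into sentence pairs. The function name `load_parallel_pairs` and the file names in the usage note are hypothetical, and this is not necessarily how the training data for this model was prepared.

```python
def read_lines(path):
    """Return the lines of a UTF-8 text file without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_parallel_pairs(src_path, tgt_path):
    """Pair line i of the source file with line i of the target file.

    Hypothetical helper: assumes both files have the same number of lines.
    Pairs where either side is empty after stripping are skipped.
    """
    src_lines = read_lines(src_path)
    tgt_lines = read_lines(tgt_path)
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"Files are not aligned: {len(src_lines)} source lines "
            f"vs {len(tgt_lines)} target lines"
        )
    pairs = []
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            pairs.append((src, tgt))
    return pairs
```

For example, `load_parallel_pairs("train.en", "train.so")` would return a list of `(english, somali)` tuples, assuming files with those placeholder names exist. The length check fails early, because a single missing line would silently shift every later pair.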
You can easily run inference with this model using the Hugging Face transformers library:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_id = "maanka2/opus-mt-en-so-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Tokenize the English input, truncating anything beyond 128 tokens
text = "Hello, how are you today?"
inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True)

# Generate the Somali translation and decode it back to text
outputs = model.generate(**inputs)
translation = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(translation)
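To translate several sentences at once, the single-sentence example can be extended to batches. The following is a minimal sketch: `translate_batch` is a hypothetical helper, not part of this model or of transformers, and the default `batch_size` of 8 is an arbitrary assumption. It expects a tokenizer and model loaded as in the example above.

```python
def translate_batch(sentences, tokenizer, model, batch_size=8, max_length=128):
    """Translate a list of English sentences, preserving input order.

    Hypothetical helper: `tokenizer` and `model` are the objects returned by
    AutoTokenizer.from_pretrained and AutoModelForSeq2SeqLM.from_pretrained.
    """
    translations = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        # Pad to the longest sentence in the batch so tensors are rectangular
        inputs = tokenizer(
            batch,
            return_tensors="pt",
            padding=True,
            truncation=True,
            max_length=max_length,
        )
        outputs = model.generate(**inputs)
        # batch_decode turns each generated sequence back into a string
        translations.extend(
            tokenizer.batch_decode(outputs, skip_special_tokens=True)
        )
    return translations
```

A call might look like `translate_batch(["Good morning.", "Thank you very much."], tokenizer, model)`. Smaller batches reduce memory use, while larger ones are usually faster on a GPU.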
Base model
Helsinki-NLP/opus-mt-synthetic-en-so