
Query the model through the Inference API:

```shell
curl -X POST \
    -H "Authorization: Bearer YOUR_ORG_OR_USER_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '"json encoded string"' \
    https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-sem-en
```
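The same request can be assembled in Python. This is a sketch: `build_query` is a hypothetical helper, the token is a placeholder, and actually sending the request is left commented out. The endpoint path and the `{"inputs": ...}` payload shape follow the standard Hugging Face Inference API conventions.

```python
import json

# Standard Inference API endpoint pattern for this model
API_URL = "https://api-inference.huggingface.co/models/Helsinki-NLP/opus-mt-sem-en"

def build_query(token, text):
    """Assemble the URL, headers, and JSON payload for the curl call above.

    `token` is your API token; `text` is the source-language sentence.
    """
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    payload = json.dumps({"inputs": text})
    return API_URL, headers, payload

url, headers, payload = build_query("YOUR_ORG_OR_USER_API_TOKEN", "مرحبا بالعالم")
# To actually send it:
# import requests
# response = requests.post(url, headers=headers, data=payload)
```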

Helsinki-NLP/opus-mt-sem-en
Contributed by the Language Technology Research Group at the University of Helsinki.

How to use this model directly from the 🤗/transformers library:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-sem-en")
# AutoModelForSeq2SeqLM is the current class for translation models;
# it replaces the deprecated AutoModelWithLMHead
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-sem-en")
```


  • source group: Semitic languages

  • target group: English

  • OPUS readme: sem-eng

  • model: transformer

  • source language(s): acm afb amh apc ara arq ary arz heb mlt tir

  • target language(s): eng

  • pre-processing: normalization + SentencePiece (spm32k,spm32k)

  • download original weights:

  • test set translations: opus2m-2020-08-01.test.txt

  • test set scores: opus2m-2020-08-01.eval.txt
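The pre-processing step above splits words into subword pieces from a fixed 32k vocabulary. As a rough illustration of what subword segmentation does, here is a toy greedy longest-match segmenter; the vocabulary is hypothetical, and real SentencePiece uses a learned unigram language model rather than greedy matching.

```python
def greedy_subword_segment(word, vocab):
    """Toy subword segmentation: repeatedly take the longest prefix found in
    `vocab`, falling back to single characters. Illustrative only; SentencePiece
    itself segments with a learned unigram language model."""
    pieces = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab or j == i + 1:
                pieces.append(piece)
                i = j
                break
    return pieces

vocab = {"trans", "lation", "norm", "al", "ization"}  # hypothetical pieces
print(greedy_subword_segment("translation", vocab))   # → ['trans', 'lation']
```

Out-of-vocabulary material degrades gracefully: anything not covered by a known piece falls back to single characters, so no word is ever unrepresentable.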


| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.amh-eng.amh.eng | 37.5 | 0.565 |
| Tatoeba-test.ara-eng.ara.eng | 38.9 | 0.566 |
| Tatoeba-test.heb-eng.heb.eng | 44.6 | 0.610 |
| Tatoeba-test.mlt-eng.mlt.eng | 53.7 | 0.688 |
| Tatoeba-test.multi.eng | 41.7 | 0.588 |
| Tatoeba-test.tir-eng.tir.eng | 18.3 | 0.370 |
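The chr-F column above is a character n-gram F-score. A minimal pure-Python sketch of the idea, simplified relative to the official metric (uniform average over n-gram orders 1–6, β=2 as in chrF2, whitespace stripped, no word-level component):

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams with whitespace removed
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF: average F-score over character 1..max_n-grams."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string too short for this n-gram order
        overlap = sum((hyp & ref).values())       # clipped n-gram matches
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            scores.append(0.0)
            continue
        # F-beta: beta > 1 weights recall more heavily than precision
        scores.append((1 + beta**2) * prec * rec / (beta**2 * prec + rec))
    return sum(scores) / len(scores) if scores else 0.0
```

Identical strings score 1.0 and fully disjoint strings score 0.0; because matching happens at the character level, the metric gives partial credit for near-miss word forms, which is why it is popular for morphologically rich languages like those in this model's source group.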

System Info: