Back to all models
translation mask_token:
Query this model
🔥 This model is currently loaded and running on the Inference API. ⚠️ This model could not be loaded by the inference API. ⚠️ This model can be loaded on the Inference API on-demand.
JSON Output
API endpoint  

⚡️ Upgrade your account to access the Inference API

Share Copied link to clipboard

Monthly model downloads

Helsinki-NLP/opus-mt-trk-en Helsinki-NLP/opus-mt-trk-en
N/a downloads
last 30 days

pytorch

tf

Contributed by

Language Technology Research Group at the University of Helsinki university
1 team member · 1325 models

How to use this model directly from the 🤗/transformers library:

			
Copy to clipboard
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-trk-en") model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-trk-en")
Uploaded in S3

trk-eng

  • source group: Turkic languages

  • target group: English

  • OPUS readme: trk-eng

  • model: transformer

  • source language(s): aze_Latn bak chv crh crh_Latn kaz_Cyrl kaz_Latn kir_Cyrl kjh kum ota_Arab ota_Latn sah tat tat_Arab tat_Latn tuk tuk_Latn tur tyv uig_Arab uig_Cyrl uzb_Cyrl uzb_Latn

  • target language(s): eng

  • model: transformer

  • pre-processing: normalization + SentencePiece (spm32k,spm32k)

  • download original weights: opus2m-2020-08-01.zip

  • test set translations: opus2m-2020-08-01.test.txt

  • test set scores: opus2m-2020-08-01.eval.txt

Benchmarks

testset BLEU chr-F
newsdev2016-entr-tureng.tur.eng 5.0 0.242
newstest2016-entr-tureng.tur.eng 3.7 0.231
newstest2017-entr-tureng.tur.eng 3.7 0.229
newstest2018-entr-tureng.tur.eng 4.1 0.230
Tatoeba-test.aze-eng.aze.eng 15.1 0.330
Tatoeba-test.bak-eng.bak.eng 3.3 0.185
Tatoeba-test.chv-eng.chv.eng 1.3 0.161
Tatoeba-test.crh-eng.crh.eng 10.8 0.325
Tatoeba-test.kaz-eng.kaz.eng 9.6 0.264
Tatoeba-test.kir-eng.kir.eng 15.3 0.328
Tatoeba-test.kjh-eng.kjh.eng 1.8 0.121
Tatoeba-test.kum-eng.kum.eng 16.1 0.277
Tatoeba-test.multi.eng 12.0 0.304
Tatoeba-test.ota-eng.ota.eng 2.0 0.149
Tatoeba-test.sah-eng.sah.eng 0.7 0.140
Tatoeba-test.tat-eng.tat.eng 4.0 0.215
Tatoeba-test.tuk-eng.tuk.eng 5.5 0.243
Tatoeba-test.tur-eng.tur.eng 26.8 0.443
Tatoeba-test.tyv-eng.tyv.eng 1.3 0.111
Tatoeba-test.uig-eng.uig.eng 0.2 0.111
Tatoeba-test.uzb-eng.uzb.eng 4.6 0.195

System Info:

  • hf_name: trk-eng

  • source_languages: trk

  • target_languages: eng

  • opus_readme_url: https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/trk-eng/README.md

  • original_repo: Tatoeba-Challenge

  • tags: ['translation']

  • languages: ['tt', 'cv', 'tk', 'tr', 'ba', 'trk', 'en']

  • src_constituents: {'kir_Cyrl', 'tat_Latn', 'tat', 'chv', 'uzb_Cyrl', 'kaz_Latn', 'aze_Latn', 'crh', 'kjh', 'uzb_Latn', 'ota_Arab', 'tuk_Latn', 'tuk', 'tat_Arab', 'sah', 'tyv', 'tur', 'uig_Arab', 'crh_Latn', 'kaz_Cyrl', 'uig_Cyrl', 'kum', 'ota_Latn', 'bak'}

  • tgt_constituents: {'eng'}

  • src_multilingual: True

  • tgt_multilingual: False

  • prepro: normalization + SentencePiece (spm32k,spm32k)

  • url_model: https://object.pouta.csc.fi/Tatoeba-MT-models/trk-eng/opus2m-2020-08-01.zip

  • url_test_set: https://object.pouta.csc.fi/Tatoeba-MT-models/trk-eng/opus2m-2020-08-01.test.txt

  • src_alpha3: trk

  • tgt_alpha3: eng

  • short_pair: trk-en

  • chrF2_score: 0.304

  • bleu: 12.0

  • brevity_penalty: 1.0

  • ref_len: 18733.0

  • src_name: Turkic languages

  • tgt_name: English

  • train_date: 2020-08-01

  • src_alpha2: trk

  • tgt_alpha2: en

  • prefer_old: False

  • long_pair: trk-eng

  • helsinki_git_sha: 480fcbe0ee1bf4774bcbe6226ad9f58e63f6c535

  • transformers_git_sha: 2207e5d8cb224e954a7cba69fa4ac2309e9ff30b

  • port_machine: brutasse

  • port_time: 2020-08-21-14:41