gsarti/opus-mt-tc-en-pl · Hugging Face

OPUS Tatoeba English-Polish

Update: The model is currently not functional. Please refer to the original checkpoint in the Tatoeba repository for a working version.

This model was obtained by running the script convert_marian_to_pytorch.py with the flag -m eng-pol. The original models were trained by Jörg Tiedemann using the MarianNMT library. See all available MarianMTModel models on the profile of the Helsinki NLP group.
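For orientation, a converted Marian checkpoint is typically loaded through the `MarianMTModel` and `MarianTokenizer` classes of the `transformers` library. The sketch below is a minimal illustration, not a tested recipe for this repository: the helper name `translate` and its `model_name` parameter are hypothetical, and it assumes `transformers`, `torch`, and `sentencepiece` are installed.

```python
from typing import List


def translate(texts: List[str], model_name: str) -> List[str]:
    """Translate a batch of sentences with a Marian checkpoint (illustrative helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    # model_name is an assumption supplied by the caller: a Hub id or a local path.
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Tokenize the sentences as one padded batch of PyTorch tensors.
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    generated = model.generate(**batch)

    # Convert generated token ids back to text, dropping padding and EOS tokens.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

A call would look like `translate(["How are you?"], "gsarti/opus-mt-tc-en-pl")`. Given the update note above, that call may fail or return unusable output for this particular checkpoint until the conversion is fixed.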

source language name: English
target language name: Polish
OPUS readme: README.md
model: transformer
source language code: en
target language code: pl
dataset: opus
release date: 2021-02-19
pre-processing: normalization + SentencePiece (spm32k,spm32k)
download original weights: opus-2021-02-19.zip
Training data:
- eng-pol: Tatoeba-train (59742979)
Validation data:
- eng-pol: Tatoeba-dev, 44146
- total-size-shuffled: 44145
- devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
Test data:
- Tatoeba-test.eng-pol: 10000/64925
test set translations file: test.txt
test set scores file: eval.txt
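The validation entry above says the dev set is the top 5000 lines of a shuffled Tatoeba-dev file. A minimal sketch of that kind of selection follows; the function name `select_devset`, the fixed seed, and the in-memory list of sentence pairs are assumptions for illustration, since the actual shuffling procedure of the Tatoeba-Challenge pipeline is not described here.

```python
import random
from typing import List, Sequence, Tuple


def select_devset(
    pairs: Sequence[Tuple[str, str]], size: int = 5000, seed: int = 0
) -> List[Tuple[str, str]]:
    """Shuffle (source, target) pairs and keep the first `size` of them."""
    shuffled = list(pairs)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible order (assumed)
    return shuffled[:size]  # returns fewer pairs if the input is smaller than `size`
```

Shuffling source and target together as pairs keeps the two sides aligned, which a separate shuffle of each file would not.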
BLEU-scores

Test set	score
Tatoeba-test.eng-pol	47.5

chr-F-scores

Test set	score
Tatoeba-test.eng-pol	0.673
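To give a feel for what a chr-F score on a 0 to 1 scale measures, here is a simplified sentence-level sketch of a character n-gram F-score. It is not the evaluation tool that produced the figures above: the maximum n-gram order of 6, the weight `beta = 2`, and the removal of whitespace are assumptions chosen for illustration, and the reported score is a corpus-level figure that this sketch does not reproduce.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing all whitespace (assumed convention)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level character n-gram F-score in the range [0, 1]."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)  # precision averaged over n-gram orders
    r = sum(recalls) / len(recalls)  # recall averaged over n-gram orders
    if p == 0.0 or r == 0.0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta**2) * p * r / (beta**2 * p + r)
```

Because it works on characters rather than words, a score of this kind gives partial credit for a correct stem with a wrong inflection, which matters for a morphologically rich target language such as Polish.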

System Info:

hf_name: eng-pol
source_languages: en
target_languages: pl
opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/opus-2021-02-19.zip/README.md
original_repo: Tatoeba-Challenge
tags: ['translation']
languages: ['en', 'pl']
src_constituents: ['eng']
tgt_constituents: ['pol']
src_multilingual: False
tgt_multilingual: False
helsinki_git_sha: 70b0a9621f054ef1d8ea81f7d55595d7f64d19ff
transformers_git_sha: 7c6cd0ac28f1b760ccb4d6e4761f13185d05d90b
port_machine: databox
port_time: 2021-10-18-15:11