language:
- en
- pl
- multilingual
license: apache-2.0
tags:
- translation
OPUS Tatoeba English-Polish
Update: The model is currently not functional. Please refer to the original checkpoint in the Tatoeba repository for a working version.
This model was obtained by running the script convert_marian_to_pytorch.py with the flag -m eng-pol. The original models were trained by Jörg Tiedemann using the MarianNMT library. See all available MarianMTModel models on the profile of the Helsinki NLP group.
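For orientation, converted Marian checkpoints are usually loaded through the MarianMTModel and MarianTokenizer classes of the transformers library. The sketch below assumes a working English-Polish checkpoint is available; the helper names `load_marian` and `translate` are hypothetical, and no checkpoint identifier is assumed. Given the update note above, this particular conversion may not produce usable output.

```python
from typing import List


def load_marian(checkpoint: str):
    """Load tokenizer and model from a Hub id or local directory.

    `checkpoint` is supplied by the caller; this sketch does not assume
    any specific identifier for the English-Polish model.
    """
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(checkpoint)
    model = MarianMTModel.from_pretrained(checkpoint)
    return tokenizer, model


def translate(texts: List[str], tokenizer, model) -> List[str]:
    """Translate a batch of source sentences with an already loaded model."""
    # Padding lets sentences of different lengths share one batch tensor.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    # Drop padding and end-of-sentence markers from the decoded text.
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With a working checkpoint, one would call `tokenizer, model = load_marian(path)` and then `translate(["How are you?"], tokenizer, model)`. Keeping loading separate from translation means the model is loaded once and reused across batches.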
source language name: English
target language name: Polish
OPUS readme: README.md
model: transformer
source language code: en
target language code: pl
dataset: opus
release date: 2021-02-19
pre-processing: normalization + SentencePiece (spm32k,spm32k)
download original weights: opus-2021-02-19.zip
Training data:
- eng-pol: Tatoeba-train (59742979)
Validation data:
- eng-pol: Tatoeba-dev, 44146
- total-size-shuffled: 44145
- devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
Test data:
- Tatoeba-test.eng-pol: 10000/64925
test set translations file: test.txt
test set scores file: eval.txt
BLEU-scores:
- Tatoeba-test.eng-pol: 47.5
chr-F-scores:
- Tatoeba-test.eng-pol: 0.673
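As background on the chr-F figure, chrF is a character n-gram F-score on a 0 to 1 scale. The sketch below is a simplified sentence-level version, not the scorer that produced eval.txt (which this card does not name). It assumes n-gram orders 1 to 6, beta = 2 (recall weighted more than precision), whitespace ignored, and precision and recall averaged over orders before combining; the function names are illustrative.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams, ignoring whitespace (a simplifying assumption)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level chrF between one hypothesis and one reference."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # text too short to contain n-grams of this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 gives recall more weight than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

A corpus-level score such as the one reported here aggregates over the whole test set, so per-sentence values from this sketch are not directly comparable to it.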
System Info:
- hf_name: eng-pol
- source_languages: en
- target_languages: pl
- opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/opus-2021-02-19.zip/README.md
- original_repo: Tatoeba-Challenge
- tags: ['translation']
- languages: ['en', 'pl']
- src_constituents: ['eng']
- tgt_constituents: ['pol']
- src_multilingual: False
- tgt_multilingual: False
- helsinki_git_sha: 70b0a9621f054ef1d8ea81f7d55595d7f64d19ff
- transformers_git_sha: 7c6cd0ac28f1b760ccb4d6e4761f13185d05d90b
- port_machine: databox
- port_time: 2021-10-18-15:11