---
language:
- en
- pl
- multilingual
license: apache-2.0
tags:
- translation
---

### OPUS Tatoeba English-Polish

*This model was obtained by running the script [convert_marian_to_pytorch.py](https://github.com/huggingface/transformers/blob/master/src/transformers/models/marian/convert_marian_to_pytorch.py) with the flag `-m eng-pol`. The original models were trained by [Jörg Tiedemann](https://blogs.helsinki.fi/tiedeman/) using the [MarianNMT](https://marian-nmt.github.io/) library. See all available `MarianMTModel` models on the profile of the [Helsinki NLP](https://huggingface.co/Helsinki-NLP) group.*

* source language name: English
* target language name: Polish
* OPUS readme: [README.md](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/README.md)

* model: transformer
* source language code: en
* target language code: pl
* dataset: opus 
* release date: 2021-02-19
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* download original weights: [opus-2021-02-19.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/opus-2021-02-19.zip/eng-pol/opus-2021-02-19.zip)
* Training data: 
  * eng-pol: Tatoeba-train (59742979)
* Validation data: 
  * eng-pol: Tatoeba-dev, 44146
  * total-size-shuffled: 44145
  * devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled
* Test data: 
  * Tatoeba-test.eng-pol: 10000/64925
* test set translations file: [test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/opus-2021-02-19.zip/eng-pol/opus-2021-02-19.test.txt)
* test set scores file: [eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/opus-2021-02-19.zip/eng-pol/opus-2021-02-19.eval.txt)
* BLEU-scores

|Test set|score|
|---|---|
|Tatoeba-test.eng-pol|47.5|

* chr-F-scores

|Test set|score|
|---|---|
|Tatoeba-test.eng-pol|0.673|

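The card does not show how to run the converted checkpoint, so here is a minimal sketch of English-to-Polish translation with the `MarianMTModel` and `MarianTokenizer` classes from `transformers`. The repository ID in `MODEL_ID` is a placeholder assumption, since this card does not state where the weights are hosted; the helper names `chunk` and `translate` and the default batch size are likewise illustrative only.

```python
from typing import List

# Assumption: placeholder Hub ID, not taken from this card.
# Replace it with the actual repository that hosts this checkpoint.
MODEL_ID = "your-namespace/opus-tatoeba-eng-pol"


def chunk(sentences: List[str], batch_size: int) -> List[List[str]]:
    """Split a list of sentences into consecutive batches of at most batch_size."""
    return [sentences[i:i + batch_size] for i in range(0, len(sentences), batch_size)]


def translate(sentences: List[str], model_id: str = MODEL_ID, batch_size: int = 8) -> List[str]:
    """Translate English sentences to Polish, one batch at a time."""
    # Imported lazily so that chunk() can be used without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_id)
    model = MarianMTModel.from_pretrained(model_id)

    translations: List[str] = []
    for batch in chunk(sentences, batch_size):
        # Pad to the longest sentence in the batch and return PyTorch tensors.
        encoded = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**encoded)
        translations.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return translations
```

With the placeholder replaced by a real repository ID, `translate(["How are you today?"])` should return a list containing one Polish sentence. The tokenizer applies the SentencePiece segmentation listed above, so no manual pre-processing is expected to be needed.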
### System Info: 
* hf_name: eng-pol
* source_languages: en
* target_languages: pl
* opus_readme_url: https://object.pouta.csc.fi/Tatoeba-MT-models/eng-pol/opus-2021-02-19.zip/README.md
* original_repo: Tatoeba-Challenge
* tags: ['translation']
* languages: ['en', 'pl']
* src_constituents: ['eng']
* tgt_constituents: ['pol']
* src_multilingual: False
* tgt_multilingual: False
* helsinki_git_sha: 70b0a9621f054ef1d8ea81f7d55595d7f64d19ff
* transformers_git_sha: 7c6cd0ac28f1b760ccb4d6e4761f13185d05d90b
* port_machine: databox
* port_time: 2021-10-18-15:11