---
language:
- gmw
- gmw
tags:
- translation
license: cc-by-4.0
---

# opus-mt-tc-base-gmw-gmw

Neural machine translation model for translating from West Germanic languages to West Germanic languages.

This model is part of the [OPUS-MT project](https://github.com/Helsinki-NLP/Opus-MT), an effort to make neural machine translation models widely available and accessible for many languages in the world. All models are originally trained using the amazing framework of [Marian NMT](https://marian-nmt.github.io/), an efficient NMT implementation written in pure C++. The models have been converted to PyTorch using the transformers library by Hugging Face. Training data is taken from [OPUS](https://opus.nlpl.eu/) and training pipelines use the procedures of [OPUS-MT-train](https://github.com/Helsinki-NLP/Opus-MT-train).

* Publications: [OPUS-MT – Building open translation services for the World](https://aclanthology.org/2020.eamt-1.61/), [The Tatoeba Translation Challenge – Realistic Data Sets for Low Resource and Multilingual MT](https://aclanthology.org/2020.wmt-1.139/)

## Model info

* release: 2021-02-23
* source language(s): afr deu eng fry gos hrx ltz nds nld pdc yid
* target language(s): afr deu eng fry nds nld
* valid target language labels: >>afr<< >>ang_Latn<< >>deu<< >>eng<< >>fry<< >>ltz<< >>nds<< >>nld<< >>sco<< >>yid<<
* model: transformer
* data: opus
* tokenization: SentencePiece (spm32k,spm32k)
* original model: [opus-2021-02-23.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/gmw-gmw/opus-2021-02-23.zip)

This is a multilingual translation model with multiple target languages. A sentence-initial language token is required in the form of `>>id<<` (id = valid target language ID), e.g. `>>afr<<`.

## Usage

You can use OPUS-MT models with the transformers pipelines, for example:

```python
from transformers import pipeline

pipe = pipeline("translation", model="Helsinki-NLP/opus-mt-tc-base-gmw-gmw")
print(pipe(">>afr<< Replace this with text in an accepted source language."))
```

A lower-level example that loads the tokenizer and model directly is sketched at the end of this card.

## Benchmarks

| langpair | testset | BLEU | chr-F | #sent | #words | BP |
|----------|---------|-------|-------|-------|--------|----|
| afr-deu | Tatoeba-test | 48.5 | 0.677 | 1583 | 9105 | 1.000 |
| afr-eng | Tatoeba-test | 58.7 | 0.727 | 1374 | 9622 | 0.995 |
| afr-nld | Tatoeba-test | 54.7 | 0.713 | 1056 | 6710 | 0.989 |
| deu-afr | Tatoeba-test | 52.4 | 0.697 | 1583 | 9507 | 1.000 |
| deu-eng | newssyscomb2009 | 25.4 | 0.527 | 502 | 11821 | 0.986 |
| deu-eng | news-test2008 | 23.9 | 0.519 | 2051 | 49380 | 0.992 |
| deu-eng | newstest2009 | 23.5 | 0.517 | 2525 | 65402 | 0.978 |
| deu-eng | newstest2010 | 26.1 | 0.548 | 2489 | 61724 | 1.000 |
| deu-eng | newstest2011 | 23.9 | 0.525 | 3003 | 74681 | 1.000 |
| deu-eng | newstest2012 | 25.0 | 0.533 | 3003 | 72812 | 1.000 |
| deu-eng | newstest2013 | 27.7 | 0.549 | 3000 | 64505 | 1.000 |
| deu-eng | newstest2014-deen | 27.4 | 0.549 | 3003 | 67337 | 0.977 |
| deu-eng | newstest2015-ende | 28.8 | 0.554 | 2169 | 46443 | 0.973 |
| deu-eng | newstest2016-ende | 33.7 | 0.598 | 2999 | 64126 | 1.000 |
| deu-eng | newstest2017-ende | 29.6 | 0.562 | 3004 | 64399 | 0.979 |
| deu-eng | newstest2018-ende | 36.3 | 0.611 | 2998 | 67013 | 0.977 |
| deu-eng | newstest2019-deen | 32.7 | 0.585 | 2000 | 39282 | 0.984 |
| deu-eng | Tatoeba-test | 44.7 | 0.629 | 10000 | 81233 | 0.975 |
| deu-nds | Tatoeba-test | 18.7 | 0.444 | 10000 | 76144 | 0.988 |
| deu-nld | Tatoeba-test | 48.7 | 0.672 | 10000 | 73546 | 0.969 |
| eng-afr | Tatoeba-test | 56.5 | 0.735 | 1374 | 10317 | 0.984 |
| eng-deu | newssyscomb2009 | 19.4 | 0.503 | 502 | 11271 | 0.991 |
| eng-deu | news-test2008 | 19.5 | 0.493 | 2051 | 47427 | 0.996 |
| eng-deu | newstest2009 | 18.8 | 0.499 | 2525 | 62816 | 0.993 |
| eng-deu | newstest2010 | 20.8 | 0.509 | 2489 | 61511 | 0.958 |
| eng-deu | newstest2011 | 19.2 | 0.493 | 3003 | 72981 | 0.980 |
| eng-deu | newstest2012 | 19.6 | 0.494 | 3003 | 72886 | 0.960 |
| eng-deu | newstest2013 | 22.8 | 0.518 | 3000 | 63737 | 0.974 |
| eng-deu | newstest2015-ende | 25.8 | 0.545 | 2169 | 44260 | 1.000 |
| eng-deu | newstest2016-ende | 30.3 | 0.581 | 2999 | 62670 | 0.989 |
| eng-deu | newstest2017-ende | 24.2 | 0.537 | 3004 | 61291 | 1.000 |
| eng-deu | newstest2018-ende | 35.5 | 0.616 | 2998 | 64276 | 1.000 |
| eng-deu | newstest2019-ende | 31.6 | 0.586 | 1997 | 48969 | 0.973 |
| eng-deu | Tatoeba-test | 37.8 | 0.591 | 10000 | 83347 | 0.991 |
| eng-nds | Tatoeba-test | 16.5 | 0.411 | 2500 | 18264 | 0.992 |
| eng-nld | Tatoeba-test | 50.3 | 0.677 | 10000 | 71436 | 0.979 |
| fry-deu | Tatoeba-test | 28.7 | 0.545 | 66 | 432 | 1.000 |
| fry-eng | Tatoeba-test | 31.9 | 0.496 | 205 | 1500 | 1.000 |
| fry-nld | Tatoeba-test | 43.0 | 0.634 | 233 | 1672 | 1.000 |
| gos-nld | Tatoeba-test | 15.9 | 0.409 | 1852 | 9903 | 0.959 |
| hrx-deu | Tatoeba-test | 24.7 | 0.487 | 471 | 2805 | 0.984 |
| ltz-deu | Tatoeba-test | 36.6 | 0.552 | 337 | 2144 | 1.000 |
| ltz-eng | Tatoeba-test | 31.4 | 0.477 | 283 | 1751 | 1.000 |
| ltz-nld | Tatoeba-test | 37.5 | 0.523 | 273 | 1567 | 1.000 |
| multi-multi | Tatoeba-test | 37.1 | 0.569 | 10000 | 73153 | 1.000 |
| nds-deu | Tatoeba-test | 34.5 | 0.572 | 10000 | 74571 | 1.000 |
| nds-eng | Tatoeba-test | 29.6 | 0.492 | 2500 | 17589 | 1.000 |
| nds-nld | Tatoeba-test | 42.2 | 0.621 | 1657 | 11490 | 0.994 |
| nld-afr | Tatoeba-test | 59.0 | 0.756 | 1056 | 6823 | 1.000 |
| nld-deu | Tatoeba-test | 50.6 | 0.688 | 10000 | 72438 | 1.000 |
| nld-eng | Tatoeba-test | 54.5 | 0.702 | 10000 | 69848 | 0.975 |
| nld-fry | Tatoeba-test | 23.3 | 0.462 | 233 | 1679 | 1.000 |
| nld-nds | Tatoeba-test | 21.7 | 0.462 | 1657 | 11711 | 0.998 |
| pdc-eng | Tatoeba-test | 24.3 | 0.402 | 53 | 399 | 1.000 |
| yid-nld | Tatoeba-test | 21.3 | 0.402 | 55 | 323 | 1.000 |

* test set translations: [opus-2021-02-23.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/gmw-gmw/opus-2021-02-23.test.txt)
* test set scores: [opus-2021-02-23.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/gmw-gmw/opus-2021-02-23.eval.txt)

## Model conversion info

* transformers version: 4.12.3
* OPUS-MT git hash: fc19512
* port time: Thu Jan 27 18:04:00 EET 2022
* port machine: LM0-400-22516.local
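As a complement to the pipeline example in the Usage section above, the following is a minimal sketch that loads the tokenizer and model directly with the Marian classes from transformers. The German example sentence and the choice of `>>nld<<` as target-language token are illustrative only.

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-tc-base-gmw-gmw"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# The target language is selected with a sentence-initial >>id<< token,
# here Dutch (>>nld<<) for a German source sentence.
src_text = [">>nld<< Das Wetter ist heute schön."]

batch = tokenizer(src_text, return_tensors="pt", padding=True)
translated = model.generate(**batch)
for t in translated:
    print(tokenizer.decode(t, skip_special_tokens=True))
```

The translation pipeline wraps the same tokenizer and model; calling them explicitly is mainly useful when you want control over batching or generation parameters.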