language:
  - de
  - en
  - es
  - fr
  - it
  - ja
  - ru
  - uk
  - multilingual
license: cc-by-sa-4.0
tags:
  - translation

TakoMT

This is a translation model using Marian-NMT. For more details, please see my repository.

In addition to the data listed in the repository, I also used ParaCrawl.

  • source languages: de, en, es, fr, it, ru, uk
  • target language: ja

How to use

This model uses transformers and sentencepiece.

!pip install transformers sentencepiece

You can use this model directly with a pipeline:

from transformers import pipeline
tako_translator = pipeline('translation', model='staka/takomt')
tako_translator('This is a cat.')
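
The pipeline call returns a list of dictionaries, one per input, with the translated text under the `translation_text` key. As a minimal sketch, the helper below (`translate_all` is a hypothetical name, not part of this model or of transformers) translates a batch of sentences and returns plain strings:

```python
from typing import Callable, List


def translate_all(translator: Callable, sentences: List[str]) -> List[str]:
    """Translate a batch of sentences and return only the translated strings.

    `translator` is expected to behave like a transformers translation
    pipeline: given a list of strings, it returns a list of dicts that
    each contain a 'translation_text' entry.
    """
    outputs = translator(sentences)
    return [output["translation_text"] for output in outputs]


# Usage with the pipeline shown above (downloads the model on first run):
# from transformers import pipeline
# tako_translator = pipeline('translation', model='staka/takomt')
# translate_all(tako_translator, ['This is a cat.', 'Das ist eine Katze.'])
```

Mixing source languages in one batch, as in the commented usage, is an assumption based on the source language list above.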

Eval results

Evaluation results on Tatoeba (500 randomly selected sentences) are as follows:

source  target  BLEU (*1)
de      ja      27.8
en      ja      28.4
es      ja      32.0
fr      ja      27.9
it      ja      24.3
ru      ja      27.3
uk      ja      29.8

(*1) sacrebleu --tokenize ja-mecab
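
A comparable evaluation could be set up roughly as follows. This is a sketch under stated assumptions, not the script used to produce the table above: the function names, the fixed seed, and the sampling procedure are all hypothetical. The scoring step uses the sacrebleu Python API with the same `ja-mecab` tokenizer, which needs the optional Japanese dependencies (`pip install "sacrebleu[ja]"`).

```python
import random
from typing import List, Sequence, Tuple


def sample_pairs(
    pairs: Sequence[Tuple[str, str]], k: int = 500, seed: int = 0
) -> List[Tuple[str, str]]:
    """Pick k (source, reference) pairs at random without replacement.

    The seed is an assumption added for reproducibility. If fewer than
    k pairs are available, all of them are returned unchanged.
    """
    if len(pairs) <= k:
        return list(pairs)
    rng = random.Random(seed)
    return rng.sample(list(pairs), k)


def bleu_ja(hypotheses: Sequence[str], references: Sequence[str]) -> float:
    """Corpus-level BLEU with MeCab tokenization for Japanese output."""
    # Imported here so sampling works without sacrebleu installed.
    import sacrebleu

    result = sacrebleu.corpus_bleu(
        list(hypotheses), [list(references)], tokenize="ja-mecab"
    )
    return result.score
```

In use, the sampled source sentences would be translated with the pipeline, and the outputs passed to `bleu_ja` together with the sampled Japanese references.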