File size: 1,266 Bytes

---
license: cc-by-sa-4.0

language: 
- en
- ja

tags:
- translation

---

# FuguMT

This is a translation model using Marian-NMT.
For more details, please see [my repository](https://github.com/s-taka/fugumt).

* source language: en
* target language: ja 

### How to use

This model uses transformers and sentencepiece.
```python
!pip install transformers sentencepiece
```

You can use this model directly with a pipeline:
```python
from transformers import pipeline
fugu_translator = pipeline('translation', model='staka/fugumt-en-ja')
fugu_translator('This is a cat.')
```

If you want to translate multiple sentences, we recommend using [pySBD](https://github.com/nipunsadvilkar/pySBD).
```python
!pip install transformers sentencepiece pysbd

import pysbd
seg_en = pysbd.Segmenter(language="en", clean=False)

from transformers import pipeline
fugu_translator = pipeline('translation', model='staka/fugumt-en-ja')
txt = 'This is a cat. It is very cute.'
print(fugu_translator(seg_en.segment(txt)))
```


### Eval results

The results of the evaluation using [tatoeba](https://tatoeba.org/ja)(randomly selected 500 sentences) are as follows:

|source |target |BLEU(*1)| 
|-------|-------|--------|
|en     |ja     |32.7    |

(*1) sacrebleu --tokenize ja-mecab