JoeyNMT: iwslt14 de-en-fr multilingual

This is a JoeyNMT model for multilingual machine translation with language tags, built for demonstration purposes. It was trained on the IWSLT14 de-en and en-fr parallel data using DistributedDataParallel (DDP).

Install JoeyNMT v2.3:

$ pip install git+https://github.com/joeynmt/joeynmt.git

Translation

Torch hub interface:

import torch

iwslt14 = torch.hub.load("joeynmt/joeynmt", "iwslt14_prompt")
translation = iwslt14.translate(
    src=["Hello world!"],  # src sentence
    src_prompt=["<en>"],   # src language code
    trg_prompt=["<de>"],   # trg language code
    beam_size=1,
)
print(translation)  # ["Hallo Welt!"]

(See the Jupyter notebook for details)
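
To translate several sentences in one call, the example above suggests passing one language tag per source sentence. The following is a minimal sketch under that assumption: `translate_batch` is a hypothetical helper name, not part of JoeyNMT, and it assumes only that the loaded model exposes the `translate` keyword arguments shown above.

```python
def translate_batch(model, sentences, src_lang, trg_lang, beam_size=1):
    """Translate sentences that all share one language direction.

    Hypothetical helper: `model` is any object exposing the
    translate(src=..., src_prompt=..., trg_prompt=..., beam_size=...)
    call shown in the Torch hub example.
    """
    n = len(sentences)
    return model.translate(
        src=list(sentences),
        src_prompt=[f"<{src_lang}>"] * n,  # one tag per sentence (assumed)
        trg_prompt=[f"<{trg_lang}>"] * n,  # one tag per sentence (assumed)
        beam_size=beam_size,
    )
```

With the model loaded as above, a call might look like `translate_batch(iwslt14, ["Hello world!", "Thank you!"], "en", "de")`.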

Training

$ python -m joeynmt train iwslt14_prompt/config.yaml --use-ddp --skip-test

(See train.log for details)

Evaluation

$ git clone https://huggingface.co/may-ohta/iwslt14_prompt
$ python -m joeynmt test iwslt14_prompt/config.yaml --output-path iwslt14_prompt/hyp

direction  BLEU
en->de     28.88
de->en     35.28
en->fr     38.86
fr->en     40.35

  • beam_size: 5
  • beam_alpha: 1.0
  • sacrebleu signature: nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0

(See test.log for details)
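
The sacrebleu signature records the scoring settings behind the BLEU numbers. For instance, `case:lc` means the scores were computed on lowercased text and `tok:13a` names the tokenizer. As a small illustration, the signature string can be split into its fields; `parse_signature` is a hypothetical helper, not a sacrebleu function.

```python
def parse_signature(signature):
    """Split a sacrebleu signature string into a {key: value} dict."""
    return dict(field.split(":", 1) for field in signature.split("|"))


sig = parse_signature("nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.4.0")
print(sig["case"], sig["tok"])  # the settings to match when comparing scores
```

Scores from other systems are directly comparable only when they were computed with the same settings.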

Data Format

We downloaded the IWSLT14 de-en and en-fr data from https://wit3.fbk.eu/2014-01 and created tab-separated {train|dev|test}.tsv files with four columns, in the following format:

src_prompt src trg_prompt trg
<en> Hello. <de> Hallo.
<de> Vielen Dank! <en> Thank you!

(See test.ref.de-en.tsv)
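
The four-column layout can be produced and read back with a few lines of Python. This is a sketch, not the script used to build the released files. It assumes that the first line of each file is the header shown above and that each sentence pair is written once per direction. The function names are hypothetical.

```python
import io

FIELDS = ("src_prompt", "src", "trg_prompt", "trg")


def make_rows(lang_a, lang_b, pairs):
    """Yield one row per direction for each (lang_a, lang_b) sentence pair."""
    for sent_a, sent_b in pairs:
        yield (f"<{lang_a}>", sent_a, f"<{lang_b}>", sent_b)
        yield (f"<{lang_b}>", sent_b, f"<{lang_a}>", sent_a)


def write_tsv(rows, handle):
    """Write a header line (assumed) followed by tab-separated rows."""
    handle.write("\t".join(FIELDS) + "\n")
    for row in rows:
        handle.write("\t".join(row) + "\n")


def read_tsv(handle):
    """Read rows back as dicts keyed by the header names."""
    header = handle.readline().rstrip("\n").split("\t")
    return [
        dict(zip(header, line.rstrip("\n").split("\t")))
        for line in handle
        if line.strip()
    ]


buf = io.StringIO()  # stands in for a real .tsv file
write_tsv(make_rows("en", "de", [("Hello.", "Hallo.")]), buf)
buf.seek(0)
rows = read_tsv(buf)
```

Sentences that themselves contain tab characters would break this simple split and would need cleaning first.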
