# Simultaneous Machine Translation

This directory contains the code for the paper [Monotonic Multihead Attention](https://openreview.net/forum?id=Hyg96gBKPS).

## Prepare Data

Follow [these instructions](https://github.com/pytorch/fairseq/tree/simulastsharedtask/examples/translation#prepare-wmt14en2desh) to download and preprocess the WMT'15 En-De dataset.

Another example, training an English-to-Japanese model, can be found [here](docs/enja.md).

## Training
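
The commands in this section pass a `--user-dir` path under `$FAIRSEQ`, so that variable needs to point at the root of a fairseq checkout before training starts. A minimal setup sketch follows; the default location `$HOME/fairseq` and the `USER_DIR` variable are assumptions for illustration, not something this directory defines.

```shell
# Assumption: fairseq was cloned to $HOME/fairseq; adjust to your own checkout.
export FAIRSEQ="${FAIRSEQ:-$HOME/fairseq}"

# Directory expected to hold the simultaneous translation code loaded via --user-dir.
USER_DIR="$FAIRSEQ/examples/simultaneous_translation"

# Report a missing path early instead of letting fairseq-train fail on it.
if [ -d "$USER_DIR" ]; then
    echo "user dir found: $USER_DIR"
else
    echo "user dir missing: $USER_DIR" >&2
fi
```

If `FAIRSEQ` is already exported in your environment, the first line keeps that value unchanged.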

- MMA-IL (infinite lookback)

```shell
# Assumes $FAIRSEQ points to the root of your fairseq checkout.
fairseq-train \
    data-bin/wmt15_en_de_32k \
    --simul-type infinite_lookback \
    --user-dir $FAIRSEQ/examples/simultaneous_translation \
    --mass-preservation \
    --criterion latency_augmented_label_smoothed_cross_entropy \
    --latency-weight-avg 0.1 \
    --max-update 50000 \
    --arch transformer_monotonic_iwslt_de_en \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr-scheduler 'inverse_sqrt' \
    --warmup-init-lr 1e-7 --warmup-updates 4000 \
    --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
    --dropout 0.3 \
    --label-smoothing 0.1 \
    --max-tokens 3584
```

- MMA-H (hard aligned)

```shell
# Assumes $FAIRSEQ points to the root of your fairseq checkout.
fairseq-train \
    data-bin/wmt15_en_de_32k \
    --simul-type hard_aligned \
    --user-dir $FAIRSEQ/examples/simultaneous_translation \
    --mass-preservation \
    --criterion latency_augmented_label_smoothed_cross_entropy \
    --latency-weight-var 0.1 \
    --max-update 50000 \
    --arch transformer_monotonic_iwslt_de_en \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr-scheduler 'inverse_sqrt' \
    --warmup-init-lr 1e-7 --warmup-updates 4000 \
    --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
    --dropout 0.3 \
    --label-smoothing 0.1 \
    --max-tokens 3584
```

- wait-k

```shell
# Assumes $FAIRSEQ points to the root of your fairseq checkout.
fairseq-train \
    data-bin/wmt15_en_de_32k \
    --simul-type wait-k \
    --waitk-lagging 3 \
    --user-dir $FAIRSEQ/examples/simultaneous_translation \
    --mass-preservation \
    --criterion latency_augmented_label_smoothed_cross_entropy \
    --max-update 50000 \
    --arch transformer_monotonic_iwslt_de_en \
    --optimizer adam --adam-betas '(0.9, 0.98)' \
    --lr-scheduler 'inverse_sqrt' \
    --warmup-init-lr 1e-7 --warmup-updates 4000 \
    --lr 5e-4 --stop-min-lr 1e-9 --clip-norm 0.0 --weight-decay 0.0001 \
    --dropout 0.3 \
    --label-smoothing 0.1 \
    --max-tokens 3584
```
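
The three commands above share their model, optimizer, and regularization flags and differ only in `--simul-type` and the latency options that accompany it. As a reading aid, the sketch below collects those differences in one place. `simul_flags` and the variant names `mma-il`, `mma-h`, and `wait-k` are hypothetical helpers for illustration and are not part of fairseq.

```shell
# Hypothetical helper, not part of fairseq: prints the flags that differ
# between the three training variants documented above.
simul_flags() {
    case "$1" in
        mma-il) echo "--simul-type infinite_lookback --latency-weight-avg 0.1" ;;
        mma-h)  echo "--simul-type hard_aligned --latency-weight-var 0.1" ;;
        wait-k) echo "--simul-type wait-k --waitk-lagging 3" ;;
        *)      echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Print the variant-specific flags for MMA-H.
simul_flags mma-h
```

The printed flags can be spliced into a training command with an unquoted command substitution, `$(simul_flags mma-h)`, while the shared flags stay exactly as listed above.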