---
language:
- de
- en
thumbnail:
tags:
- translation
- wmt19
- allenai
license: apache-2.0
datasets:
- wmt19
metrics:
- bleu
---
# FSMT

## Model description

This is a ported version of the fairseq-based [wmt19 transformer](https://github.com/jungokasai/deep-shallow/) for de-en.

For more details, please see [Deep Encoder, Shallow Decoder: Reevaluating the Speed-Quality Tradeoff in Machine Translation](https://arxiv.org/abs/2006.10369).

Two models are available:

* [wmt19-de-en-6-6-big](https://huggingface.co/allenai/wmt19-de-en-6-6-big)
* [wmt19-de-en-6-6-base](https://huggingface.co/allenai/wmt19-de-en-6-6-base)
## Intended uses & limitations

#### How to use
```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "allenai/wmt19-de-en-6-6-big"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

# German source sentence to translate into English
input_text = "Maschinelles Lernen ist großartig, nicht wahr?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(input_ids)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded)  # Machine learning is great, isn't it?
```
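Generation also works on batches. A minimal sketch, assuming padded batched inputs (the second sentence here is an illustrative example, not from the original card); `num_beams=5` mirrors the evaluation setup further below:

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "allenai/wmt19-de-en-6-6-big"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

# Illustrative batch of German sentences (assumed inputs, not from the card)
sentences = [
    "Maschinelles Lernen ist großartig, nicht wahr?",
    "Guten Morgen!",
]
# Pad to the longest sentence so the batch forms a rectangular tensor
batch = tokenizer(sentences, return_tensors="pt", padding=True)
outputs = model.generate(**batch, num_beams=5)
for translation in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(translation)
```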
#### Limitations and bias

## Training data

Pretrained weights were left identical to the original model released by allenai. For more details, please see the [paper](https://arxiv.org/abs/2006.10369).
## Eval results

Here are the BLEU scores:

model | BLEU (transformers)
------|--------------------
wmt19-de-en-6-6-big | 39.9
The score was calculated using this code:

```bash
git clone https://github.com/huggingface/transformers
cd transformers
export PAIR=de-en
export DATA_DIR=data/$PAIR
export SAVE_DIR=data/$PAIR
export BS=8
export NUM_BEAMS=5
mkdir -p $DATA_DIR
sacrebleu -t wmt19 -l $PAIR --echo src > $DATA_DIR/val.source
sacrebleu -t wmt19 -l $PAIR --echo ref > $DATA_DIR/val.target
echo $PAIR
PYTHONPATH="src:examples/seq2seq" python examples/seq2seq/run_eval.py allenai/wmt19-de-en-6-6-big \
    $DATA_DIR/val.source $SAVE_DIR/test_translations.txt \
    --reference_path $DATA_DIR/val.target \
    --score_path $SAVE_DIR/test_bleu.json \
    --bs $BS --task translation --num_beams $NUM_BEAMS
```
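If you already have the translations file from the script above and only want to re-score it, sacrebleu's Python API computes the same corpus BLEU. A minimal sketch, assuming the file paths produced by the script:

```python
import sacrebleu

# Paths written by the evaluation script above
with open("data/de-en/test_translations.txt") as f:
    hypotheses = [line.strip() for line in f]
with open("data/de-en/val.target") as f:
    references = [line.strip() for line in f]

# corpus_bleu takes the hypothesis list and a list of reference streams
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(bleu.score)
```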
## Data Sources

- [training, etc.](http://www.statmt.org/wmt19/)
- [test set](http://matrix.statmt.org/test_sets/newstest2019.tgz?1556572561)
### BibTeX entry and citation info

```bibtex
@misc{kasai2020deep,
    title={Deep Encoder, Shallow Decoder: Reevaluating the Speed-Quality Tradeoff in Machine Translation},
    author={Jungo Kasai and Nikolaos Pappas and Hao Peng and James Cross and Noah A. Smith},
    year={2020},
    eprint={2006.10369},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```