arnastofnun/wmt24-en-is-transformer-base-deep

Model description

This model translates text from English to Icelandic. It follows the Transformer architecture described in "Attention is All You Need" and was trained with fairseq for WMT24.

This is the deep base version (Base_deep) of our model. See also: wmt24-en-is-transformer-base, wmt24-en-is-transformer-big, wmt24-en-is-transformer-big-deep.

model	d_model	d_ff	h	N_enc	N_dec
Base	512	2048	8	6	6
Base_deep	512	2048	8	36	12
Big	1024	4096	16	6	6
Big_deep	1024	4096	16	36	12
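
To get a feel for how the four configurations compare in size, the table can be turned into a rough parameter estimate. The sketch below counts only the attention and feed-forward weight matrices in each layer and ignores embeddings, biases and layer normalization, so the numbers are approximations and not the actual checkpoint sizes. The names `CONFIGS` and `approx_layer_params` are introduced here for illustration only.

```python
# Rough size comparison of the layer stacks.
# Embeddings, biases and layer norms are deliberately ignored.
CONFIGS = {
    # name: (d_model, d_ff, h, N_enc, N_dec), copied from the table above
    "Base": (512, 2048, 8, 6, 6),
    "Base_deep": (512, 2048, 8, 36, 12),
    "Big": (1024, 4096, 16, 6, 6),
    "Big_deep": (1024, 4096, 16, 36, 12),
}


def approx_layer_params(name):
    """Approximate number of weight-matrix parameters in all layers."""
    # The head count h splits d_model across heads and does not change the total.
    d_model, d_ff, _h, n_enc, n_dec = CONFIGS[name]
    attention = 4 * d_model * d_model    # query, key, value and output projections
    feed_forward = 2 * d_model * d_ff    # the two linear maps of the FFN
    encoder_layer = attention + feed_forward        # self-attention + FFN
    decoder_layer = 2 * attention + feed_forward    # self- and cross-attention + FFN
    return n_enc * encoder_layer + n_dec * decoder_layer


for name in CONFIGS:
    millions = approx_layer_params(name) / 1e6
    print(f"{name}: ~{millions:.0f}M parameters in the layer stacks")
```

Under these assumptions, most of the extra capacity of the deep variants sits in the encoder, since N_enc grows from 6 to 36 while N_dec only doubles.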

How to use

from fairseq.models.transformer import TransformerModel
# Name of the checkpoint file inside the model directory
TRANSLATION_MODEL_NAME = 'checkpoint_best.pt'
# Load the checkpoint together with its SentencePiece model
TRANSLATION_MODEL = TransformerModel.from_pretrained(
    'path/to/model',
    checkpoint_file=TRANSLATION_MODEL_NAME,
    bpe='sentencepiece', sentencepiece_model='sentencepiece.bpe.model',
)
src_sentences = ['This is a test sentence.', 'This is another test sentence.']
translated_sentences = TRANSLATION_MODEL.translate(src_sentences)
print(translated_sentences)
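
When translating many sentences, it can be convenient to feed them to the model in fixed-size chunks. The helper below is a minimal sketch and not part of fairseq: `translate_in_batches` and `batch_size` are hypothetical names, and `translate_fn` is assumed to be any callable that maps a list of sentences to a list of translations, such as `TRANSLATION_MODEL.translate` from the snippet above.

```python
from typing import Callable, List


def translate_in_batches(
    translate_fn: Callable[[List[str]], List[str]],
    sentences: List[str],
    batch_size: int = 32,
) -> List[str]:
    """Translate sentences chunk by chunk, preserving the input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    translations: List[str] = []
    for start in range(0, len(sentences), batch_size):
        chunk = sentences[start:start + batch_size]
        translations.extend(translate_fn(chunk))
    return translations


# Stand-in translator so the sketch runs without a checkpoint.
def fake_translate(batch):
    return [sentence.upper() for sentence in batch]


print(translate_in_batches(fake_translate, ["a b", "c d", "e f"], batch_size=2))
```

With a loaded model, the call would look like `translate_in_batches(TRANSLATION_MODEL.translate, src_sentences)`.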

Eval results

We evaluated our models on the WMT21 test set. These are the chrF scores for our published models:

model	chrF
Base	56.8
Base_deep	57.1
Big	57.7
Big_deep	57.7
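
For readers unfamiliar with the metric, chrF is an F-score over character n-grams that weights recall more heavily than precision. The function below is a simplified, sentence-level sketch of that idea, assuming n-gram orders 1 to 6 and beta = 2. It is not the tool used to produce the scores above (the evaluation setup is not specified here), and established implementations such as sacreBLEU differ in details, so its output should not be compared with the table. The names `char_ngrams` and `simple_chrf` are introduced for illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n, ignoring spaces (a simplification)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF on a 0-100 scale for one hypothesis/reference pair."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is too short for this order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: with beta = 2, recall counts more than precision
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


print(simple_chrf("Þetta er prufusetning.", "Þetta er önnur prufusetning."))
```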

BibTeX entry and citation info

@inproceedings{jasonarson2024cogsinamachine,
    year={2024},
    title={Cogs in a Machine, Doing What They’re Meant to Do -- The AMI Submission to the WMT24 General Translation Task},
    author={Atli Jasonarson and Hinrik Hafsteinsson and Bjarki Ármannsson and Steinþór Steingrímsson},
    organization={The Árni Magnússon Institute for Icelandic Studies}
}