mT5

Overview

The mT5 model was presented in "mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer" (https://arxiv.org/abs/2010.11934) by Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel.

The abstract from the paper is the following:

The recent “Text-to-Text Transfer Transformer” (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We describe the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. All of the code and model checkpoints used in this work are publicly available.

The original code can be found at https://github.com/google-research/multilingual-t5.
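Note that mT5 was pre-trained only with an unsupervised span-corruption objective, so the checkpoints generally need fine-tuning before they are useful on a downstream task. The sketch below only illustrates the loading and generation API, assuming transformers with PyTorch and sentencepiece installed and the released google/mt5-small checkpoint:

```python
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

# "google/mt5-small" is the smallest of the released checkpoints.
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

# Without fine-tuning, output is only meaningful for span-corruption-style
# inputs using sentinel tokens such as <extra_id_0>.
inputs = tokenizer("The capital of France is <extra_id_0>.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```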

MT5Config
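MT5Config holds the model hyperparameters (vocabulary size, hidden size, number of layers, and so on) and is used to instantiate an mT5 model with a given architecture. A minimal sketch; the reduced sizes below are arbitrary values chosen for illustration:

```python
from transformers import MT5Config, MT5Model

# Default configuration.
config = MT5Config()

# An arbitrary, smaller configuration for quick experiments.
tiny_config = MT5Config(d_model=256, d_ff=512, num_layers=4, num_heads=4)
model = MT5Model(tiny_config)  # randomly initialized, not pre-trained
```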

MT5Tokenizer

See T5Tokenizer for all details.
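As a quick illustration (assuming the google/mt5-small checkpoint and sentencepiece installed), a single shared SentencePiece vocabulary covers text from all 101 pre-training languages:

```python
from transformers import MT5Tokenizer

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

# One shared vocabulary handles inputs in any pre-training language.
batch = tokenizer(["Hello world", "Bonjour le monde"], padding=True, return_tensors="pt")
print(batch["input_ids"].shape)
```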

MT5TokenizerFast

See T5TokenizerFast for all details.
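The fast (Rust-backed) tokenizer additionally exposes features such as character-level offsets; a brief sketch, assuming the same checkpoint:

```python
from transformers import MT5TokenizerFast

tokenizer = MT5TokenizerFast.from_pretrained("google/mt5-small")

# Fast tokenizers can return character offsets for each token.
encoding = tokenizer("Hello world", return_offsets_mapping=True)
print(encoding["offset_mapping"])
```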

MT5Model
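MT5Model is the bare encoder-decoder transformer that outputs raw hidden states, without a language-modeling head. A minimal forward-pass sketch, assuming PyTorch and the google/mt5-small checkpoint:

```python
from transformers import MT5Model, MT5Tokenizer

model = MT5Model.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you.", return_tensors="pt")
decoder_inputs = tokenizer("Studies show that", return_tensors="pt")

outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_inputs.input_ids)
last_hidden_state = outputs.last_hidden_state  # shape: (batch, target_len, d_model)
```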

MT5ForConditionalGeneration
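MT5ForConditionalGeneration adds the language-modeling head used for text-to-text tasks; passing labels returns a loss suitable for fine-tuning. A sketch of one training step on an illustrative, made-up summarization pair:

```python
from transformers import MT5ForConditionalGeneration, MT5Tokenizer

model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

# Illustrative input/target pair; real fine-tuning would iterate a dataset.
inputs = tokenizer("UN official says further talks are needed in Syria.", return_tensors="pt")
labels = tokenizer(text_target="Further talks needed in Syria.", return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss
loss.backward()  # combine with an optimizer step in a real training loop
```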

MT5EncoderModel
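MT5EncoderModel exposes only the encoder half, which is useful for extracting multilingual token or sentence representations. A minimal sketch:

```python
from transformers import MT5EncoderModel, MT5Tokenizer

model = MT5EncoderModel.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

inputs = tokenizer("Hello world", return_tensors="pt")
hidden_states = model(**inputs).last_hidden_state  # encoder representations only
```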

TFMT5Model
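TFMT5Model is the TensorFlow counterpart of MT5Model; the API mirrors the PyTorch version apart from the tensor type. A minimal sketch, assuming TensorFlow is installed:

```python
from transformers import TFMT5Model, MT5Tokenizer

model = TFMT5Model.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you.", return_tensors="tf")
decoder_inputs = tokenizer("Studies show that", return_tensors="tf")

outputs = model(input_ids=inputs.input_ids, decoder_input_ids=decoder_inputs.input_ids)
hidden_states = outputs.last_hidden_state
```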

TFMT5ForConditionalGeneration
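The TensorFlow generation API matches the PyTorch one; a brief sketch under the same caveat that un-fine-tuned output is only meaningful for span-corruption-style inputs:

```python
from transformers import TFMT5ForConditionalGeneration, MT5Tokenizer

model = TFMT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

inputs = tokenizer("The capital of France is <extra_id_0>.", return_tensors="tf")
outputs = model.generate(inputs.input_ids, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```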

TFMT5EncoderModel
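Likewise, the TensorFlow encoder-only variant; a minimal sketch:

```python
from transformers import TFMT5EncoderModel, MT5Tokenizer

model = TFMT5EncoderModel.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

inputs = tokenizer("Hello world", return_tensors="tf")
hidden_states = model(inputs.input_ids).last_hidden_state
```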