metadata
language:
  - en
  - fr
  - ro
  - de
  - multilingual
license: apache-2.0
tags:
  - summarization
  - translation
datasets:
  - c4

Google's T5

Pre-training

The model was pre-trained on a multi-task mixture of unsupervised (1.) and supervised (2.) tasks. The following datasets were used for (1.) and (2.):

  1. Datasets used for the unsupervised denoising objective:
  2. Datasets used for the supervised text-to-text language modeling objective:
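
To make the two objectives concrete, here is a minimal, dependency-free sketch of how training pairs for each might be built. It is an illustration under stated assumptions, not the original preprocessing code: the helper names `span_corrupt` and `to_text_to_text` are hypothetical, the `<extra_id_N>` sentinel naming follows the convention of the Hugging Face T5 tokenizer, the masked spans are chosen by hand rather than sampled, and the task prefixes are the ones described in the T5 paper.

```python
from typing import List, Tuple


def span_corrupt(tokens: List[str], spans: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Build an (input, target) pair for the denoising objective.

    Each (start, end) token span is replaced by a sentinel in the input;
    the target lists every sentinel followed by the tokens it replaced.
    Spans are assumed to be non-overlapping (hypothetical helper).
    """
    inputs: List[str] = []
    targets: List[str] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"  # assumed sentinel naming
        inputs.extend(tokens[cursor:start])  # keep the unmasked text
        inputs.append(sentinel)              # mark the dropped span
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # the model must recover this
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # final sentinel closes the target
    return " ".join(inputs), " ".join(targets)


def to_text_to_text(task_prefix: str, text: str) -> str:
    """Cast a supervised task as text-to-text by prepending a task prefix."""
    return f"{task_prefix}: {text}"


# Unsupervised denoising: mask "for inviting" and "last".
tokens = "Thank you for inviting me to your party last week .".split()
corrupted, target = span_corrupt(tokens, [(2, 4), (8, 9)])
print(corrupted)  # Thank you <extra_id_0> me to your party <extra_id_1> week .
print(target)     # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>

# Supervised text-to-text: the task is selected purely by the input prefix.
print(to_text_to_text("translate English to German", "That is good."))
print(to_text_to_text("summarize", "Some long article text."))
```

In both cases the model sees a plain string as input and is trained to emit a plain string as output, which is what lets a single checkpoint serve translation and summarization alike.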

All T5 checkpoints

Other Community Checkpoints: here

Paper

For more information, please take a look at the original paper.

Paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Authors: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Abstract

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new “Colossal Clean Crawled Corpus”, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.
