README.md · qiaoyi/Comment_Summarization4DesignTutor at e28abfb98c1edfbec1e66c353f45fd7f6704e225

metadata

language:
  - en
  - fr
  - ro
  - de
  - multilingual
license: apache-2.0
tags:
  - summarization
  - translation
datasets:
  - c4

Google's T5

PreTraining

The model was pre-trained on a on a multi-task mixture of unsupervised (1.) and supervised tasks (2.). Thereby, the following datasets were being used for (1.) and (2.):

Datasets used for Unsupervised denoising objective:

C4
Wiki-DPR

Datasets used for Supervised text-to-text language modeling objective

Sentence acceptability judgment
- CoLA Warstadt et al., 2018
Sentiment analysis
- SST-2 Socher et al., 2013
Paraphrasing/sentence similarity
- MRPC Dolan and Brockett, 2005
- STS-B Ceret al., 2017
- QQP Iyer et al., 2017
Natural language inference
- MNLI Williams et al., 2017
- QNLI Rajpurkar et al.,2016
- RTE Dagan et al., 2005
- CB De Marneff et al., 2019
Sentence completion
- COPA Roemmele et al., 2011
Word sense disambiguation
- WIC Pilehvar and Camacho-Collados, 2018
Question answering
- MultiRC Khashabi et al., 2018
- ReCoRD Zhang et al., 2018
- BoolQ Clark et al., 2019

All T5 checkpoints

Other Community Checkpoints: here

Paper

For more information, please take a look at the original paper.

Paper: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Authors: Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Abstract

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts every language problem into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new �Colossal Clean Crawled Corpus�, we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our dataset, pre-trained models, and code.