FLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models. It is an enhanced version of T5 that has been finetuned on a mixture of tasks.
One can directly use FLAN-T5 weights without finetuning the model:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the pretrained FLAN-T5 checkpoint and its tokenizer from the Hub
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

# Tokenize the prompt and generate a completion
inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
# ['Pour a cup of bolognese into a large bowl and add the pasta']
```
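Because the weights can be used without finetuning, several instructions can also be sent through the model in one batch. The following is a minimal sketch, not an official recipe: the prompt wordings, the helper name `generate_batch`, and the `max_new_tokens=32` default are illustrative assumptions rather than anything prescribed by the FLAN-T5 authors.

```python
from typing import List

# Illustrative instruction prompts (assumed wording, not the FLAN training templates).
PROMPTS: List[str] = [
    "Translate English to German: How old are you?",
    "Answer the following question. What is the capital of France?",
]


def generate_batch(prompts: List[str],
                   model_name: str = "google/flan-t5-small",
                   max_new_tokens: int = 32) -> List[str]:
    """Generate one completion per prompt with a FLAN-T5 checkpoint.

    Hypothetical helper for illustration; downloads the checkpoint on first use.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Pad to the longest prompt so the batch forms a rectangular tensor.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    # Cap the number of generated tokens per prompt (assumed value).
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `generate_batch(PROMPTS)` returns one decoded string per prompt, in the same order. The exact text depends on the checkpoint and generation settings, so no expected output is shown here.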
FLAN-T5 includes the same improvements as T5 version 1.1 (see here for the full details of the model’s improvements).
Google has released the following variants:
One can refer to T5’s documentation page for all tips, code examples, and notebooks, and to the FLAN-T5 model card for more details on the training and evaluation of the model.
The original checkpoints can be found here.