This model was released on 2022-10-20 and added to Hugging Face Transformers on 2023-06-20.
FLAN-T5
Overview
FLAN-T5 was released in the paper Scaling Instruction-Finetuned Language Models. It is an enhanced version of T5 that has been finetuned on a mixture of tasks.
One can directly use FLAN-T5 weights without finetuning the model:
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

>>> # Load the instruction-finetuned checkpoint and its matching tokenizer
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")
>>> tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")

>>> # Tokenize a plain-language instruction and generate a completion
>>> inputs = tokenizer("A step by step recipe to make bolognese pasta:", return_tensors="pt")
>>> outputs = model.generate(**inputs)
>>> print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Pour a cup of bolognese into a large bowl and add the pasta']
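The same workflow can also be expressed with the text2text-generation pipeline, which bundles tokenization, generation, and decoding into a single call. A minimal sketch; the prompt and the max_new_tokens value are illustrative:

>>> from transformers import pipeline

>>> # The pipeline returns a list of dicts like [{'generated_text': '...'}]
>>> generator = pipeline("text2text-generation", model="google/flan-t5-small")
>>> generator("Translate English to German: How old are you?", max_new_tokens=30)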
FLAN-T5 includes the same improvements as T5 version 1.1 (see here for the full details of the model’s improvements).
Google has released the following variants:

- google/flan-t5-small
- google/flan-t5-base
- google/flan-t5-large
- google/flan-t5-xl
- google/flan-t5-xxl
The original checkpoints can be found here.
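The larger variants may not fit in memory at full precision. One common option is to load the weights in half precision and let Accelerate place them across the available devices; a minimal sketch, assuming the accelerate package is installed (the choice of google/flan-t5-xl is illustrative):

>>> import torch
>>> from transformers import AutoModelForSeq2SeqLM

>>> # float16 weights roughly halve memory use; device_map="auto" spreads
>>> # layers across the available GPUs/CPU (requires accelerate)
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "google/flan-t5-xl", torch_dtype=torch.float16, device_map="auto"
... )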
Refer to T5’s documentation page for the full API reference, code examples, and notebooks. For more details regarding the training and evaluation of FLAN-T5, refer to the model card.