  - en
  - ru
  - multilingual
  - text2text-generation
  - example_title: Question Answering
  - text: >-
      Может ли Джеффри Хинтон поговорить с Джорджем Вашингтоном? Прежде чем
      ответить, дайте обоснование?
  - text: 'Translate to German:  My name is Arthur'
    example_title: Translation
  - text: >-
      Please answer to the following question. Who is going to be the next
      Ballon d'or?
    example_title: Question Answering
  - text: >-
      Q: Can Geoffrey Hinton have a conversation with George Washington? Give
      the rationale before answering.
    example_title: Logical reasoning
  - text: >-
      Please answer the following question. What is the boiling point of
    example_title: Scientific knowledge
  - text: >-
      Answer the following yes/no question. Can you write a whole Haiku in a
      single tweet?
    example_title: Yes/no question
  - text: >-
      Answer the following yes/no question by reasoning step-by-step. Can you
      write a whole Haiku in a single tweet?
    example_title: Reasoning task
  - text: 'Q: ( False or not False or False ) is? A: Let''s think step by step'
    example_title: Boolean Expressions
  - text: >-
      The square root of x is the cube root of y. What is y to the power of 2,
      if x = 4?
    example_title: Math reasoning
  - text: >-
      Premise:  At my age you will probably have learnt one lesson. Hypothesis: 
      It's not certain how many lessons you'll learn by your thirties. Does the
      premise entail the hypothesis?
    example_title: Premise and hypothesis
license: apache-2.0

Model Card for FLAN-T5 base


Table of Contents

  1. TL;DR
  2. Model Details
  3. Usage
  4. Uses
  5. Bias, Risks, and Limitations
  6. Training Details
  7. Evaluation
  8. Environmental Impact
  9. Citation
  10. Model Card Authors




Usage

The example scripts below show how to use the model with the transformers library:

Using the PyTorch model

Running the model on a CPU

from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("AlexWortega/Flan_base_translated")
model = T5ForConditionalGeneration.from_pretrained("AlexWortega/Flan_base_translated")

input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

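outputs = model.generate(input_ids)
# Decode the generated token IDs back into text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))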
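
The script above handles a single prompt. As a minimal sketch of how several prompts might be processed in one call, the helper below pads the prompts into a batch, generates, and decodes the results. The function name `generate_answers` and the default `max_new_tokens=64` are illustrative choices that do not come from the original card; `tokenizer` and `model` are assumed to be the objects loaded above.

```python
from typing import List


def generate_answers(prompts: List[str], tokenizer, model, max_new_tokens: int = 64) -> List[str]:
    """Run a batch of prompts through the model and return decoded strings.

    `tokenizer` and `model` are assumed to be the T5Tokenizer and
    T5ForConditionalGeneration objects loaded above. The helper name and
    the default for `max_new_tokens` are illustrative assumptions.
    """
    # Pad to the longest prompt so the batch forms a rectangular tensor.
    inputs = tokenizer(prompts, return_tensors="pt", padding=True)
    # Pass the attention mask with the input IDs so padding is ignored.
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    # Drop special tokens such as <pad> and </s> from the decoded text.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

It could be called as `generate_answers(["translate English to German: How old are you?"], tokenizer, model)`. When the model sits on a GPU, the input tensors would also need to be moved to that device before calling `generate`.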

Running the model on a GPU

# pip install accelerate
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("AlexWortega/Flan_base_translated")
model = T5ForConditionalGeneration.from_pretrained("AlexWortega/Flan_base_translated", device_map="auto")

input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

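outputs = model.generate(input_ids)
# Decode the generated token IDs back into text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))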

Running the model on a GPU using different precisions


FP16

# pip install accelerate
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("AlexWortega/Flan_base_translated")
model = T5ForConditionalGeneration.from_pretrained("AlexWortega/Flan_base_translated", device_map="auto", torch_dtype=torch.float16)

input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

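outputs = model.generate(input_ids)
# Decode the generated token IDs back into text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))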


INT8

# pip install bitsandbytes accelerate
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("AlexWortega/Flan_base_translated")
model = T5ForConditionalGeneration.from_pretrained("AlexWortega/Flan_base_translated", device_map="auto", load_in_8bit=True)

input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

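outputs = model.generate(input_ids)
# Decode the generated token IDs back into text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))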
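
The scripts above differ only in the keyword arguments passed to `from_pretrained`. As a rough sketch of that difference, the helper below collects those arguments in one place. The function name `loading_kwargs` and the precision labels `"fp32"`, `"fp16"`, and `"int8"` are hypothetical; the returned keys mirror the arguments used in the scripts above. Restricting reduced precision to GPU is a choice made for this sketch, since the card only shows those setups on a GPU.

```python
def loading_kwargs(use_gpu: bool = False, precision: str = "fp32") -> dict:
    """Build keyword arguments for `from_pretrained` for one of the setups above.

    The function name and the precision labels are hypothetical; the keys
    in the returned dict are the ones used in the scripts above.
    """
    if precision not in ("fp32", "fp16", "int8"):
        raise ValueError(f"unknown precision: {precision!r}")
    if precision != "fp32" and not use_gpu:
        # Assumption of this sketch: reduced precision is only set up on GPU.
        raise ValueError("fp16 and int8 loading are only sketched for GPU")

    kwargs = {}
    if use_gpu:
        # Needs `pip install accelerate`.
        kwargs["device_map"] = "auto"
    if precision == "fp16":
        import torch  # imported only when the dtype object is needed

        kwargs["torch_dtype"] = torch.float16
    elif precision == "int8":
        # Needs `pip install bitsandbytes accelerate`.
        kwargs["load_in_8bit"] = True
    return kwargs
```

The result can be unpacked into the loading call, for example `T5ForConditionalGeneration.from_pretrained("AlexWortega/Flan_base_translated", **loading_kwargs(use_gpu=True, precision="int8"))`.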


Direct Use and Downstream Use

The authors write in the original paper's model card that:

"The primary use is research on language models, including: research on zero-shot NLP tasks and in-context few-shot learning NLP tasks, such as reasoning, and question answering; advancing fairness and safety research, and understanding limitations of current large language models"

See the research paper for further details.
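
To make the distinction between zero-shot and in-context few-shot use concrete, here is a minimal sketch of a prompt builder. With no examples it produces a zero-shot prompt; each solved example passed in adds one demonstration before the new question. The function name `build_prompt` and the "Q: ... A: ..." layout are assumptions made for illustration and are not prescribed by the model card or the paper.

```python
from typing import Sequence, Tuple


def build_prompt(question: str, examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Format a zero-shot or few-shot prompt as plain text.

    With no examples the prompt is zero-shot; each (question, answer) pair
    in `examples` adds one solved demonstration before the new question.
    The "Q: ... A: ..." layout is an assumption made for this sketch.
    """
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    # The final block leaves the answer open for the model to complete.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)
```

The returned string can be passed to the tokenizer in place of `input_text` in the usage scripts.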

Out-of-Scope Use

More information needed.

Bias, Risks, and Limitations

The information in this section is copied from the model's official model card:

Language models, including Flan-T5, can potentially be used for language generation in a harmful way, according to Rae et al. (2021). Flan-T5 should not be used directly in any application, without a prior assessment of safety and fairness concerns specific to the application.

Ethical considerations and risks

Flan-T5 is fine-tuned on a large corpus of text data that was not filtered for explicit content or assessed for existing biases. As a result the model itself is potentially vulnerable to generating equivalently inappropriate content or replicating inherent biases in the underlying data.

Known Limitations

Flan-T5 has not been tested in real world applications.

Sensitive Use:

Flan-T5 should not be applied for any unacceptable use cases, e.g., generation of abusive speech.

Training Details

Training Data

Training Procedure


Testing Data, Factors & Metrics


