Transformers documentation

You are viewing the main version, which requires installation from source. If you'd like a regular pip install, check out the latest stable version (v4.52.3).

T5

T5 is an encoder-decoder transformer available in a range of sizes from 60M to 11B parameters. It is designed to handle a wide range of NLP tasks by treating them all as text-to-text problems. Because every task becomes a text generation task, no task-specific architecture is needed.

To formulate every task as text generation, the input for each task is prepended with a task-specific prefix (e.g., translate English to German: …, summarize: …). This enables T5 to handle tasks like translation, summarization, question answering, and more.
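
The prefixing step described above can be sketched in plain Python. The helper name `add_task_prefix` is hypothetical and not part of the Transformers API; it only illustrates that the task is encoded in the input string itself.

```python
def add_task_prefix(prefix: str, text: str) -> str:
    """Prepend a task prefix to the input text, T5-style.

    Hypothetical helper for illustration; it normalizes the prefix so that
    exactly one colon and one space separate it from the text.
    """
    clean_prefix = prefix.strip().rstrip(":")
    return f"{clean_prefix}: {text.strip()}"


# The same model input format covers different tasks; only the prefix changes.
translation_input = add_task_prefix("translate English to German", "The house is wonderful.")
summary_input = add_task_prefix("summarize:", "Some long article text.")
print(translation_input)
print(summary_input)
```

Which prefixes a given checkpoint responds to depends on how it was trained, so the prefixes here are examples rather than an exhaustive list.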

You can find all official T5 checkpoints under the T5 collection.

Click on the T5 models in the right sidebar for more examples of how to apply T5 to different language tasks.

The examples below demonstrate how to generate text with Pipeline or AutoModel, and how to translate with T5 from the command line.

Pipeline
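
A minimal sketch of the Pipeline route, under stated assumptions: the checkpoint name `google-t5/t5-small` is chosen here for illustration, the helper name `translate_with_pipeline` is hypothetical, and the `text2text-generation` task name is assumed to be available in the installed Transformers version. The heavy imports sit inside the function so that defining it does not download anything.

```python
PROMPT = "translate English to French: The weather is nice today."


def translate_with_pipeline(prompt: str = PROMPT, model_id: str = "google-t5/t5-small") -> str:
    """Run a text2text-generation pipeline on a prefixed prompt and return the text."""
    # Imported lazily: the model is only downloaded and loaded when this is called.
    from transformers import pipeline

    generator = pipeline(task="text2text-generation", model=model_id)
    # The pipeline returns a list with one dict per input.
    return generator(prompt)[0]["generated_text"]
```

Calling `translate_with_pipeline()` requires `transformers` and a backend such as PyTorch to be installed, plus network access on first use to fetch the checkpoint.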

AutoModel
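
A minimal sketch of the AutoModel route, which makes the tokenize, generate, and decode steps explicit. The checkpoint `google-t5/t5-small`, the helper name `translate_with_automodel`, and the `max_new_tokens=40` limit are assumptions made for illustration, not values taken from this page.

```python
PROMPT = "translate English to French: The weather is nice today."


def translate_with_automodel(prompt: str = PROMPT, model_id: str = "google-t5/t5-small") -> str:
    """Tokenize a prefixed prompt, generate with a seq2seq model, and decode the result."""
    # Imported lazily: nothing is downloaded until the function is called.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # The tokenizer returns input_ids and attention_mask as PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

This version runs on CPU by default; moving both the model and the tokenized inputs to the same accelerator device is left out to keep the sketch short.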

transformers CLI

Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the Quantization overview for more available quantization backends.

The example below uses torchao to quantize only the weights to int4.

# pip install torchao
import torch
from transformers import TorchAoConfig, AutoModelForSeq2SeqLM, AutoTokenizer

quantization_config = TorchAoConfig("int4_weight_only", group_size=128)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/t5-v1_1-xl",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    quantization_config=quantization_config
)

tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-xl")
# Move the tokenized inputs to the device that device_map="auto" selected for the model.
inputs = tokenizer("translate English to French: The weather is nice today.", return_tensors="pt").to(model.device)

output = model.generate(**inputs, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
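
To put the memory saving in rough numbers, the following back-of-the-envelope sketch compares weight storage at 16 bits and at 4 bits. It uses the 11B parameter count from the size range mentioned earlier. The 4 bytes of per-group metadata (scale and zero point) is an assumption for illustration, not a documented torchao figure, and the estimate ignores activations and any layers left unquantized.

```python
def weight_memory_gb(num_params: int, bits_per_weight: float) -> float:
    """Storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def int4_grouped_memory_gb(
    num_params: int,
    group_size: int = 128,
    bytes_per_group_metadata: int = 4,  # assumed: scale + zero point per group
) -> float:
    """4-bit weights plus the per-group metadata that grouped quantization stores."""
    packed = weight_memory_gb(num_params, 4)
    metadata = num_params / group_size * bytes_per_group_metadata / 1e9
    return packed + metadata


num_params = 11_000_000_000  # largest size in the range quoted above
bf16_gb = weight_memory_gb(num_params, 16)
int4_gb = int4_grouped_memory_gb(num_params)
print(f"bfloat16 weights: {bf16_gb:.2f} GB")
print(f"int4 weights (group_size=128): {int4_gb:.2f} GB")
```

Under these assumptions the weights shrink to roughly a quarter of their bfloat16 size, with a small overhead that grows as `group_size` gets smaller.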

Notes

You can pad the encoder inputs on the left or right because T5 uses relative scalar embeddings: attention depends on the offset between tokens, not on their absolute positions.
T5 models need a slightly higher learning rate than the default used in Trainer (5e-5). Typically, values of 1e-4 and 3e-4 work well for most tasks.
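
The padding note can be illustrated with a scalar sketch of relative position bucketing. This is a simplified re-implementation written for this page, not the library's own function. It assumes 32 buckets and a maximum distance of 128, and it defines the relative position as the key position minus the query position. The point is that the bucket, and therefore the learned bias, depends only on the offset between two tokens.

```python
import math


def relative_position_bucket(
    relative_position: int,
    bidirectional: bool = True,
    num_buckets: int = 32,
    max_distance: int = 128,
) -> int:
    """Map a token offset (key position minus query position) to a bucket index."""
    bucket = 0
    if bidirectional:
        # Half of the buckets cover positive offsets, half cover the rest.
        num_buckets //= 2
        if relative_position > 0:
            bucket += num_buckets
        relative_position = abs(relative_position)
    else:
        # Causal case: only keys at or before the query position are distinguished.
        relative_position = max(-relative_position, 0)

    # Small offsets each get their own bucket.
    max_exact = num_buckets // 2
    if relative_position < max_exact:
        return bucket + relative_position

    # Larger offsets share logarithmically spaced buckets up to max_distance.
    log_ratio = math.log(relative_position / max_exact) / math.log(max_distance / max_exact)
    large = max_exact + int(log_ratio * (num_buckets - max_exact))
    return bucket + min(large, num_buckets - 1)


# A query at position 2 attending to a key at position 5, without padding
# and with 3 pad tokens on the left: the offset, and so the bucket, is unchanged.
unpadded = relative_position_bucket(5 - 2)
left_padded = relative_position_bucket((5 + 3) - (2 + 3))
assert unpadded == left_padded
```

Shifting every position by the same amount leaves all offsets intact, which is why the padding side does not change the position bias between real tokens.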

T5Config

class transformers.T5Config

T5Tokenizer

class transformers.T5Tokenizer

build_inputs_with_special_tokens

get_special_tokens_mask

create_token_type_ids_from_sequences

save_vocabulary

T5TokenizerFast

class transformers.T5TokenizerFast

build_inputs_with_special_tokens

create_token_type_ids_from_sequences

T5Model

class transformers.T5Model

forward

T5ForConditionalGeneration

class transformers.T5ForConditionalGeneration

forward

T5EncoderModel

class transformers.T5EncoderModel

forward

T5ForSequenceClassification

class transformers.T5ForSequenceClassification

forward

T5ForTokenClassification

class transformers.T5ForTokenClassification

forward

T5ForQuestionAnswering

class transformers.T5ForQuestionAnswering

forward

TFT5Model

class transformers.TFT5Model

call

TFT5ForConditionalGeneration

class transformers.TFT5ForConditionalGeneration

call

TFT5EncoderModel

class transformers.TFT5EncoderModel

call

FlaxT5Model

class transformers.FlaxT5Model

__call__

encode

decode

FlaxT5ForConditionalGeneration

class transformers.FlaxT5ForConditionalGeneration

__call__

encode

decode

FlaxT5EncoderModel

class transformers.FlaxT5EncoderModel

__call__
