Utilities for Generation¶
This page lists all the utility functions used by generate(), greedy_search(), sample(), beam_search(), beam_sample(), and group_beam_search().
Most of those are only useful if you are studying the code of the generate methods in the library.
Generate Outputs¶
The output of generate() is an instance of a subclass of ModelOutput. This output is a data structure containing all the information returned by generate(), but it can also be used as a tuple or a dictionary.
Here’s an example:
from transformers import GPT2Tokenizer, GPT2LMHeadModel
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
inputs = tokenizer("Hello, my dog is cute and ", return_tensors="pt")
generation_output = model.generate(**inputs, return_dict_in_generate=True, output_scores=True)
The generation_output object is a GreedySearchDecoderOnlyOutput. As we can see in the documentation of that class below, this means it has the following attributes:
sequences: the generated sequences of tokens
scores (optional): the prediction scores of the language modeling head, for each generation step
hidden_states (optional): the hidden states of the model, for each generation step
attentions (optional): the attention weights of the model, for each generation step
Here we have the scores since we passed along output_scores=True, but we don’t have hidden_states and attentions because we didn’t pass output_hidden_states=True or output_attentions=True.
You can access each attribute as you normally would, and if that attribute has not been returned by the model, you will get None. Here, for instance, generation_output.scores holds all the generated prediction scores of the language modeling head, and generation_output.attentions is None.
When using our generation_output object as a tuple, it only keeps the attributes that don’t have None values. Here, for instance, it has two elements, sequences then scores, so generation_output[:2] will return the tuple (generation_output.sequences, generation_output.scores).
When using our generation_output object as a dictionary, it only keeps the attributes that don’t have None values. Here, for instance, it has two keys that are sequences and scores.
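The three access styles described above (attribute, tuple, dictionary) can be tried out with a short sketch. To keep it runnable without downloading weights, it assumes a tiny, randomly initialized GPT-2 built from a hand-written GPT2Config rather than the pretrained 'gpt2' checkpoint used earlier; the config values and input token ids are arbitrary, and the generated tokens are meaningless. Depending on the installed library version, the output object may carry additional non-None entries beyond sequences and scores, so the sketch checks for the presence of keys rather than an exact count.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Assumption: a tiny random GPT-2 stands in for the pretrained model so that
# nothing has to be downloaded. All sizes and token ids here are arbitrary.
config = GPT2Config(
    vocab_size=100, n_positions=32, n_embd=16, n_layer=1, n_head=2,
    bos_token_id=0, eos_token_id=0, pad_token_id=0,
)
model = GPT2LMHeadModel(config).eval()

input_ids = torch.tensor([[5, 6, 7]])
generation_output = model.generate(
    input_ids=input_ids,
    attention_mask=torch.ones_like(input_ids),
    max_length=8,
    do_sample=False,               # greedy search
    return_dict_in_generate=True,  # return a ModelOutput instead of a tensor
    output_scores=True,            # keep the per-step prediction scores
)

# Attribute access: attributes that were not requested are None.
sequences = generation_output.sequences
scores = generation_output.scores
print(generation_output.attentions)

# Tuple access: None-valued attributes are skipped, so the first two
# elements are sequences and scores.
as_tuple = generation_output[:2]

# Dictionary access: only non-None attributes appear as keys.
keys = list(generation_output.keys())
print(keys)
```

Here scores is a tuple with one tensor per generated token, each of shape (batch_size, vocab_size), which is why its length matches the number of tokens appended to the prompt.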
We document here all output types.
GreedySearchOutput¶
SampleOutput¶
BeamSearchOutput¶
BeamSampleOutput¶
LogitsProcessor¶
A LogitsProcessor can be used to modify the prediction scores of a language model head for generation.
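As a minimal sketch of what such a modification can look like, the example below defines a custom processor and applies it directly to a dummy score tensor. BanTokenLogitsProcessor is a hypothetical class written for this illustration, not one shipped with the library; the only library pieces assumed are the LogitsProcessor base class, whose __call__ takes input_ids and scores and returns the updated scores, and LogitsProcessorList, which applies its processors in order.

```python
import torch
from transformers import LogitsProcessor, LogitsProcessorList


class BanTokenLogitsProcessor(LogitsProcessor):
    """Hypothetical processor that makes one token id impossible to select."""

    def __init__(self, banned_token_id: int):
        self.banned_token_id = banned_token_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # Work on a copy so the caller's tensor is left untouched.
        scores = scores.clone()
        # A score of -inf gives the token zero probability after softmax.
        scores[:, self.banned_token_id] = -float("inf")
        return scores


processors = LogitsProcessorList([BanTokenLogitsProcessor(banned_token_id=2)])

# Dummy inputs: one sequence of two tokens, and uniform scores over 5 tokens.
input_ids = torch.tensor([[0, 1]])
scores = torch.zeros(1, 5)

processed = processors(input_ids, scores)
print(processed)
```

The processor receives the tokens generated so far (input_ids) along with the scores for the next token, so more elaborate processors can condition their changes on the sequence history.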