Transformers documentation
Pipelines
The pipelines are a great and easy way to use models for inference. These pipelines are objects that abstract most of the complex code from the library, offering a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction and Question Answering. See the task summary for examples of use.
There are two categories of pipeline abstractions to be aware of:
- The pipeline(), which is the most powerful object, encapsulating all other pipelines.
- The other task-specific pipelines:
  - AudioClassificationPipeline
  - AutomaticSpeechRecognitionPipeline
  - ConversationalPipeline
  - FeatureExtractionPipeline
  - FillMaskPipeline
  - ImageClassificationPipeline
  - ImageSegmentationPipeline
  - ObjectDetectionPipeline
  - QuestionAnsweringPipeline
  - SummarizationPipeline
  - TableQuestionAnsweringPipeline
  - TextClassificationPipeline
  - TextGenerationPipeline
  - Text2TextGenerationPipeline
  - TokenClassificationPipeline
  - TranslationPipeline
  - VisualQuestionAnsweringPipeline
  - ZeroShotClassificationPipeline
  - ZeroShotImageClassificationPipeline
The pipeline abstraction
The pipeline abstraction is a wrapper around all the other available pipelines. It is instantiated like any other pipeline but provides additional quality-of-life features.
Simple call on one item:
>>> from transformers import pipeline
>>> pipe = pipeline("text-classification")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
If you want to use a specific model from the hub, you can omit the task if the model on the hub already defines it:
>>> pipe = pipeline(model="roberta-large-mnli")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]
To call a pipeline on many items, you can call it with a list:
>>> pipe = pipeline("text-classification")
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
 {'label': 'NEGATIVE', 'score': 0.9996669292449951}]
To iterate over full datasets, it is recommended to use a dataset directly. This means you don’t need to allocate
the whole dataset at once, nor do you need to do batching yourself. This should work just as fast as custom loops on
GPU. If it doesn’t, don’t hesitate to create an issue.
import datasets
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from tqdm.auto import tqdm
pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")
# KeyDataset (only *pt*) will simply return the item in the dict returned by the dataset item
# as we're not interested in the *target* part of the dataset.
for out in tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
    # {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
    # {"text": ....}
    # ....
For ease of use, a generator is also possible:
from transformers import pipeline
pipe = pipeline("text-classification")
def data():
    while True:
        # This could come from a dataset, a database, a queue or HTTP request
        # in a server
        # Caveat: because this is iterative, you cannot use `num_workers > 1` variable
        # to use multiple threads to preprocess data. You can still have 1 thread that
        # does the preprocessing while the main runs the big inference
        yield "This is a test"
for out in pipe(data()):
    print(out)
    # {'label': ..., 'score': ...}
    # ....
transformers.pipeline
( task: str = None, model: typing.Optional = None, config: typing.Union[str, transformers.configuration_utils.PretrainedConfig, NoneType] = None, tokenizer: typing.Union[str, transformers.tokenization_utils.PreTrainedTokenizer, transformers.tokenization_utils_fast.PreTrainedTokenizerFast, NoneType] = None, feature_extractor: typing.Union[str, ForwardRef('SequenceFeatureExtractor'), NoneType] = None, framework: typing.Optional[str] = None, revision: typing.Optional[str] = None, use_fast: bool = True, use_auth_token: typing.Union[bool, str, NoneType] = None, device_map = None, torch_dtype = None, trust_remote_code: typing.Optional[bool] = None, model_kwargs: typing.Dict[str, typing.Any] = None, pipeline_class: typing.Optional[typing.Any] = None, **kwargs ) → Pipeline
Parameters
- task (str) — The task defining which pipeline will be returned. Currently accepted tasks are:
  - "audio-classification": will return an AudioClassificationPipeline.
  - "automatic-speech-recognition": will return an AutomaticSpeechRecognitionPipeline.
  - "conversational": will return a ConversationalPipeline.
  - "feature-extraction": will return a FeatureExtractionPipeline.
  - "fill-mask": will return a FillMaskPipeline.
  - "image-classification": will return an ImageClassificationPipeline.
  - "question-answering": will return a QuestionAnsweringPipeline.
  - "table-question-answering": will return a TableQuestionAnsweringPipeline.
  - "text2text-generation": will return a Text2TextGenerationPipeline.
  - "text-classification" (alias "sentiment-analysis" available): will return a TextClassificationPipeline.
  - "text-generation": will return a TextGenerationPipeline.
  - "token-classification" (alias "ner" available): will return a TokenClassificationPipeline.
  - "translation": will return a TranslationPipeline.
  - "translation_xx_to_yy": will return a TranslationPipeline.
  - "summarization": will return a SummarizationPipeline.
  - "zero-shot-classification": will return a ZeroShotClassificationPipeline.
- model (str or PreTrainedModel or TFPreTrainedModel, optional) — The model that will be used by the pipeline to make predictions. This can be a model identifier or an actual instance of a pretrained model inheriting from PreTrainedModel (for PyTorch) or TFPreTrainedModel (for TensorFlow). If not provided, the default for the task will be loaded.
- config (str or PretrainedConfig, optional) — The configuration that will be used by the pipeline to instantiate the model. This can be a model identifier or an actual pretrained model configuration inheriting from PretrainedConfig. If not provided, the default configuration file for the requested model will be used. That means that if model is given, its default configuration will be used. However, if model is not supplied, this task’s default model’s config is used instead.
- tokenizer (str or PreTrainedTokenizer, optional) — The tokenizer that will be used by the pipeline to encode data for the model. This can be a model identifier or an actual pretrained tokenizer inheriting from PreTrainedTokenizer. If not provided, the default tokenizer for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default tokenizer for config is loaded (if it is a string). However, if config is also not given or not a string, then the default tokenizer for the given task will be loaded.
- feature_extractor (str or PreTrainedFeatureExtractor, optional) — The feature extractor that will be used by the pipeline to encode data for the model. This can be a model identifier or an actual pretrained feature extractor inheriting from PreTrainedFeatureExtractor. Feature extractors are used for non-NLP models, such as Speech or Vision models as well as multi-modal models. Multi-modal models will also require a tokenizer to be passed. If not provided, the default feature extractor for the given model will be loaded (if it is a string). If model is not specified or not a string, then the default feature extractor for config is loaded (if it is a string). However, if config is also not given or not a string, then the default feature extractor for the given task will be loaded.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- revision (str, optional, defaults to "main") — When passing a task name or a string model identifier: The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
- use_fast (bool, optional, defaults to True) — Whether or not to use a Fast tokenizer if possible (a PreTrainedTokenizerFast).
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- device_map (str or Dict[str, Union[int, str, torch.device]], optional) — Sent directly as model_kwargs (just a simpler shortcut). When the accelerate library is present, set device_map="auto" to compute the most optimized device_map automatically. Do not use device_map AND device at the same time as they will conflict.
- torch_dtype (str or torch.dtype, optional) — Sent directly as model_kwargs (just a simpler shortcut) to use the available precision for this model (torch.float16, torch.bfloat16, … or "auto").
- trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom code defined on the Hub in their own modeling, configuration, tokenization or even pipeline files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
- model_kwargs — Additional dictionary of keyword arguments passed along to the model’s from_pretrained(..., **model_kwargs) function.
- kwargs — Additional keyword arguments passed along to the specific pipeline init (see the documentation for the corresponding pipeline class for possible values).
Returns
A suitable pipeline for the task.
Utility factory method to build a Pipeline.
Pipelines are made of:
- A tokenizer in charge of mapping raw textual input to tokens.
- A model to make predictions from the inputs.
- Some (optional) post-processing to enhance the model’s output.
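The fallback order described for the tokenizer and feature_extractor parameters (model string first, then config string, then the task default) can be summarized as a small decision function. This is only an illustrative sketch of the documented rule: default_source is a hypothetical helper, not a function from transformers, and it returns a label rather than loading anything.

```python
from typing import Optional


def default_source(model: object = None, config: object = None, task: Optional[str] = None) -> str:
    """Say which argument decides the default tokenizer, following the documented order.

    Hypothetical helper for illustration; it does not load any tokenizer.
    """
    if isinstance(model, str):
        # A model identifier was given as a string: its default tokenizer is used.
        return f"model:{model}"
    if isinstance(config, str):
        # model is missing or is an instance, but config is a string identifier.
        return f"config:{config}"
    # Neither model nor config is a string: fall back to the default for the task.
    return f"task:{task}"


print(default_source(model="bert-base-cased", task="fill-mask"))
print(default_source(model=object(), config="bert-base-cased", task="fill-mask"))
print(default_source(task="fill-mask"))
```

Passing an already instantiated model (the second call) skips the first branch, which is why the documentation recommends passing the tokenizer explicitly in that case.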
Examples:
>>> from transformers import pipeline, AutoModelForTokenClassification, AutoTokenizer
>>> # Sentiment analysis pipeline
>>> pipeline("sentiment-analysis")
>>> # Question answering pipeline, specifying the checkpoint identifier
>>> pipeline("question-answering", model="distilbert-base-cased-distilled-squad", tokenizer="bert-base-cased")
>>> # Named entity recognition pipeline, passing in a specific model and tokenizer
>>> model = AutoModelForTokenClassification.from_pretrained("dbmdz/bert-large-cased-finetuned-conll03-english")
>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
>>> pipeline("ner", model=model, tokenizer=tokenizer)
Pipeline batching
All pipelines can use batching. This works whenever the pipeline uses its streaming ability
(that is, when passing a list, a Dataset, or a generator).
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
import datasets
dataset = datasets.load_dataset("imdb", name="plain_text", split="unsupervised")
pipe = pipeline("text-classification", device=0)
for out in pipe(KeyDataset(dataset, "text"), batch_size=8, truncation="only_first"):
    print(out)
    # [{'label': 'POSITIVE', 'score': 0.9998743534088135}]
    # Exactly the same output as before, but the contents are passed
    # as batches to the model
However, this is not automatically a win for performance. It can be either a 10x speedup or a 5x slowdown depending on hardware, data and the actual model being used.
Example where it’s mostly a speedup:
from transformers import pipeline
from torch.utils.data import Dataset
from tqdm.auto import tqdm
pipe = pipeline("text-classification", device=0)
class MyDataset(Dataset):
    def __len__(self):
        return 5000
    def __getitem__(self, i):
        return "This is a test"
dataset = MyDataset()
for batch_size in [1, 8, 64, 256]:
    print("-" * 30)
    print(f"Streaming batch_size={batch_size}")
    for out in tqdm(pipe(dataset, batch_size=batch_size), total=len(dataset)):
        pass
# On GTX 970
------------------------------
Streaming no batching
100%|██████████████████████████████████████████████████████████████████████| 5000/5000 [00:26<00:00, 187.52it/s]
------------------------------
Streaming batch_size=8
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:04<00:00, 1205.95it/s]
------------------------------
Streaming batch_size=64
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:02<00:00, 2478.24it/s]
------------------------------
Streaming batch_size=256
100%|█████████████████████████████████████████████████████████████████████| 5000/5000 [00:01<00:00, 2554.43it/s]
(diminishing returns, saturated the GPU)
Example where it’s mostly a slowdown:
class MyDataset(Dataset):
    def __len__(self):
        return 5000
    def __getitem__(self, i):
        if i % 64 == 0:
            n = 100
        else:
            n = 1
        return "This is a test" * n
This is an occasional very long sentence compared to the others. In that case, the whole batch will need to be 400 tokens long, so the whole batch will be [64, 400] instead of [64, 4], leading to a severe slowdown. Even worse, on bigger batches, the program simply crashes.
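The padding cost behind this slowdown can be estimated with plain arithmetic, without running a model. The sketch below assumes the token counts quoted above (4 tokens for a short item, 400 for a long one) and assumes every batch is padded to its longest item; padded_tokens is a hypothetical helper, not part of transformers.

```python
def padded_tokens(lengths, batch_size):
    """Total number of token positions processed when each batch is padded to its longest item."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        total += len(batch) * max(batch)
    return total


# Same pattern as MyDataset above: every 64th item is 100 times longer than the rest.
lengths = [400 if i % 64 == 0 else 4 for i in range(5000)]

unbatched = padded_tokens(lengths, batch_size=1)
batched = padded_tokens(lengths, batch_size=64)
print(f"batch_size=1:  {unbatched} token positions")
print(f"batch_size=64: {batched} token positions ({batched / unbatched:.0f}x more work)")
```

With this data every batch of 64 contains exactly one long item, so almost all of the computation is spent on padding.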
------------------------------
Streaming no batching
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:05<00:00, 183.69it/s]
------------------------------
Streaming batch_size=8
100%|█████████████████████████████████████████████████████████████████████| 1000/1000 [00:03<00:00, 265.74it/s]
------------------------------
Streaming batch_size=64
100%|██████████████████████████████████████████████████████████████████████| 1000/1000 [00:26<00:00, 37.80it/s]
------------------------------
Streaming batch_size=256
  0%|                                                                                 | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/nicolas/src/transformers/test.py", line 42, in <module>
    for out in tqdm(pipe(dataset, batch_size=256), total=len(dataset)):
....
    q = q / math.sqrt(dim_per_head)  # (bs, n_heads, q_length, dim_per_head)
RuntimeError: CUDA out of memory. Tried to allocate 376.00 MiB (GPU 0; 3.95 GiB total capacity; 1.72 GiB already allocated; 354.88 MiB free; 2.46 GiB reserved in total by PyTorch)
There are no good (general) solutions for this problem, and your mileage may vary depending on your use cases. For users, a rule of thumb is:
- Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the only way to go.
- If you are latency constrained (live product doing inference), don’t batch.
- If you are using CPU, don’t batch.
- If you are throughput constrained (you want to run your model on a bunch of static data), on GPU, then:
  - If you have no clue about the size of the sequence_length (“natural” data), by default don’t batch. Measure, tentatively try to add batching, and add OOM checks to recover when it fails (and it will at some point if you don’t control the sequence_length).
  - If your sequence_length is super regular, then batching is more likely to be VERY interesting. Measure and push it until you get OOMs.
  - The larger the GPU, the more likely batching is going to be interesting.
- As soon as you enable batching, make sure you can handle OOMs nicely.
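One possible way to "handle OOMs nicely" is to retry the failing batch with a smaller batch size. The sketch below is not a transformers feature: run_with_fallback and fake_run_batch are hypothetical names, and the out-of-memory condition is simulated with MemoryError so the example runs without a GPU. With PyTorch you would instead catch the error shown in the traceback above.

```python
def run_with_fallback(run_batch, items, batch_size):
    """Process items in batches, halving the batch size whenever run_batch raises MemoryError."""
    results = []
    i = 0
    while i < len(items):
        batch = items[i:i + batch_size]
        try:
            results.extend(run_batch(batch))
            i += len(batch)
        except MemoryError:
            if batch_size == 1:
                raise  # a single item does not fit, nothing left to shrink
            batch_size = max(1, batch_size // 2)
    return results, batch_size


def fake_run_batch(batch, limit=1000):
    # Stand-in for a model call: "memory" use is batch size times the longest item.
    if len(batch) * max(len(x) for x in batch) > limit:
        raise MemoryError("simulated OOM")
    return [len(x) for x in batch]


# Mostly short items with an occasional very long one, as in the slowdown example.
items = ["x" * (400 if i % 64 == 0 else 4) for i in range(200)]
results, final_batch_size = run_with_fallback(fake_run_batch, items, batch_size=64)
print(len(results), final_batch_size)
```

This version never grows the batch size back, which keeps it simple but means one long item penalizes the rest of the run; whether that trade-off is acceptable is something to measure on your own load.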
Pipeline chunk batching
zero-shot-classification and question-answering are slightly specific in the sense that a single input might yield
multiple forward passes of a model. Under normal circumstances, this would cause issues with the batch_size argument.
To circumvent this issue, both of these pipelines are ChunkPipeline instead of
regular Pipeline. In short:
preprocessed = pipe.preprocess(inputs)
model_outputs = pipe.forward(preprocessed)
outputs = pipe.postprocess(model_outputs)
Now becomes:
all_model_outputs = []
for preprocessed in pipe.preprocess(inputs):
    model_outputs = pipe.forward(preprocessed)
    all_model_outputs.append(model_outputs)
outputs = pipe.postprocess(all_model_outputs)
This should be very transparent to your code because the pipelines are used in the same way.
This is a simplified view, since the pipeline can also handle the batching automatically! This means you don’t have to care
about how many forward passes your inputs are actually going to trigger; you can optimize the batch_size
independently of the inputs. The caveats from the previous section still apply.
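To illustrate why batch_size can be chosen independently of the inputs, here is a toy, model-free sketch of the chunked flow. It is not the ChunkPipeline implementation: preprocess, forward, postprocess and run are hypothetical stand-ins, the chunk size of 4 words is arbitrary, and the "model" just counts word overlap.

```python
def preprocess(item):
    """Yield one model input per chunk of the context (fixed 4-word chunks, an arbitrary choice)."""
    question, context = item
    words = context.split()
    for start in range(0, len(words), 4):
        yield {"question": question, "chunk": words[start:start + 4]}


def forward(batch):
    # Stand-in for the model: score each chunk by its word overlap with the question.
    scores = []
    for x in batch:
        question_words = set(x["question"].lower().split())
        chunk_words = {w.lower() for w in x["chunk"]}
        scores.append(len(question_words & chunk_words))
    return scores


def postprocess(all_model_outputs):
    # Combine all per-chunk outputs of one input into a single result.
    return max(all_model_outputs)


def run(inputs, batch_size):
    """Batch chunks across inputs, then regroup outputs per input before postprocessing."""
    chunks = [(idx, c) for idx, item in enumerate(inputs) for c in preprocess(item)]
    per_input = [[] for _ in inputs]
    forward_calls = 0
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        outputs = forward([c for _, c in batch])
        forward_calls += 1
        for (idx, _), out in zip(batch, outputs):
            per_input[idx].append(out)
    return [postprocess(outs) for outs in per_input], forward_calls


inputs = [
    ("where is paris", "Paris is the capital of France and lies on the Seine"),
    ("who wrote it", "It was written long ago"),
]
print(run(inputs, batch_size=2))
print(run(inputs, batch_size=8))
```

The two inputs produce five chunks in total. Changing batch_size changes only how many forward calls are made, not the per-input results.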
Pipeline custom code
If you want to override a specific pipeline, don’t hesitate to create an issue for your task at hand. The goal of the pipeline is to be easy to use and support most
cases, so transformers could maybe support your use case.
If you simply want to try it yourself, you can:
- Subclass your pipeline of choice
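from transformers import TextClassificationPipeline, pipeline
class MyPipeline(TextClassificationPipeline):
    def postprocess(self, model_outputs, **kwargs):
        # Your code goes here
        outputs = super().postprocess(model_outputs, **kwargs)
        # For example, rescale the scores here (scores = scores * 100)
        # And here
        return outputs
my_pipeline = MyPipeline(model=model, tokenizer=tokenizer, ...)
# or if you use *pipeline* function, then:
my_pipeline = pipeline(model="xxxx", pipeline_class=MyPipeline)
That should enable you to do all the custom code you want.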
Implementing a pipeline
The task specific pipelines
AudioClassificationPipeline
class transformers.AudioClassificationPipeline
( *args, **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU supports. Setting this to -1 will leverage CPU, a positive value will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating if the output of the pipeline should happen in a binary format (i.e., pickle) or as raw text.
Audio classification pipeline using any AutoModelForAudioClassification. This pipeline predicts the class of a
raw waveform or an audio file. In case of an audio file, ffmpeg should be installed to support multiple audio
formats.
This pipeline can currently be loaded from pipeline() using the following task identifier:
"audio-classification".
See the list of available models on huggingface.co/models.
__call__
( inputs: typing.Union[numpy.ndarray, bytes, str], **kwargs ) → A list of dict with the following keys
Parameters
- inputs (np.ndarray or bytes or str) — The inputs is either a raw waveform (np.ndarray of shape (n, ) of type np.float32 or np.float64) at the correct sampling rate (no further check will be done) or a str that is the filename of the audio file; the file will be read at the correct sampling rate to get the waveform using ffmpeg. This requires ffmpeg to be installed on the system. If inputs is bytes, it is supposed to be the content of an audio file and is interpreted by ffmpeg in the same way.
- top_k (int, optional, defaults to None) — The number of top labels that will be returned by the pipeline. If the provided number is None or higher than the number of labels available in the model configuration, it will default to the number of labels.
Returns
A list of dict with the following keys
- label (str) — The label predicted.
- score (float) — The corresponding probability.
Classify the sequence(s) given as inputs. See the AutomaticSpeechRecognitionPipeline documentation for more information.
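The top_k rule and the shape of the returned list can be sketched without a model. The probabilities below are made up, top_labels is a hypothetical helper rather than the pipeline's own code, and sorting by descending score is an assumption about the output order.

```python
def top_labels(scores, top_k=None):
    """Return the top_k highest-scoring labels as a list of {"label", "score"} dicts."""
    if top_k is None or top_k > len(scores):
        # As documented: None or a too-large value falls back to the number of labels.
        top_k = len(scores)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [{"label": label, "score": score} for label, score in ranked[:top_k]]


scores = {"dog_bark": 0.7, "siren": 0.2, "speech": 0.1}  # made-up probabilities

print(top_labels(scores, top_k=2))
print(len(top_labels(scores)), len(top_labels(scores, top_k=10)))
```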
AutomaticSpeechRecognitionPipeline
class transformers.AutomaticSpeechRecognitionPipeline
( feature_extractor: typing.Union[ForwardRef('SequenceFeatureExtractor'), str], *args, **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- feature_extractor (SequenceFeatureExtractor) — The feature extractor that will be used by the pipeline to encode waveform for the model.
- chunk_length_s (float, optional, defaults to 0) — The input length for each chunk. If chunk_length_s = 0 then chunking is disabled (default). Only available for CTC models, e.g. Wav2Vec2ForCTC. For more information on how to effectively use chunk_length_s, please have a look at the ASR chunking blog post.
- stride_length_s (float, optional, defaults to chunk_length_s / 6) — The length of stride on the left and right of each chunk. Used only with chunk_length_s > 0. This enables the model to see more context and infer letters better than without this context, but the pipeline discards the stride bits at the end to make the final reconstitution as perfect as possible. For more information on how to effectively use stride_length_s, please have a look at the ASR chunking blog post.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU supports. Setting this to -1 will leverage CPU, a positive value will run the model on the associated CUDA device id.
- decoder (pyctcdecode.BeamSearchDecoderCTC, optional) — PyCTCDecode’s BeamSearchDecoderCTC can be passed for language model boosted decoding. See Wav2Vec2ProcessorWithLM for more information.
Pipeline that aims at extracting spoken text contained within some audio.
The input can be either a raw waveform or an audio file. In the case of an audio file, ffmpeg should be installed to support multiple audio formats.
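To make the chunk_length_s and stride_length_s parameters more concrete, the sketch below computes overlapping chunk boundaries for a 16-second clip. It is an illustration under stated assumptions, not the pipeline's implementation: chunk_spans is a hypothetical helper, the 16 kHz sampling rate is assumed, and the stride is set to chunk_length_s / 6 as in the documented default.

```python
def chunk_spans(n_samples, chunk_len, stride):
    """Yield (start, end, left_stride, right_stride) in samples for overlapping chunks."""
    step = chunk_len - 2 * stride
    if step <= 0:
        raise ValueError("chunk_len must be larger than twice the stride")
    for start in range(0, n_samples, step):
        end = min(start + chunk_len, n_samples)
        left = 0 if start == 0 else stride          # nothing to discard before the first chunk
        right = 0 if end == n_samples else stride   # nothing to discard after the last chunk
        yield start, end, left, right
        if end == n_samples:
            break


sampling_rate = 16_000                   # assumed sampling rate
chunk_length_s = 6
stride_length_s = chunk_length_s / 6     # documented default
chunk_len = int(chunk_length_s * sampling_rate)
stride = int(stride_length_s * sampling_rate)

spans = list(chunk_spans(16 * sampling_rate, chunk_len, stride))
# Region of each chunk that is kept once the stride on both sides is discarded.
kept = [(start + left, end - right) for start, end, left, right in spans]
for (start, end, _, _), (k0, k1) in zip(spans, kept):
    print(f"model sees {start / sampling_rate:.0f}-{end / sampling_rate:.0f}s, keeps {k0 / sampling_rate:.0f}-{k1 / sampling_rate:.0f}s")
```

Each chunk gives the model extra context on both sides, while the kept regions tile the clip without gaps or overlap.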
__call__
( inputs: typing.Union[numpy.ndarray, bytes, str], **kwargs ) → Dict
Parameters
- inputs (np.ndarray or bytes or str or dict) — The inputs is either:
  - str that is the filename of the audio file; the file will be read at the correct sampling rate to get the waveform using ffmpeg. This requires ffmpeg to be installed on the system.
  - bytes, which is supposed to be the content of an audio file and is interpreted by ffmpeg in the same way.
  - (np.ndarray of shape (n, ) of type np.float32 or np.float64) Raw audio at the correct sampling rate (no further check will be done).
  - dict form can be used to pass raw audio sampled at arbitrary sampling_rate and let this pipeline do the resampling. The dict must be in the format {"sampling_rate": int, "raw": np.array} with optionally a "stride": (left: int, right: int) that can ask the pipeline to treat the first left samples and last right samples to be ignored in decoding (but used at inference to provide more context to the model). Only use stride with CTC models.
- return_timestamps (optional, str) — Only available for pure CTC models. If set to "char", the pipeline will return timestamps along the text for every character in the text. For instance, if you get [{"text": "h", "timestamps": (0.5, 0.6)}, {"text": "i", "timestamps": (0.7, 0.9)}], then it means the model predicts that the letter “h” was pronounced after 0.5 and before 0.6 seconds. If set to "word", the pipeline will return timestamps along the text for every word in the text. For instance, if you get [{"text": "hi ", "timestamps": (0.5, 0.9)}, {"text": "there", "timestamps": (1.0, 1.5)}], then it means the model predicts that the word “hi” was pronounced after 0.5 and before 0.9 seconds.
Returns
Dict
A dictionary with the following keys:
- text (str) — The recognized text.
- chunks (optional, List[Dict]) — When using return_timestamps, the chunks will become a list containing all the various text chunks identified by the model, e.g. [{"text": "hi ", "timestamps": (0.5, 0.9)}, {"text": "there", "timestamps": (1.0, 1.5)}]. The original full text can roughly be recovered by doing "".join(chunk["text"] for chunk in output["chunks"]).
Transcribe the audio sequence(s) given as inputs to text. See the AutomaticSpeechRecognitionPipeline documentation for more information.
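As a small illustration of the returned structure, the snippet below works on a hand-written dictionary shaped like the example in the Returns section (it is not real model output, and the value of "text" is assumed). It rebuilds the text from the chunks and prints each chunk with its time span.

```python
# Hand-written example shaped like the documented return value, not real model output.
output = {
    "text": "hi there",
    "chunks": [
        {"text": "hi ", "timestamps": (0.5, 0.9)},
        {"text": "there", "timestamps": (1.0, 1.5)},
    ],
}

# Roughly recover the full text from the chunks, as described above.
recovered = "".join(chunk["text"] for chunk in output["chunks"])

for chunk in output["chunks"]:
    start, end = chunk["timestamps"]
    print(f"{start:.1f}s - {end:.1f}s: {chunk['text'].strip()}")
print(recovered)
```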
ConversationalPipeline
class transformers.Conversation
( text: str = None, conversation_id: UUID = None, past_user_inputs = None, generated_responses = None )
Parameters
- text (str, optional) — The initial user input to start the conversation. If not provided, a user input needs to be provided manually using the add_user_input() method before the conversation can begin.
- conversation_id (uuid.UUID, optional) — Unique identifier for the conversation. If not provided, a random UUID4 id will be assigned to the conversation.
- past_user_inputs (List[str], optional) — Optional past history of the user's side of the conversation. You don’t need to pass it manually if you use the pipeline interactively, but if you want to recreate history you need to set both past_user_inputs and generated_responses with equal-length lists of strings.
- generated_responses (List[str], optional) — Optional past history of the model's side of the conversation. You don’t need to pass it manually if you use the pipeline interactively, but if you want to recreate history you need to set both past_user_inputs and generated_responses with equal-length lists of strings.
Utility class containing a conversation and its history. This class is meant to be used as an input to the
ConversationalPipeline. The conversation contains a number of utility functions to manage the addition of new
user inputs and generated model responses. A conversation needs to contain an unprocessed user input before being
passed to the ConversationalPipeline. This user input is either created when the class is instantiated, or by
calling conversation.add_user_input("input") after a conversation turn.
Usage:
conversation = Conversation("Going to the movies tonight - any suggestions?")
# Steps usually performed by the model when generating a response:
# 1. Mark the user input as processed (moved to the history)
conversation.mark_processed()
# 2. Append a model response
conversation.append_response("The Big Lebowski.")
conversation.add_user_input("Is it good?")
add_user_input
( text: str, overwrite: bool = False )
Add a user input to the conversation for the next round. This populates the internal new_user_input field.
append_response
( response: str )
Append a response to the list of generated responses.
iter_texts
Iterates over all blobs of the conversation.
Returns: Iterator of (is_user, text_chunk) in chronological order of the conversation. is_user is a bool,
text_chunk is a str.
mark_processed
Mark the conversation as processed (moves the content of new_user_input to past_user_inputs) and empties
the new_user_input field.
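The state changes described for these methods can be followed with a toy stand-in. ToyConversation below is a hypothetical, simplified class written for illustration; it is not transformers.Conversation and omits conversation_id and the overwrite flag.

```python
class ToyConversation:
    """Simplified stand-in mirroring the fields described above (illustration only)."""

    def __init__(self, text=None):
        self.new_user_input = text
        self.past_user_inputs = []
        self.generated_responses = []

    def add_user_input(self, text):
        # Populate the pending input for the next round.
        self.new_user_input = text

    def mark_processed(self):
        # Move the pending input to the history and empty the pending field.
        if self.new_user_input:
            self.past_user_inputs.append(self.new_user_input)
        self.new_user_input = None

    def append_response(self, response):
        self.generated_responses.append(response)

    def iter_texts(self):
        # Yield (is_user, text_chunk) pairs in chronological order.
        for user_input, response in zip(self.past_user_inputs, self.generated_responses):
            yield True, user_input
            yield False, response
        if self.new_user_input:
            yield True, self.new_user_input


conversation = ToyConversation("Going to the movies tonight - any suggestions?")
conversation.mark_processed()
conversation.append_response("The Big Lebowski.")
conversation.add_user_input("Is it good?")

for is_user, text in conversation.iter_texts():
    print("user:" if is_user else "bot: ", text)
```

After these calls the first exchange sits in the history and "Is it good?" is the unprocessed input, which is the state a conversation must be in before it is passed to the pipeline.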
class transformers.ConversationalPipeline
( *args, **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU supports. Setting this to -1 will leverage CPU, a positive value will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating if the output of the pipeline should happen in a binary format (i.e., pickle) or as raw text.
- min_length_for_response (int, optional, defaults to 32) — The minimum length (in number of tokens) for a response.
- minimum_tokens (int, optional, defaults to 10) — The minimum length of tokens to leave for a response.
Multi-turn conversational pipeline.
This conversational pipeline can currently be loaded from pipeline() using the following task identifier:
"conversational".
The models that this pipeline can use are models that have been fine-tuned on a multi-turn conversational task, currently: ‘microsoft/DialoGPT-small’, ‘microsoft/DialoGPT-medium’, ‘microsoft/DialoGPT-large’. See the up-to-date list of available models on huggingface.co/models.
Usage:
from transformers import Conversation, pipeline
conversational_pipeline = pipeline("conversational")
conversation_1 = Conversation("Going to the movies tonight - any suggestions?")
conversation_2 = Conversation("What's the last book you have read?")
conversational_pipeline([conversation_1, conversation_2])
conversation_1.add_user_input("Is it an action movie?")
conversation_2.add_user_input("What is the genre of this book?")
conversational_pipeline([conversation_1, conversation_2])
__call__
( conversations: typing.Union[transformers.pipelines.conversational.Conversation, typing.List[transformers.pipelines.conversational.Conversation]], num_workers = 0, **kwargs ) → Conversation or a list of Conversation
Parameters
- conversations (a Conversation or a list of Conversation) — Conversations to generate responses for.
- clean_up_tokenization_spaces (bool, optional, defaults to False) — Whether or not to clean up the potential extra spaces in the text output.
- generate_kwargs — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here).
Returns
Conversation or a list of Conversation
Conversation(s) with updated generated responses for those containing a new user input.
Generate responses for the conversation(s) given as inputs.
FeatureExtractionPipeline
class transformers.FeatureExtractionPipeline
< source >( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: int = -1 binary_output: bool = False **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id.
Feature extraction pipeline using no model head. This pipeline extracts the hidden states from the base transformer, which can be used as features in downstream tasks.
This feature extraction pipeline can currently be loaded from pipeline() using the task identifier:
"feature-extraction".
All models may be used for this pipeline. See a list of all models, including community-contributed models on huggingface.co/models.
__call__
< source >( *args **kwargs ) → A nested list of float
Extract the features of the input(s).
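The nested list returned by this pipeline holds one vector per token, so downstream code often reduces it to a single vector per input. The following is a minimal sketch of one common choice, mean pooling. The helper names `mean_pool` and `embed` are hypothetical and not part of the library, and the assumed output shape `[1][num_tokens][hidden_size]` for a single input string should be checked against your installed version.

```python
from typing import List


def mean_pool(token_vectors: List[List[float]]) -> List[float]:
    """Average per-token vectors into a single fixed-size vector."""
    if not token_vectors:
        raise ValueError("expected at least one token vector")
    n_tokens = len(token_vectors)
    hidden_size = len(token_vectors[0])
    return [sum(vec[d] for vec in token_vectors) / n_tokens for d in range(hidden_size)]


def embed(text: str) -> List[float]:
    """Hypothetical helper: run the pipeline, then pool the hidden states."""
    # Imported lazily because creating the pipeline downloads a default model.
    from transformers import pipeline

    extractor = pipeline("feature-extraction")
    features = extractor(text)  # assumed shape: [1][num_tokens][hidden_size]
    return mean_pool(features[0])


# Pooling two 2-dimensional token vectors into one vector.
print(mean_pool([[1.0, 2.0], [3.0, 4.0]]))
```

Mean pooling treats every token equally, including special tokens; other reductions (for example taking only the first token) may suit a given model better.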
FillMaskPipeline
class transformers.FillMaskPipeline
< source >( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: int = -1 binary_output: bool = False **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
- top_k (int, defaults to 5) — The number of predictions to return.
- targets (str or List[str], optional) — When passed, the model will limit the scores to the passed targets instead of looking up in the whole vocab. If the provided targets are not in the model vocab, they will be tokenized and the first resulting token will be used (with a warning, and that might be slower).
Masked language modeling prediction pipeline using any ModelWithLMHead. See the masked language modeling
examples for more information.
This mask filling pipeline can currently be loaded from pipeline() using the following task identifier:
"fill-mask".
The models that this pipeline can use are models that have been trained with a masked language modeling objective, which includes the bi-directional models in the library. See the up-to-date list of available models on huggingface.co/models.
This pipeline only works for inputs with exactly one token masked. Experimental: We added support for multiple masks. The returned values are raw model output, and correspond to disjoint probabilities where one might expect joint probabilities (See discussion).
__call__
< source >( inputs *args **kwargs ) → A list or a list of list of dict
Parameters
- args (str or List[str]) — One or several texts (or one list of prompts) with masked tokens.
- targets (str or List[str], optional) — When passed, the model will limit the scores to the passed targets instead of looking up in the whole vocab. If the provided targets are not in the model vocab, they will be tokenized and the first resulting token will be used (with a warning, and that might be slower).
- top_k (int, optional) — When passed, overrides the number of predictions to return.
Returns
A list or a list of list of dict
Each result comes as a list of dictionaries with the following keys:
- sequence (str) — The corresponding input with the mask token prediction.
- score (float) — The corresponding probability.
- token (int) — The predicted token id (to replace the masked one).
- token_str (str) — The predicted token (to replace the masked one).
Fill the masked token in the text(s) given as inputs.
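The interaction between `targets` and `top_k` can be sketched in plain Python. This is an illustration only, not the library's implementation: the function name `select_predictions`, the toy probability table, and the choice not to renormalize scores after restricting to the targets are all assumptions made for the example.

```python
from typing import Dict, List, Optional


def select_predictions(
    token_probs: Dict[str, float],
    top_k: int = 5,
    targets: Optional[List[str]] = None,
) -> List[Dict[str, object]]:
    """Keep the top_k highest-probability tokens, optionally limited to targets."""
    candidates = token_probs
    if targets is not None:
        # Only score the requested tokens; this sketch does not renormalize.
        candidates = {tok: p for tok, p in token_probs.items() if tok in targets}
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return [{"token_str": tok, "score": p} for tok, p in ranked[:top_k]]


# Toy distribution over candidate fillers for a single masked position.
probs = {"Paris": 0.62, "Lyon": 0.21, "London": 0.09, "Rome": 0.08}
print(select_predictions(probs, top_k=2))
print(select_predictions(probs, targets=["Rome", "Lyon"]))
```

With `targets` set, fewer than `top_k` results can come back when the target list is shorter than `top_k`.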
ImageClassificationPipeline
class transformers.ImageClassificationPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
Image classification pipeline using any AutoModelForImageClassification. This pipeline predicts the class of an
image.
This image classification pipeline can currently be loaded from pipeline() using the following task identifier:
"image-classification".
See the list of available models on huggingface.co/models.
__call__
< source >( images: typing.Union[str, typing.List[str], ForwardRef('Image.Image'), typing.List[ForwardRef('Image.Image')]] **kwargs )
Parameters
- images (str, List[str], PIL.Image or List[PIL.Image]) — The pipeline handles three types of images:
  - A string containing an HTTP(S) link pointing to an image
  - A string containing a local path to an image
  - An image loaded in PIL directly
  The pipeline accepts either a single image or a batch of images, which must then be passed as a list. Images in a batch must all be in the same format: all as HTTP(S) links, all as local paths, or all as PIL images.
- top_k (int, optional, defaults to 5) — The number of top labels that will be returned by the pipeline. If the provided number is higher than the number of labels available in the model configuration, it will default to the number of labels.
Assign labels to the image(s) passed as inputs.
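The `top_k` fallback described above can be illustrated with a small, library-free sketch. The function `top_labels`, the toy logits, and the label mapping are hypothetical; the softmax step is an assumption about how scores are produced and is included only to make the example complete.

```python
import math
from typing import Dict, List


def top_labels(logits: List[float], id2label: Dict[int, str], top_k: int = 5) -> List[dict]:
    """Softmax the logits and return the top_k labels with their scores."""
    # Asking for more labels than the model has falls back to all labels.
    top_k = min(top_k, len(id2label))
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    scores = [e / total for e in exps]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"label": id2label[i], "score": scores[i]} for i in ranked[:top_k]]


# A three-label model queried with top_k=5 returns only three entries.
result = top_labels([2.0, 0.5, 1.0], {0: "cat", 1: "dog", 2: "fox"}, top_k=5)
print([r["label"] for r in result])
```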
ImageSegmentationPipeline
class transformers.ImageSegmentationPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
Image segmentation pipeline using any AutoModelForXXXSegmentation. This pipeline predicts masks of objects and
their classes.
This image segmentation pipeline can currently be loaded from pipeline() using the following task identifier:
"image-segmentation".
See the list of available models on huggingface.co/models.
__call__
< source >( *args **kwargs )
Parameters
- images (str, List[str], PIL.Image or List[PIL.Image]) — The pipeline handles three types of images:
  - A string containing an HTTP(S) link pointing to an image
  - A string containing a local path to an image
  - An image loaded in PIL directly
  The pipeline accepts either a single image or a batch of images. Images in a batch must all be in the same format: all as HTTP(S) links, all as local paths, or all as PIL images.
- threshold (float, optional, defaults to 0.9) — The probability necessary to make a prediction.
- mask_threshold (float, optional, defaults to 0.5) — Threshold to use when turning the predicted masks into binary values.
Perform segmentation (detect masks & classes) in the image(s) passed as inputs.
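The two thresholds play different roles: `threshold` decides which predicted segments are kept at all, while `mask_threshold` turns each kept mask of per-pixel probabilities into 0/1 values. A minimal sketch under stated assumptions follows; the helper names, the dictionary layout of a segment, and the use of a strict `>` comparison are illustrative and not taken from the library.

```python
from typing import Dict, List


def keep_confident(segments: List[Dict], threshold: float = 0.9) -> List[Dict]:
    """Drop segments whose class probability does not exceed the threshold."""
    return [seg for seg in segments if seg["score"] > threshold]


def binarize_mask(mask_probs: List[List[float]], mask_threshold: float = 0.5) -> List[List[int]]:
    """Turn per-pixel probabilities into a binary mask."""
    return [[1 if p > mask_threshold else 0 for p in row] for row in mask_probs]


# Two toy 2x2 predictions; only the first is confident enough to keep.
segments = [
    {"label": "cat", "score": 0.97, "mask": [[0.8, 0.2], [0.6, 0.4]]},
    {"label": "sofa", "score": 0.55, "mask": [[0.1, 0.9], [0.3, 0.7]]},
]
kept = keep_confident(segments)
print([seg["label"] for seg in kept])
print(binarize_mask(kept[0]["mask"]))
```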
NerPipeline
class transformers.TokenClassificationPipeline
< source >( args_parser = <transformers.pipelines.token_classification.TokenClassificationArgumentHandler object at 0x7fd0f3244610> *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
- ignore_labels (List[str], defaults to ["O"]) — A list of labels to ignore.
- grouped_entities (bool, optional, defaults to False) — DEPRECATED, use aggregation_strategy instead. Whether or not to group the tokens corresponding to the same entity together in the predictions.
- aggregation_strategy (str, optional, defaults to "none") — The strategy to fuse (or not) tokens based on the model prediction.
  - "none": Will not do any aggregation and simply return raw results from the model.
  - "simple": Will attempt to group entities following the default schema. (A, B-TAG), (B, I-TAG), (C, I-TAG), (D, B-TAG2), (E, B-TAG2) will end up being [{"word": "ABC", "entity": "TAG"}, {"word": "D", "entity": "TAG2"}, {"word": "E", "entity": "TAG2"}]. Notice that two consecutive B tags will end up as different entities. On word-based languages, we might end up splitting words undesirably: imagine Microsoft being tagged as [{"word": "Micro", "entity": "ENTERPRISE"}, {"word": "soft", "entity": "NAME"}]. Look at "first", "max" and "average" for ways to mitigate that and disambiguate words (on languages that support that meaning, which is basically tokens separated by a space). These mitigations will only work on real words; "New york" might still be tagged with two different entities.
  - "first": (works only on word-based models) Will use the "simple" strategy except that words cannot end up with different tags. Words will simply use the tag of the first token of the word when there is ambiguity.
  - "average": (works only on word-based models) Will use the "simple" strategy except that words cannot end up with different tags. Scores will be averaged first across tokens, and then the maximum label is applied.
  - "max": (works only on word-based models) Will use the "simple" strategy except that words cannot end up with different tags. The word entity will simply be the token with the maximum score.
 
Named Entity Recognition pipeline using any ModelForTokenClassification. See the named entity recognition
examples for more information.
This token recognition pipeline can currently be loaded from pipeline() using the following task identifier:
"ner" (for predicting the classes of tokens in a sequence: person, organisation, location or miscellaneous).
The models that this pipeline can use are models that have been fine-tuned on a token classification task. See the up-to-date list of available models on huggingface.co/models.
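The grouping rule of the "simple" aggregation strategy can be sketched without the library. This is an illustration of the rule as described in the parameter list, not the pipeline's actual code: the function `group_simple` is hypothetical, tokens are joined by plain concatenation, and "O" tokens are skipped to mirror the default `ignore_labels`.

```python
from typing import Dict, List, Optional, Tuple


def group_simple(tagged: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Group (token, BIO tag) pairs; a B- tag always opens a new entity."""
    groups: List[Dict[str, str]] = []
    previous_entity: Optional[str] = None  # entity of the immediately preceding token
    for token, tag in tagged:
        if tag == "O":
            previous_entity = None  # an ignored token breaks adjacency
            continue
        prefix, _, entity = tag.partition("-")
        if prefix == "I" and previous_entity == entity:
            groups[-1]["word"] += token  # continue the current entity
        else:
            groups.append({"word": token, "entity": entity})
        previous_entity = entity
    return groups


# The example from the parameter description: two consecutive B tags stay separate.
tokens = [("A", "B-TAG"), ("B", "I-TAG"), ("C", "I-TAG"), ("D", "B-TAG2"), ("E", "B-TAG2")]
print(group_simple(tokens))
```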
aggregate_words
< source >( entities: typing.List[dict] aggregation_strategy: AggregationStrategy )
Override tokens from a given word that disagree to force agreement on word boundaries.
Example: micro|soft| com|pany| tagged as B-ENT I-NAME I-ENT I-ENT will be rewritten with the first strategy as microsoft| company| tagged as B-ENT I-ENT.
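A rough sketch of the "first" strategy applied to that example: each word is given as its sub-tokens with their predicted tags, and the tag of the first sub-token is used for the whole word. The function `first_strategy` and the input layout are assumptions made for illustration, not the method's real signature.

```python
from typing import List, Tuple


def first_strategy(words: List[List[Tuple[str, str]]]) -> List[Tuple[str, str]]:
    """For each word (a list of (subtoken, tag) pairs), the first tag wins."""
    merged = []
    for pieces in words:
        text = "".join(token for token, _ in pieces)
        merged.append((text, pieces[0][1]))  # tag of the first sub-token
    return merged


words = [
    [("micro", "B-ENT"), ("soft", "I-NAME")],  # sub-tokens disagree
    [("com", "I-ENT"), ("pany", "I-ENT")],     # sub-tokens agree
]
print(first_strategy(words))
```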
gather_pre_entities
< source >( sentence: str input_ids: ndarray scores: ndarray offset_mapping: typing.Union[typing.List[typing.Tuple[int, int]], NoneType] special_tokens_mask: ndarray aggregation_strategy: AggregationStrategy )
Fuse various numpy arrays into dicts with all the information needed for aggregation
group_entities
< source >( entities: typing.List[dict] )
Find and group together the adjacent tokens with the same entity predicted.
group_sub_entities
< source >( entities: typing.List[dict] )
Group together the adjacent tokens with the same entity predicted.
See TokenClassificationPipeline for all details.
ObjectDetectionPipeline
class transformers.ObjectDetectionPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
Object detection pipeline using any AutoModelForObjectDetection. This pipeline predicts bounding boxes of objects
and their classes.
This object detection pipeline can currently be loaded from pipeline() using the following task identifier:
"object-detection".
See the list of available models on huggingface.co/models.
__call__
< source >( *args **kwargs )
Parameters
- images (str, List[str], PIL.Image or List[PIL.Image]) — The pipeline handles three types of images:
  - A string containing an HTTP(S) link pointing to an image
  - A string containing a local path to an image
  - An image loaded in PIL directly
  The pipeline accepts either a single image or a batch of images. Images in a batch must all be in the same format: all as HTTP(S) links, all as local paths, or all as PIL images.
- threshold (float, optional, defaults to 0.9) — The probability necessary to make a prediction.
Detect objects (bounding boxes & classes) in the image(s) passed as inputs.
QuestionAnsweringPipeline
class transformers.QuestionAnsweringPipeline
< source >( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: PreTrainedTokenizer modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None device: int = -1 task: str = '' **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
Question Answering pipeline using any ModelForQuestionAnswering. See the question answering
examples for more information.
This question answering pipeline can currently be loaded from pipeline() using the following task identifier:
"question-answering".
The models that this pipeline can use are models that have been fine-tuned on a question answering task. See the up-to-date list of available models on huggingface.co/models.
__call__
< source >( *args **kwargs ) → A dict or a list of dict
Parameters
- args (SquadExample or a list of SquadExample) — One or several SquadExample containing the question and context.
- X (SquadExample or a list of SquadExample, optional) — One or several SquadExample containing the question and context (will be treated the same way as if passed as the first positional argument).
- data (SquadExample or a list of SquadExample, optional) — One or several SquadExample containing the question and context (will be treated the same way as if passed as the first positional argument).
- question (str or List[str]) — One or several question(s) (must be used in conjunction with the context argument).
- context (str or List[str]) — One or several context(s) associated with the question(s) (must be used in conjunction with the question argument).
- topk (int, optional, defaults to 1) — The number of answers to return (will be chosen by order of likelihood). Note that we return fewer than topk answers if there are not enough options available within the context.
- doc_stride (int, optional, defaults to 128) — If the context is too long to fit with the question for the model, it will be split in several chunks with some overlap. This argument controls the size of that overlap.
- max_answer_len (int, optional, defaults to 15) — The maximum length of predicted answers (i.e., only answers with a shorter length are considered).
- max_seq_len (int, optional, defaults to 384) — The maximum length of the total sentence (context + question) in tokens of each chunk passed to the model. The context will be split in several chunks (using doc_stride as overlap) if needed.
- max_question_len (int, optional, defaults to 64) — The maximum length of the question after tokenization. It will be truncated if needed.
- handle_impossible_answer (bool, optional, defaults to False) — Whether or not we accept impossible as an answer.
Returns
A dict or a list of dict
Each result comes as a dictionary with the following keys:
- score (float) — The probability associated to the answer.
- start (int) — The character start index of the answer (in the tokenized version of the input).
- end (int) — The character end index of the answer (in the tokenized version of the input).
- answer (str) — The answer to the question.
Answer the question(s) given as inputs by using the context(s).
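The way `doc_stride` and `max_seq_len` interact for long contexts can be sketched as a sliding window over tokens. This is a simplified illustration, not the tokenizer's actual logic: `split_with_overlap` is a hypothetical helper, and `chunk_len` stands for the room left for context tokens once the question and special tokens are accounted for.

```python
from typing import List


def split_with_overlap(tokens: List[str], chunk_len: int, doc_stride: int) -> List[List[str]]:
    """Split tokens into windows of chunk_len that overlap by doc_stride tokens."""
    if not 0 <= doc_stride < chunk_len:
        raise ValueError("doc_stride must be non-negative and smaller than chunk_len")
    step = chunk_len - doc_stride  # how far each new window advances
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + chunk_len])
        if start + chunk_len >= len(tokens):
            break  # the last window reached the end of the context
        start += step
    return chunks


# Ten context tokens, windows of four tokens, overlapping by two.
context_tokens = [f"t{i}" for i in range(10)]
chunks = split_with_overlap(context_tokens, chunk_len=4, doc_stride=2)
for chunk in chunks:
    print(chunk)
```

The overlap means an answer that straddles a window boundary still appears whole in at least one chunk, as long as it is no longer than the overlap.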
create_sample
< source >( question: typing.Union[str, typing.List[str]] context: typing.Union[str, typing.List[str]] ) → One or a list of SquadExample
QuestionAnsweringPipeline leverages the SquadExample internally. This helper method encapsulates all the
logic for converting question(s) and context(s) to SquadExample.
We currently support extractive question answering.
decode
< source >( start: ndarray end: ndarray topk: int max_answer_len: int undesired_tokens: ndarray )
Parameters
- start (np.ndarray) — Individual start probabilities for each token.
- end (np.ndarray) — Individual end probabilities for each token.
- topk (int) — Indicates how many possible answer span(s) to extract from the model output.
- max_answer_len (int) — Maximum size of the answer to extract from the model’s output.
- undesired_tokens (np.ndarray) — Mask determining tokens that can be part of the answer.
Takes the output of any ModelForQuestionAnswering and generates probabilities for each span to be the
actual answer.
In addition, it filters out some unwanted/impossible cases, such as the answer length being greater than max_answer_len or the answer end position being before the start position. The method supports returning the k best answers through the topk argument.
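The span selection described here can be sketched in pure Python: score every (start, end) pair by the product of its start and end probabilities, skip pairs that end before they start or exceed `max_answer_len`, and keep the `topk` best. The function `best_spans` is a hypothetical stand-in that works on plain lists and omits the `undesired_tokens` mask, so it should be read as an outline of the idea rather than the method itself.

```python
from typing import List, Tuple


def best_spans(
    start: List[float], end: List[float], topk: int, max_answer_len: int
) -> List[Tuple[int, int, float]]:
    """Return the topk (start_index, end_index, score) spans."""
    candidates = []
    for s, p_start in enumerate(start):
        # Only ends at or after the start, and at most max_answer_len tokens long.
        for e in range(s, min(s + max_answer_len, len(end))):
            candidates.append((s, e, p_start * end[e]))
    candidates.sort(key=lambda span: span[2], reverse=True)
    return candidates[:topk]


# Toy per-token probabilities for a three-token context.
start_probs = [0.1, 0.7, 0.2]
end_probs = [0.05, 0.15, 0.8]
spans = best_spans(start_probs, end_probs, topk=2, max_answer_len=2)
print([(s, e) for s, e, _ in spans])
```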
span_to_answer
< source >( text: str start: int end: int ) → Dictionary like `{‘answer’
When decoding from token probabilities, this method maps token indexes to actual word in the initial context.
SummarizationPipeline
class transformers.SummarizationPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or raw text.
Summarize news articles and other documents.
This summarizing pipeline can currently be loaded from pipeline() using the following task identifier:
"summarization".
The models that this pipeline can use are models that have been fine-tuned on a summarization task, currently: ‘bart-large-cnn’, ‘t5-small’, ‘t5-base’, ‘t5-large’, ‘t5-3b’, ‘t5-11b’. See the up-to-date list of available models on huggingface.co/models.
Usage:
from transformers import pipeline

# use bart in pytorch
summarizer = pipeline("summarization")
summarizer("An apple a day, keeps the doctor away", min_length=5, max_length=20)
# use t5 in tf
summarizer = pipeline("summarization", model="t5-base", tokenizer="t5-base", framework="tf")
summarizer("An apple a day, keeps the doctor away", min_length=5, max_length=20)
__call__
< source >( *args **kwargs ) → A list or a list of list of dict
Parameters
- documents (str or List[str]) — One or several articles (or one list of articles) to summarize.
- return_text (bool, optional, defaults to True) — Whether or not to include the decoded texts in the outputs.
- return_tensors (bool, optional, defaults to False) — Whether or not to include the tensors of predictions (as token indices) in the outputs.
- clean_up_tokenization_spaces (bool, optional, defaults to False) — Whether or not to clean up the potential extra spaces in the text output.
- generate_kwargs — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here).
Returns
A list or a list of list of dict
Each result comes as a dictionary with the following keys:
- summary_text (str, present when return_text=True) — The summary of the corresponding input.
- summary_token_ids (torch.Tensor or tf.Tensor, present when return_tensors=True) — The token ids of the summary.
Summarize the text(s) given as inputs.
TableQuestionAnsweringPipeline
class transformers.TableQuestionAnsweringPipeline
< source >( args_parser = <transformers.pipelines.table_question_answering.TableQuestionAnsweringArgumentHandler object at 0x7fd0f322deb0> *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
Table Question Answering pipeline using a ModelForTableQuestionAnswering. This pipeline is only available in
PyTorch.
This tabular question answering pipeline can currently be loaded from pipeline() using the following task
identifier: "table-question-answering".
The models that this pipeline can use are models that have been fine-tuned on a tabular question answering task. See the up-to-date list of available models on huggingface.co/models.
__call__
< source >( *args **kwargs ) → A dictionary or a list of dictionaries containing results
Parameters
- table (pd.DataFrame or Dict) — Pandas DataFrame or dictionary that will be converted to a DataFrame containing all the table values. See below for an example of a dictionary.
- query (str or List[str]) — Query or list of queries that will be sent to the model alongside the table.
- sequential (bool, optional, defaults to False) — Whether to do inference sequentially or as a batch. Batching is faster, but models like SQA require the inference to be done sequentially to extract relations within sequences, given their conversational nature.
- padding (bool, str or PaddingStrategy, optional, defaults to False) — Activates and controls padding. Accepts the following values:
  - True or 'longest': Pad to the longest sequence in the batch (or no padding if only a single sequence is provided).
  - 'max_length': Pad to a maximum length specified with the argument max_length or to the maximum acceptable input length for the model if that argument is not provided.
  - False or 'do_not_pad' (default): No padding (i.e., can output a batch with sequences of different lengths).
- truncation (bool, str or TapasTruncationStrategy, optional, defaults to False) — Activates and controls truncation. Accepts the following values:
  - True or 'drop_rows_to_fit': Truncate to a maximum length specified with the argument max_length or to the maximum acceptable input length for the model if that argument is not provided. This will truncate row by row, removing rows from the table.
  - False or 'do_not_truncate' (default): No truncation (i.e., can output a batch with sequence lengths greater than the model maximum admissible input size).
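To make the padding values concrete, here is a minimal pure-Python sketch of how the three options differ on a batch of token id lists. The helper name `pad_batch` and the `pad_id` argument are hypothetical and exist only for this illustration; the real padding is done by the tokenizer, not by this function.

```python
from typing import List, Optional, Union


def pad_batch(
    batch: List[List[int]],
    padding: Union[bool, str] = False,
    max_length: Optional[int] = None,
    pad_id: int = 0,  # hypothetical padding token id
) -> List[List[int]]:
    """Illustrative helper (not a library function) mimicking the padding options."""
    if padding is False or padding == "do_not_pad":
        # No padding: sequences keep their own lengths.
        return [list(seq) for seq in batch]
    if padding is True or padding == "longest":
        # Pad every sequence to the longest one in the batch.
        target = max(len(seq) for seq in batch)
    elif padding == "max_length":
        if max_length is None:
            raise ValueError("max_length is required in this sketch")
        target = max_length
    else:
        raise ValueError(f"Unknown padding value: {padding!r}")
    # Sequences already longer than the target are left untouched.
    return [list(seq) + [pad_id] * (target - len(seq)) for seq in batch]


padded = pad_batch([[1, 2, 3], [4]], padding="longest")
```

With 'longest', the shorter sequence is extended to three ids; with the default, both sequences would be returned unchanged.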
 
Returns
A dictionary or a list of dictionaries containing results
Each result is a dictionary with the following keys:
- answer (str) — The answer of the query given the table. If there is an aggregator, the answer will be preceded by AGGREGATOR >.
- coordinates (List[Tuple[int, int]]) — Coordinates of the cells of the answers.
- cells (List[str]) — List of strings made up of the answer cell values.
- aggregator (str) — If the model has an aggregator, this returns the aggregator.
Answers queries according to a table. The pipeline accepts several types of inputs which are detailed below:
- pipeline(table, query)
- pipeline(table, [query])
- pipeline(table=table, query=query)
- pipeline(table=table, query=[query])
- pipeline({"table": table, "query": query})
- pipeline({"table": table, "query": [query]})
- pipeline([{"table": table, "query": query}, {"table": table, "query": query}])
The table argument should be a dict or a DataFrame built from that dict, containing the whole table:
Example:
data = {
    "actors": ["brad pitt", "leonardo di caprio", "george clooney"],
    "age": ["56", "45", "59"],
    "number of movies": ["87", "53", "69"],
    "date of birth": ["7 february 1967", "10 june 1996", "28 november 1967"],
}
This dictionary can be passed in as is, or can be converted to a pandas DataFrame.
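The conversion to a DataFrame can be sketched as follows, assuming pandas is installed. The variable name `table` is only illustrative; the dictionary is the same as in the example.

```python
import pandas as pd

# Same table as in the example; every cell is kept as a string.
data = {
    "actors": ["brad pitt", "leonardo di caprio", "george clooney"],
    "age": ["56", "45", "59"],
    "number of movies": ["87", "53", "69"],
    "date of birth": ["7 february 1967", "10 june 1996", "28 november 1967"],
}

# Each dictionary key becomes a column and each list position becomes a row.
table = pd.DataFrame.from_dict(data)
```

Either `data` or `table` can then be used as the table argument of the pipeline.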
TextClassificationPipeline
class transformers.TextClassificationPipeline
< source >( **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
- return_all_scores (bool, optional, defaults to False) — Whether to return all prediction scores or just the one of the predicted class.
- function_to_apply (str, optional, defaults to "default") — The function to apply to the model outputs in order to retrieve the scores. Accepts four different values:
  - "default": if the model has a single label, will apply the sigmoid function on the output. If the model has several labels, will apply the softmax function on the output.
  - "sigmoid": Applies the sigmoid function on the output.
  - "softmax": Applies the softmax function on the output.
  - "none": Does not apply any function on the output.
Text classification pipeline using any ModelForSequenceClassification. See the sequence classification
examples for more information.
This text classification pipeline can currently be loaded from pipeline() using the following task identifier:
"sentiment-analysis" (for classifying sequences according to positive or negative sentiments).
If multiple classification labels are available (model.config.num_labels >= 2), the pipeline will run a softmax
over the results. If there is a single label, the pipeline will run a sigmoid over the result.
The models that this pipeline can use are models that have been fine-tuned on a sequence classification task. See the up-to-date list of available models on huggingface.co/models.
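The sigmoid/softmax rule described above can be illustrated with a small pure-Python sketch. The function name `postprocess_scores` is hypothetical and the code works on a plain list of logits rather than model tensors; it mirrors the documented behaviour of function_to_apply, not the library's actual implementation.

```python
import math
from typing import List


def sigmoid(logits: List[float]) -> List[float]:
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def softmax(logits: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def postprocess_scores(logits: List[float], function_to_apply: str = "default") -> List[float]:
    """Illustrative helper (not a library function)."""
    if function_to_apply == "default":
        # Single label: sigmoid. Several labels: softmax.
        function_to_apply = "sigmoid" if len(logits) == 1 else "softmax"
    if function_to_apply == "sigmoid":
        return sigmoid(logits)
    if function_to_apply == "softmax":
        return softmax(logits)
    if function_to_apply == "none":
        return list(logits)
    raise ValueError(f"Unknown function_to_apply: {function_to_apply!r}")


multi_label_scores = postprocess_scores([1.0, 2.0, 3.0])
single_label_scores = postprocess_scores([0.0])
```

With several logits the scores sum to one; with a single logit the score is an independent probability.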
__call__
< source >( *args **kwargs ) → A list or a list of list of dict
Parameters
- args (str or List[str] or Dict[str], or List[Dict[str]]) — One or several texts to classify. In order to use text pairs for your classification, you can send a dictionary containing {"text", "text_pair"} keys, or a list of those.
- top_k (int, optional, defaults to 1) — How many results to return.
- function_to_apply (str, optional, defaults to "default") — The function to apply to the model outputs in order to retrieve the scores. Accepts four different values. If this argument is not specified, then it will apply the following functions according to the number of labels:
  - If the model has a single label, will apply the sigmoid function on the output.
  - If the model has several labels, will apply the softmax function on the output.
  Possible values are:
  - "sigmoid": Applies the sigmoid function on the output.
  - "softmax": Applies the softmax function on the output.
  - "none": Does not apply any function on the output.
 
Returns
A list or a list of list of dict
Each result comes as a list of dictionaries with the following keys:
- label (str) — The label predicted.
- score (float) — The corresponding probability.
If top_k is used, one such dictionary is returned per label.
Classify the text(s) given as inputs.
TextGenerationPipeline
class transformers.TextGenerationPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
Language generation pipeline using any ModelWithLMHead. This pipeline predicts the words that will follow a
specified text prompt.
This language generation pipeline can currently be loaded from pipeline() using the following task identifier:
"text-generation".
The models that this pipeline can use are models that have been trained with an autoregressive language modeling objective, which includes the uni-directional models in the library (e.g. gpt2). See the list of available models on huggingface.co/models.
__call__
< source >( text_inputs **kwargs ) → A list or a list of list of dict
Parameters
- args (str or List[str]) — One or several prompts (or one list of prompts) to complete.
- return_tensors (bool, optional, defaults to False) — Whether or not to include the tensors of predictions (as token indices) in the outputs.
- return_text (bool, optional, defaults to True) — Whether or not to include the decoded texts in the outputs.
- return_full_text (bool, optional, defaults to True) — If set to False only added text is returned, otherwise the full text is returned. Only meaningful if return_text is set to True.
- clean_up_tokenization_spaces (bool, optional, defaults to False) — Whether or not to clean up the potential extra spaces in the text output.
- prefix (str, optional) — Prefix added to prompt.
- handle_long_generation (str, optional) — By default, this pipeline does not handle long generation (ones that exceed, in one form or another, the model maximum length). There is no perfect way to address this (more info: https://github.com/huggingface/transformers/issues/14033#issuecomment-948385227). This provides common strategies to work around that problem depending on your use case.
  - None: default strategy where nothing in particular happens.
  - "hole": Truncates left of input, and leaves a gap wide enough to let generation happen (might truncate a lot of the prompt and is not suitable when generation exceeds the model capacity).
- generate_kwargs — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here).
Returns
A list or a list of list of dict
Each result comes as a dictionary with the following keys:
- generated_text (str, present when return_text=True) — The generated text.
- generated_token_ids (torch.Tensor or tf.Tensor, present when return_tensors=True) — The token ids of the generated text.
Complete the prompt(s) given as inputs.
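The "hole" strategy of handle_long_generation can be pictured with the following pure-Python sketch. It assumes the prompt is a plain list of token ids; the function name and the `model_max_length` and `max_new_tokens` arguments are hypothetical names chosen for the illustration, not the pipeline's internal code.

```python
from typing import List


def truncate_left_for_generation(
    input_ids: List[int], model_max_length: int, max_new_tokens: int
) -> List[int]:
    """Illustrative helper: keep only the rightmost prompt tokens so that
    max_new_tokens still fit within model_max_length."""
    keep_length = model_max_length - max_new_tokens
    if keep_length <= 0:
        # The requested generation alone exceeds the model capacity.
        raise ValueError("max_new_tokens leaves no room for the prompt")
    # Truncate on the left: the most recent context is preserved.
    return input_ids[-keep_length:]


prompt_ids = list(range(10))  # a 10-token prompt
kept = truncate_left_for_generation(prompt_ids, model_max_length=8, max_new_tokens=3)
```

Here the model can hold 8 tokens and 3 are reserved for generation, so only the last 5 prompt tokens are kept. A prompt that already fits is returned unchanged.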
Text2TextGenerationPipeline
class transformers.Text2TextGenerationPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
Pipeline for text to text generation using seq2seq models.
This Text2TextGenerationPipeline pipeline can currently be loaded from pipeline() using the following task
identifier: "text2text-generation".
The models that this pipeline can use are models that have been fine-tuned on a translation task. See the up-to-date list of available models on huggingface.co/models.
Usage:
text2text_generator = pipeline("text2text-generation")
text2text_generator("question: What is 42 ? context: 42 is the answer to life, the universe and everything")
__call__
< source >( *args **kwargs ) → A list or a list of list of dict
Parameters
- args (str or List[str]) — Input text for the encoder.
- return_tensors (bool, optional, defaults to False) — Whether or not to include the tensors of predictions (as token indices) in the outputs.
- return_text (bool, optional, defaults to True) — Whether or not to include the decoded texts in the outputs.
- clean_up_tokenization_spaces (bool, optional, defaults to False) — Whether or not to clean up the potential extra spaces in the text output.
- truncation (TruncationStrategy, optional, defaults to TruncationStrategy.DO_NOT_TRUNCATE) — The truncation strategy for the tokenization within the pipeline. TruncationStrategy.DO_NOT_TRUNCATE (default) will never truncate, but it is sometimes desirable to truncate the input to fit the model's max_length instead of throwing an error down the line.
- generate_kwargs — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here).
Returns
A list or a list of list of dict
Each result comes as a dictionary with the following keys:
- generated_text (str, present when return_text=True) — The generated text.
- generated_token_ids (torch.Tensor or tf.Tensor, present when return_tensors=True) — The token ids of the generated text.
Generate the output text(s) using text(s) given as inputs.
Checks whether there might be something wrong with given input with regard to the model.
TokenClassificationPipeline
class transformers.TokenClassificationPipeline
< source >( args_parser = <transformers.pipelines.token_classification.TokenClassificationArgumentHandler object at 0x7fd0f3244610> *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
- ignore_labels (List[str], defaults to ["O"]) — A list of labels to ignore.
- grouped_entities (bool, optional, defaults to False) — DEPRECATED, use aggregation_strategy instead. Whether or not to group the tokens corresponding to the same entity together in the predictions.
- aggregation_strategy (str, optional, defaults to "none") — The strategy to fuse (or not) tokens based on the model prediction.
  - "none": Will not do any aggregation and simply return raw results from the model.
  - "simple": Will attempt to group entities following the default schema. (A, B-TAG), (B, I-TAG), (C, I-TAG), (D, B-TAG2) (E, B-TAG2) will end up being [{"word": "ABC", "entity": "TAG"}, {"word": "D", "entity": "TAG2"}, {"word": "E", "entity": "TAG2"}]. Notice that two consecutive B tags will end up as different entities. On word-based languages, we might end up splitting words undesirably: imagine Microsoft being tagged as [{"word": "Micro", "entity": "ENTERPRISE"}, {"word": "soft", "entity": "NAME"}]. Look at "first", "max" and "average" for ways to mitigate that and disambiguate words (on languages that support that meaning, which is basically tokens separated by a space). These mitigations will only work on real words; "New york" might still be tagged with two different entities.
  - "first": (works only on word-based models) Will use the "simple" strategy except that words cannot end up with different tags. Words will simply use the tag of the first token of the word when there is ambiguity.
  - "average": (works only on word-based models) Will use the "simple" strategy except that words cannot end up with different tags. Scores will be averaged first across tokens, and then the maximum label is applied.
  - "max": (works only on word-based models) Will use the "simple" strategy except that words cannot end up with different tags. Word entity will simply be the token with the maximum score.
 
Named Entity Recognition pipeline using any ModelForTokenClassification. See the named entity recognition
examples for more information.
This token recognition pipeline can currently be loaded from pipeline() using the following task identifier:
"ner" (for predicting the classes of tokens in a sequence: person, organisation, location or miscellaneous).
The models that this pipeline can use are models that have been fine-tuned on a token classification task. See the up-to-date list of available models on huggingface.co/models.
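The grouping performed by the "simple" aggregation strategy can be sketched in a few lines of pure Python. This is a simplified illustration that assumes every tag starts with B- or I- and that words are fused by plain concatenation; the function name `group_simple` is hypothetical and the real pipeline also handles scores, offsets and tokenizer-specific decoding.

```python
from typing import Dict, List, Tuple


def group_simple(tagged: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Illustrative helper: fuse (word, tag) pairs that belong to the same entity."""
    groups: List[Dict[str, str]] = []
    for word, tag in tagged:
        prefix, _, entity = tag.partition("-")
        # An I- tag continues the previous group only if the entity matches.
        continues = prefix == "I" and bool(groups) and groups[-1]["entity"] == entity
        if continues:
            groups[-1]["word"] += word
        else:
            # A B- tag (or a mismatching I- tag) always opens a new group.
            groups.append({"word": word, "entity": entity})
    return groups


grouped = group_simple(
    [("A", "B-TAG"), ("B", "I-TAG"), ("C", "I-TAG"), ("D", "B-TAG2"), ("E", "B-TAG2")]
)
```

The result has three groups: "ABC" with entity TAG, then "D" and "E" as two separate TAG2 entities, because two consecutive B tags are never merged.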
__call__
< source >( inputs: typing.Union[str, typing.List[str]] **kwargs ) → A list or a list of list of dict
Parameters
Returns
A list or a list of list of dict
Each result comes as a list of dictionaries (one for each token in the corresponding input, or each entity if this pipeline was instantiated with an aggregation_strategy) with the following keys:
- word (str) — The token/word classified.
- score (float) — The corresponding probability for entity.
- entity (str) — The entity predicted for that token/word (it is named entity_group when aggregation_strategy is not "none").
- index (int, only present when aggregation_strategy="none") — The index of the corresponding token in the sentence.
- start (int, optional) — The index of the start of the corresponding entity in the sentence. Only exists if the offsets are available within the tokenizer.
- end (int, optional) — The index of the end of the corresponding entity in the sentence. Only exists if the offsets are available within the tokenizer.
Classify each token of the text(s) given as inputs.
aggregate_words
< source >( entities: typing.List[dict] aggregation_strategy: AggregationStrategy )
Override tokens from a given word that disagree to force agreement on word boundaries.
Example: the tokens micro|soft| com|pany| tagged B-ENT I-NAME I-ENT I-ENT will be rewritten with the first strategy as microsoft| company| tagged B-ENT I-ENT.
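The same example can be reproduced with a short sketch of the "first" strategy. It assumes each word is already available as a list of (sub-token, tag) pairs; the function name `aggregate_first` and this input layout are hypothetical and chosen only for the illustration.

```python
from typing import List, Tuple


def aggregate_first(words: List[List[Tuple[str, str]]]) -> List[Tuple[str, str]]:
    """Illustrative helper: each word takes the tag of its first sub-token."""
    result = []
    for pieces in words:
        text = "".join(token for token, _ in pieces)
        first_tag = pieces[0][1]  # tags of later sub-tokens are overridden
        result.append((text, first_tag))
    return result


words = [
    [("micro", "B-ENT"), ("soft", "I-NAME")],
    [("com", "I-ENT"), ("pany", "I-ENT")],
]
aggregated = aggregate_first(words)
```

The disagreeing I-NAME tag on "soft" is dropped, so "microsoft" ends up as B-ENT and "company" as I-ENT.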
gather_pre_entities
< source >( sentence: str input_ids: ndarray scores: ndarray offset_mapping: typing.Union[typing.List[typing.Tuple[int, int]], NoneType] special_tokens_mask: ndarray aggregation_strategy: AggregationStrategy )
Fuse various numpy arrays into dicts with all the information needed for aggregation
group_entities
< source >( entities: typing.List[dict] )
Find and group together the adjacent tokens with the same entity predicted.
group_sub_entities
< source >( entities: typing.List[dict] )
Group together the adjacent tokens with the same entity predicted.
TranslationPipeline
class transformers.TranslationPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
Translates from one language to another.
This translation pipeline can currently be loaded from pipeline() using the following task identifier:
"translation_xx_to_yy".
The models that this pipeline can use are models that have been fine-tuned on a translation task. See the up-to-date list of available models on huggingface.co/models.
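The "translation_xx_to_yy" identifier encodes the source and target languages in the task name. A minimal sketch of how such an identifier could be split is shown below; the function name `parse_translation_task` is hypothetical and the code illustrates the naming scheme only, not the library's own parsing.

```python
from typing import Tuple


def parse_translation_task(task: str) -> Tuple[str, str]:
    """Illustrative helper: extract (source, target) from 'translation_xx_to_yy'."""
    parts = task.split("_")
    if len(parts) == 4 and parts[0] == "translation" and parts[2] == "to":
        return parts[1], parts[3]
    raise ValueError(f"Expected 'translation_XX_to_YY', got {task!r}")


src_lang, tgt_lang = parse_translation_task("translation_en_to_fr")
```

For "translation_en_to_fr" this yields "en" as the source language and "fr" as the target language.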
__call__
< source >( *args **kwargs ) → A list or a list of list of dict
Parameters
- args (str or List[str]) — Texts to be translated.
- return_tensors (bool, optional, defaults to False) — Whether or not to include the tensors of predictions (as token indices) in the outputs.
- return_text (bool, optional, defaults to True) — Whether or not to include the decoded texts in the outputs.
- clean_up_tokenization_spaces (bool, optional, defaults to False) — Whether or not to clean up the potential extra spaces in the text output.
- src_lang (str, optional) — The language of the input. Might be required for multilingual models. Will not have any effect for single pair translation models.
- tgt_lang (str, optional) — The language of the desired output. Might be required for multilingual models. Will not have any effect for single pair translation models.
- generate_kwargs — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here).
Returns
A list or a list of list of dict
Each result comes as a dictionary with the following keys:
- translation_text (str, present when return_text=True) — The translation.
- translation_token_ids (torch.Tensor or tf.Tensor, present when return_tensors=True) — The token ids of the translation.
Translate the text(s) given as inputs.
VisualQuestionAnsweringPipeline
class transformers.VisualQuestionAnsweringPipeline
< source >( *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline uses DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will use the CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the pipeline output should be in a binary format (i.e., pickle) or as raw text.
Visual Question Answering pipeline using an AutoModelForVisualQuestionAnswering. This pipeline is currently only
available in PyTorch.
This visual question answering pipeline can currently be loaded from pipeline() using the following task
identifiers: "visual-question-answering", "vqa".
The models that this pipeline can use are models that have been fine-tuned on a visual question answering task. See the up-to-date list of available models on huggingface.co/models.
__call__
< source >( image: typing.Union[ForwardRef('Image.Image'), str] question: str = None **kwargs ) → A dictionary or a list of dictionaries containing the result. The dictionaries contain the following keys
Parameters
- image (str, List[str], PIL.Image or List[PIL.Image]) — The pipeline handles three types of images:
  - A string containing an HTTP link pointing to an image
  - A string containing a local path to an image
  - An image loaded in PIL directly
  The pipeline accepts either a single image or a batch of images. If given a single image, it can be broadcast to multiple questions.
- question (str, List[str]) — The question(s) asked. If given a single question, it can be broadcast to multiple images.
- top_k (int, optional, defaults to 5) — The number of top labels that will be returned by the pipeline. If the provided number is higher than the number of labels available in the model configuration, it will default to the number of labels.
Returns
A dictionary or a list of dictionaries containing the result. The dictionaries contain the following keys:
- label (str) — The label identified by the model.
- score (float) — The score attributed by the model for that label.
Answers open-ended questions about images. The pipeline accepts several types of inputs which are detailed below:
- pipeline(image=image, question=question)
- pipeline({"image": image, "question": question})
- pipeline([{"image": image, "question": question}])
- pipeline([{"image": image, "question": question}, {"image": image, "question": question}])
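As a rough illustration of the top_k behaviour and the result shape described above, the sketch below ranks a handful of answers and caps top_k at the number of available labels. The helper name top_k_answers, the logits, and the three-answer vocabulary are hypothetical, and the sigmoid scoring is an assumption made for the example rather than a statement about the pipeline's internals.

```python
import math


def top_k_answers(logits, id2label, top_k=5):
    """Return up to top_k answers as {"label", "score"} dicts, best first."""
    # Cap top_k at the number of labels, as the parameter description states.
    k = min(top_k, len(id2label))
    # Assumption for this sketch: each logit is scored independently with a sigmoid.
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [{"label": id2label[i], "score": scores[i]} for i in ranked]


# Hypothetical logits for a three-answer vocabulary.
id2label = {0: "yes", 1: "no", 2: "2"}
answers = top_k_answers([2.0, -1.0, 0.5], id2label, top_k=5)
for answer in answers:
    print(answer)
```

Although top_k=5 is requested, only three dictionaries come back because the toy vocabulary has three labels.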
ZeroShotClassificationPipeline
class transformers.ZeroShotClassificationPipeline
< source >( args_parser = <transformers.pipelines.zero_shot_classification.ZeroShotClassificationArgumentHandler object> *args **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the output of the pipeline should happen in a binary format (i.e., pickle) or as raw text.
NLI-based zero-shot classification pipeline using a ModelForSequenceClassification trained on NLI (natural
language inference) tasks.
Any combination of sequences and labels can be passed and each combination will be posed as a premise/hypothesis pair and passed to the pretrained model. Then, the logit for entailment is taken as the logit for the candidate label being valid. Any NLI model can be used, but the id of the entailment label must be included in the model config’s label2id attribute (transformers.PretrainedConfig.label2id).
This NLI pipeline can currently be loaded from pipeline() using the following task identifier:
"zero-shot-classification".
The models that this pipeline can use are models that have been fine-tuned on an NLI task. See the up-to-date list of available models on huggingface.co/models.
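The pairing of sequences and labels described above can be sketched in plain Python. This is only an illustration: build_pairs is a hypothetical helper, not part of the library, and the comma-splitting of a label string is an assumption about how such input might be handled. The default template string is the one documented for hypothesis_template.

```python
def build_pairs(sequences, candidate_labels, hypothesis_template="This example is {}."):
    """Pose every (sequence, label) combination as a premise/hypothesis pair."""
    if isinstance(sequences, str):
        sequences = [sequences]
    if isinstance(candidate_labels, str):
        # Assumption: a single string may hold comma-separated labels.
        candidate_labels = [label.strip() for label in candidate_labels.split(",") if label.strip()]
    return [
        (sequence, hypothesis_template.format(label))
        for sequence in sequences
        for label in candidate_labels
    ]


pairs = build_pairs("Who are you voting for in 2020?", "politics, sports")
for premise, hypothesis in pairs:
    print(premise, "->", hypothesis)
```

Each pair would then be scored by the NLI model, with the entailment logit standing in for the validity of the label.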
__call__
< source >( sequences: typing.Union[str, typing.List[str]] *args **kwargs ) → A dict or a list of dict
Parameters
- sequences (str or List[str]) — The sequence(s) to classify, will be truncated if the model input is too large.
- candidate_labels (str or List[str]) — The set of possible class labels to classify each sequence into. Can be a single label, a string of comma-separated labels, or a list of labels.
- hypothesis_template (str, optional, defaults to "This example is {}.") — The template used to turn each label into an NLI-style hypothesis. This template must include a {} or similar syntax for the candidate label to be inserted into the template. For example, the default template is "This example is {}." With the candidate label "sports", this would be fed into the model like "<cls> sequence to classify <sep> This example is sports . <sep>". The default template works well in many cases, but it may be worthwhile to experiment with different templates depending on the task setting.
- multi_label (bool, optional, defaults to False) — Whether or not multiple candidate labels can be true. If False, the scores are normalized such that the sum of the label likelihoods for each sequence is 1. If True, the labels are considered independent and probabilities are normalized for each candidate by doing a softmax of the entailment score vs. the contradiction score.
Returns
A dict or a list of dict
Each result comes as a dictionary with the following keys:
- sequence (str) — The sequence for which this is the output.
- labels (List[str]) — The labels sorted by order of likelihood.
- scores (List[float]) — The probabilities for each of the labels.
Classify the sequence(s) given as inputs. See the ZeroShotClassificationPipeline documentation for more information.
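The two normalization modes described for multi_label can be sketched as follows. The function score_labels and the logits are hypothetical; the sketch assumes one entailment logit and one contradiction logit per candidate label and only mirrors the description above, not the library's actual code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def score_labels(sequence, labels, entailment_logits, contradiction_logits, multi_label=False):
    if multi_label:
        # Each label independently: softmax of contradiction vs. entailment, keep entailment.
        scores = [softmax([c, e])[1] for e, c in zip(entailment_logits, contradiction_logits)]
    else:
        # Labels compete: softmax over the entailment logits, so the scores sum to 1.
        scores = softmax(entailment_logits)
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    return {
        "sequence": sequence,
        "labels": [labels[i] for i in order],
        "scores": [scores[i] for i in order],
    }


# Hypothetical logits for two candidate labels.
single = score_labels("some text", ["politics", "sports"], [3.0, 0.0], [-2.0, 1.0])
multi = score_labels("some text", ["politics", "sports"], [3.0, 0.0], [-2.0, 1.0], multi_label=True)
print(single)
print(multi)
```

With multi_label=False the scores sum to 1; with multi_label=True each score lies between 0 and 1 on its own, so several labels can score high at once.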
ZeroShotImageClassificationPipeline
class transformers.ZeroShotImageClassificationPipeline
< source >( **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the output of the pipeline should happen in a binary format (i.e., pickle) or as raw text.
Zero shot image classification pipeline using CLIPModel. This pipeline predicts the class of an image when you
provide an image and a set of candidate_labels.
This image classification pipeline can currently be loaded from pipeline() using the following task identifier:
"zero-shot-image-classification".
See the list of available models on huggingface.co/models.
__call__
< source >( images: typing.Union[str, typing.List[str], ForwardRef('Image'), typing.List[ForwardRef('Image')]] **kwargs )
Parameters
- images (str, List[str], PIL.Image or List[PIL.Image]) — The pipeline handles three types of images:
  - A string containing an HTTP link pointing to an image
  - A string containing a local path to an image
  - An image loaded in PIL directly
- candidate_labels (List[str]) — The candidate labels for this image.
- hypothesis_template (str, optional, defaults to "This is a photo of {}") — The sentence used in conjunction with candidate_labels to attempt the image classification by replacing the placeholder with each candidate label. The likelihood is then estimated using logits_per_image.
Assign labels to the image(s) passed as inputs.
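To make the role of hypothesis_template concrete, here is a small sketch that fills the template for each candidate label and turns one image's logits into ranked scores. The helper rank_image_labels and the logits are hypothetical, and the softmax over logits_per_image is an assumption for illustration; in a real run the prompts would be encoded by the model to produce those logits.

```python
import math


def rank_image_labels(candidate_labels, logits_per_image, hypothesis_template="This is a photo of {}"):
    """Pair each label with its filled-in prompt and a softmax-normalized score."""
    prompts = [hypothesis_template.format(label) for label in candidate_labels]
    peak = max(logits_per_image)
    exps = [math.exp(x - peak) for x in logits_per_image]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(candidate_labels, prompts, probs), key=lambda item: item[2], reverse=True)
    return [{"label": label, "prompt": prompt, "score": score} for label, prompt, score in ranked]


# Hypothetical image-to-text logits for two candidate labels.
ranked = rank_image_labels(["cat", "dog"], [18.0, 21.0])
for entry in ranked:
    print(entry)
```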
Parent class: Pipeline
class transformers.Pipeline
< source >( model: typing.Union[ForwardRef('PreTrainedModel'), ForwardRef('TFPreTrainedModel')] tokenizer: typing.Optional[transformers.tokenization_utils.PreTrainedTokenizer] = None feature_extractor: typing.Optional[ForwardRef('SequenceFeatureExtractor')] = None modelcard: typing.Optional[transformers.modelcard.ModelCard] = None framework: typing.Optional[str] = None task: str = '' args_parser: ArgumentHandler = None device: int = -1 binary_output: bool = False **kwargs )
Parameters
- model (PreTrainedModel or TFPreTrainedModel) — The model that will be used by the pipeline to make predictions. This needs to be a model inheriting from PreTrainedModel for PyTorch and TFPreTrainedModel for TensorFlow.
- tokenizer (PreTrainedTokenizer) — The tokenizer that will be used by the pipeline to encode data for the model. This object inherits from PreTrainedTokenizer.
- modelcard (str or ModelCard, optional) — Model card attributed to the model for this pipeline.
- framework (str, optional) — The framework to use, either "pt" for PyTorch or "tf" for TensorFlow. The specified framework must be installed. If no framework is specified, will default to the one currently installed. If no framework is specified and both frameworks are installed, will default to the framework of the model, or to PyTorch if no model is provided.
- task (str, defaults to "") — A task-identifier for the pipeline.
- num_workers (int, optional, defaults to 8) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the number of workers to be used.
- batch_size (int, optional, defaults to 1) — When the pipeline will use DataLoader (when passing a dataset, on GPU for a PyTorch model), the size of the batch to use. For inference this is not always beneficial; please read Batching with pipelines.
- args_parser (ArgumentHandler, optional) — Reference to the object in charge of parsing supplied pipeline parameters.
- device (int, optional, defaults to -1) — Device ordinal for CPU/GPU support. Setting this to -1 will leverage CPU; a non-negative integer will run the model on the associated CUDA device id. You can pass a native torch.device too.
- binary_output (bool, optional, defaults to False) — Flag indicating whether the output of the pipeline should happen in a binary format (i.e., pickle) or as raw text.
The Pipeline class is the class from which all pipelines inherit. Refer to this class for methods shared across different pipelines.
Base class implementing pipelined operations. Pipeline workflow is defined as a sequence of the following operations:
Input -> Tokenization -> Model Inference -> Post-Processing (task dependent) -> Output
Pipeline supports running on CPU or GPU through the device argument (see below).
Some pipelines, such as FeatureExtractionPipeline ('feature-extraction'), output large tensor objects
as nested lists. To avoid dumping such large structures as textual data, we provide the binary_output
constructor argument. If set to True, the output will be stored in the pickle format.
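For readers unfamiliar with the pickle format mentioned above, the sketch below serializes a nested list to bytes and restores it using Python's standard pickle module. The features value is a made-up stand-in for a feature-extraction output, and the sketch does not show how the pipeline itself writes its output.

```python
import pickle

# Hypothetical feature-extraction output: one sequence, three tokens, four values per token.
features = [[[0.1 * i + j for j in range(4)] for i in range(3)]]

payload = pickle.dumps(features)   # binary serialization (bytes), not text
restored = pickle.loads(payload)   # the nested lists come back unchanged

print(type(payload).__name__, len(payload))
print(restored == features)
```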
check_model_type
< source >( supported_models: typing.Union[typing.List[str], dict] )
Check if the model class is supported by the pipeline.
Context manager allowing tensor allocation on the user-specified device in a framework-agnostic way.
ensure_tensor_on_device
< source >( **inputs ) → Dict[str, torch.Tensor]
Ensure PyTorch tensors are on the specified device.
Postprocess will receive the raw outputs of the _forward method, generally tensors, and reformat them into
something more friendly. Generally it will output a list or a dict of results (containing just strings and
numbers).
Scikit / Keras interface to transformers’ pipelines. This method will forward to __call__().
Preprocess will take the input_ of a specific pipeline and return a dictionary of everything necessary for
_forward to run properly. It should contain at least one tensor, but might have arbitrary other items.
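The preprocess, _forward and postprocess steps described for this class chain together in that order. The toy class below is a hypothetical stand-in written for illustration; it does not inherit from transformers.Pipeline, and its tiny vocabulary and averaging "model" are made up, with plain lists used in place of tensors.

```python
class ToyPipeline:
    """Hypothetical stand-in that mimics the three-step pipeline workflow."""

    def __init__(self, vocab):
        self.vocab = vocab  # made-up word-to-score table standing in for a tokenizer and model

    def preprocess(self, input_):
        # Turn raw text into a dictionary of model inputs.
        return {"input_ids": [self.vocab.get(token, 0) for token in input_.lower().split()]}

    def _forward(self, model_inputs):
        # Stand-in for model inference: average the token scores.
        ids = model_inputs["input_ids"]
        return {"logit": sum(ids) / max(len(ids), 1)}

    def postprocess(self, model_outputs):
        # Reformat raw outputs into a dict holding just strings and numbers.
        label = "POSITIVE" if model_outputs["logit"] > 0 else "NEGATIVE"
        return {"label": label, "score": model_outputs["logit"]}

    def __call__(self, input_):
        return self.postprocess(self._forward(self.preprocess(input_)))


pipe = ToyPipeline({"awesome": 1, "awful": -1})
result = pipe("This restaurant is awesome")
print(result)
```

Keeping the three steps separate means only preprocess and postprocess need to change from task to task, while the inference step stays the same shape.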
save_pretrained
< source >( save_directory: str )
Save the pipeline’s model and tokenizer.
Scikit / Keras interface to transformers’ pipelines. This method will forward to __call__().