Utilities for pipelines
This page lists all the utility functions the library provides for pipelines.
Most of these are only useful if you are studying the code of the pipelines in the library.
Argument handling
Base interface for handling arguments for each Pipeline.
Handles arguments for zero-shot text classification by turning each candidate label into an NLI premise/hypothesis pair.
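The conversion can be sketched as below. The helper name `make_nli_pairs` is hypothetical, and the `"This example is {}."` template mirrors the zero-shot pipeline's default hypothesis template; this is an illustration of the idea, not the library's implementation.

```python
# Illustrative sketch: turn candidate labels into NLI premise/hypothesis pairs.
# `make_nli_pairs` is a hypothetical helper, not part of transformers.
def make_nli_pairs(sequence, labels, hypothesis_template="This example is {}."):
    return [(sequence, hypothesis_template.format(label)) for label in labels]

# Each pair is then scored by an NLI model; the entailment probability
# of the hypothesis ranks the corresponding label.
pairs = make_nli_pairs("I love this movie", ["positive", "negative"])
```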
QuestionAnsweringPipeline requires the user to provide multiple arguments (i.e. question & context) which are mapped to an internal SquadExample.
QuestionAnsweringArgumentHandler manages all the possible ways to create a SquadExample from the command-line supplied arguments.
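The normalization the handler performs can be sketched as follows: accept either a single question/context pair or parallel lists, and emit one record per example. `normalize_qa_args` and the dict records are illustrative stand-ins for the real handler, which builds SquadExample objects.

```python
# Illustrative sketch (hypothetical helper, not the library code): normalize
# question/context arguments into one record per example.
def normalize_qa_args(question, context):
    questions = question if isinstance(question, list) else [question]
    contexts = context if isinstance(context, list) else [context]
    if len(questions) != len(contexts):
        raise ValueError("question and context must have the same length")
    # The real handler would build a SquadExample per pair here.
    return [{"question": q, "context": c} for q, c in zip(questions, contexts)]
```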
Data format
class transformers.PipelineDataFormat
< source >( output_path: Optional[str] input_path: Optional[str] column: Optional[str] overwrite: bool = False )
Base class for all the pipeline-supported data formats, both for reading and writing. Supported data formats currently include:
- JSON
- CSV
- stdin/stdout (pipe)
PipelineDataFormat
also includes some utilities to work with multi-column data, like mapping from dataset columns to
pipeline keyword arguments through the dataset_kwarg_1=dataset_column_1
format.
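The column-specification format can be sketched as below: a comma-separated list of either bare column names or `kwarg=column` pairs. `parse_columns` is an illustrative helper, not the library's parser.

```python
# Illustrative sketch (hypothetical helper): parse a column specification like
# "question=q,context=c" into (pipeline_kwarg, dataset_column) pairs.
def parse_columns(column_spec):
    mapping = []
    for part in column_spec.split(","):
        if "=" in part:
            kwarg, col = part.split("=", 1)
        else:
            # A bare name maps a column onto a kwarg of the same name.
            kwarg = col = part
        mapping.append((kwarg, col))
    return mapping
```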
from_str
< source >( format: str output_path: Optional[str] input_path: Optional[str] column: Optional[str] overwrite: bool = False ) → PipelineDataFormat
Parameters
- format (str) — The format of the desired pipeline. Acceptable values are "json", "csv" or "pipe".
- output_path (str, optional) — Where to save the outgoing data.
- input_path (str, optional) — Where to look for the input data.
- column (str, optional) — The column to read.
- overwrite (bool, optional, defaults to False) — Whether or not to overwrite the output_path.
Returns
The proper data format.
Creates an instance of the right subclass of PipelineDataFormat depending on format.
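The dispatch behind from_str can be sketched as a simple lookup from format name to class; the registry below uses placeholder strings in place of the real JsonPipelineDataFormat, CsvPipelineDataFormat and PipedPipelineDataFormat classes, and `resolve_format` is hypothetical.

```python
# Illustrative sketch (hypothetical names): map a format string to the class
# that handles it, rejecting anything outside the supported set.
FORMATS = {
    "json": "JsonPipelineDataFormat",
    "csv": "CsvPipelineDataFormat",
    "pipe": "PipedPipelineDataFormat",
}

def resolve_format(name):
    try:
        return FORMATS[name]
    except KeyError:
        raise KeyError(f"Unknown format {name!r}, expected one of {sorted(FORMATS)}")
```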
Save the provided data object with the representation for the current PipelineDataFormat.
save_binary
< source >( data: Union[dict, List[dict]] ) → str
Save the provided data object as pickle-formatted binary data on disk.
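The contract can be sketched in plain Python: pickle the data next to the output path and return the path written. `save_binary_sketch` and the `.pickle` extension swap are illustrative assumptions, not the library's code.

```python
import os
import pickle
import tempfile

# Illustrative sketch (hypothetical helper): pickle `data` to disk, deriving
# the binary path from the configured output path, and return where it landed.
def save_binary_sketch(data, output_path):
    binary_path = os.path.splitext(output_path)[0] + ".pickle"
    with open(binary_path, "wb") as f:
        pickle.dump(data, f)
    return binary_path

# Write into a temporary directory so the sketch is safe to run anywhere.
out = save_binary_sketch({"answer": "42"}, os.path.join(tempfile.gettempdir(), "result.json"))
```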
class transformers.CsvPipelineDataFormat
< source >( output_path: Optional[str] input_path: Optional[str] column: Optional[str] overwrite: bool = False )
Support for pipelines using CSV data format.
Save the provided data object with the representation for the current PipelineDataFormat.
class transformers.JsonPipelineDataFormat
< source >( output_path: Optional[str] input_path: Optional[str] column: Optional[str] overwrite: bool = False )
Support for pipelines using JSON file format.
Save the provided data object in a JSON file.
class transformers.PipedPipelineDataFormat
< source >( output_path: Optional[str] input_path: Optional[str] column: Optional[str] overwrite: bool = False )
Read data from piped input to the python process. For multi-column data, columns should be separated by \t.
If columns are provided, then the output will be a dictionary with {column_x: value_x}.
Print the data.
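How a piped line is interpreted can be sketched as below: tab-separated fields, optionally mapped to named columns. `parse_piped_line` is an illustrative helper, not the library's reader.

```python
# Illustrative sketch (hypothetical helper): interpret one line of piped input.
def parse_piped_line(line, columns=None):
    if "\t" not in line:
        # Single-column input is passed through as-is.
        return line
    fields = line.rstrip("\n").split("\t")
    if columns:
        # With named columns, produce a {column_x: value_x} dictionary.
        return {col: value for col, value in zip(columns, fields)}
    return fields
```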
Utilities
class transformers.pipelines.PipelineException
< source >( task: str model: str reason: str )
Raised by a Pipeline when handling __call__.
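The constructor signature above suggests an exception that carries the task, the model name and a human-readable reason; a minimal sketch of that shape (using the hypothetical name `PipelineError` rather than the library class) looks like this:

```python
# Illustrative sketch (hypothetical class): an exception carrying the failing
# task and model alongside the reason, mirroring the signature shown above.
class PipelineError(Exception):
    def __init__(self, task: str, model: str, reason: str):
        super().__init__(reason)
        self.task = task
        self.model = model
```

Keeping `task` and `model` as attributes lets callers log which pipeline configuration failed while `str(exc)` stays a plain message.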