Utilities for Image Processors
This page lists all the utility functions used by the image processors, mainly the functional transformations used to process the images.
Most of those are only useful if you are studying the code of the image processors in the library.
Image Transformations
transformers.image_transforms.center_crop
( image: ndarray, size: typing.Tuple[int, int], data_format: typing.Union[str, transformers.image_utils.ChannelDimension, NoneType] = None, return_numpy: typing.Optional[bool] = None ) → np.ndarray
Parameters
- image (np.ndarray) — The image to crop.
- size (Tuple[int, int]) — The target size for the cropped image.
- data_format (str or ChannelDimension, optional) — The channel dimension format for the output image. Can be one of:
  - "channels_first" or ChannelDimension.FIRST: image in (num_channels, height, width) format.
  - "channels_last" or ChannelDimension.LAST: image in (height, width, num_channels) format.
  If unset, will use the inferred format of the input image.
- return_numpy (bool, optional) — Whether or not to return the cropped image as a numpy array. Used for backwards compatibility with the previous ImageFeatureExtractionMixin method.
  - Unset: will return the same type as the input image.
  - True: will return a numpy array.
  - False: will return a PIL.Image.Image object.
Returns
np.ndarray
The cropped image.
Crops the image to the specified size using a center crop. Note that if the image is too small to be cropped to the size given, it will be padded (so the returned result will always be of size size).
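The crop-or-pad behaviour described above can be sketched in plain NumPy. This is an illustrative reimplementation, not the library's code: center_crop_sketch is a hypothetical helper, it assumes a channels-last (height, width, channels) array, and symmetric zero padding is an assumption about how a too-small image is filled.

```python
import numpy as np


def center_crop_sketch(image, size):
    """Center-crop a (height, width, channels) array to size=(crop_h, crop_w)."""
    crop_h, crop_w = size
    height, width = image.shape[:2]

    # Assumption: an image smaller than the target is zero-padded evenly on both sides.
    pad_h = max(crop_h - height, 0)
    pad_w = max(crop_w - width, 0)
    if pad_h or pad_w:
        image = np.pad(
            image,
            ((pad_h // 2, pad_h - pad_h // 2), (pad_w // 2, pad_w - pad_w // 2), (0, 0)),
        )
        height, width = image.shape[:2]

    # Take the window whose centre coincides with the image centre.
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return image[top : top + crop_h, left : left + crop_w]


image = np.arange(4 * 6 * 3).reshape(4, 6, 3)
cropped = center_crop_sketch(image, (2, 2))  # image larger than the target: cropped

small = np.ones((2, 2, 1))
padded = center_crop_sketch(small, (4, 4))  # image smaller than the target: padded
```

In both cases the result has exactly the requested height and width, which is the guarantee the description states.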
transformers.image_transforms.normalize
( image: ndarray, mean: typing.Union[float, typing.Iterable[float]], std: typing.Union[float, typing.Iterable[float]], data_format: typing.Optional[transformers.image_utils.ChannelDimension] = None )
Parameters
- image (np.ndarray) — The image to normalize.
- mean (float or Iterable[float]) — The mean to use for normalization.
- std (float or Iterable[float]) — The standard deviation to use for normalization.
- data_format (ChannelDimension, optional) — The channel dimension format of the output image. If None, will use the inferred format from the input.
Normalizes image using the mean and standard deviation specified by mean and std.
image = (image - mean) / std
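When mean and std are given per channel, the formula above has to broadcast along the channel axis. The following NumPy sketch shows one way to do that; normalize_sketch and its channels_first flag are hypothetical names for illustration, not the library's signature.

```python
import numpy as np


def _per_channel(stat, channels_first):
    """Shape a scalar or per-channel statistic so it broadcasts over the image."""
    stat = np.asarray(stat, dtype=np.float32)
    if stat.ndim == 1 and channels_first:
        # (C,) -> (C, 1, 1) so it lines up with a (C, H, W) image.
        return stat[:, None, None]
    # Scalars broadcast as-is; a (C,) vector already matches a (H, W, C) image.
    return stat


def normalize_sketch(image, mean, std, channels_first=False):
    """Apply (image - mean) / std with per-channel broadcasting."""
    image = image.astype(np.float32)
    return (image - _per_channel(mean, channels_first)) / _per_channel(std, channels_first)


hwc = np.ones((2, 2, 3))
out_last = normalize_sketch(hwc, mean=[0.5, 0.5, 0.5], std=[0.5, 0.25, 1.0])

chw = hwc.transpose(2, 0, 1)
out_first = normalize_sketch(chw, mean=[0.5, 0.5, 0.5], std=[0.5, 0.25, 1.0], channels_first=True)
```

Both layouts give the same per-channel values; only the axis along which the statistics are broadcast changes.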
transformers.image_transforms.rescale
( image: ndarray, scale: float, data_format: typing.Optional[transformers.image_utils.ChannelDimension] = None, dtype: np.dtype = np.float32 ) → np.ndarray
Parameters
- image (np.ndarray) — The image to rescale.
- scale (float) — The scale to use for rescaling the image.
- data_format (ChannelDimension, optional) — The channel dimension format of the image. If not provided, it will be the same as the input image.
- dtype (np.dtype, optional, defaults to np.float32) — The dtype of the output image. Used for backwards compatibility with feature extractors.
Returns
np.ndarray
The rescaled image.
Rescales image by scale.
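A common use of rescaling is mapping 8-bit pixel values into the range [0, 1] with scale = 1 / 255. The sketch below shows that arithmetic in NumPy; rescale_sketch is a hypothetical helper written for illustration, not the library function.

```python
import numpy as np


def rescale_sketch(image, scale, dtype=np.float32):
    """Multiply every pixel by scale, then cast the result to dtype."""
    return (image * scale).astype(dtype)


pixels = np.array([[0, 51, 255]], dtype=np.uint8)
rescaled = rescale_sketch(pixels, 1 / 255)
```

The multiplication is carried out in floating point before the cast, so the integer input does not overflow or truncate.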
transformers.image_transforms.resize
( image, size: typing.Tuple[int, int], resample = PIL.Image.Resampling.BILINEAR, data_format: typing.Optional[transformers.image_utils.ChannelDimension] = None, return_numpy: bool = True ) → np.ndarray
Parameters
- image (PIL.Image.Image or np.ndarray or torch.Tensor) — The image to resize.
- size (Tuple[int, int]) — The size to use for resizing the image.
- resample (int, optional, defaults to PIL.Image.Resampling.BILINEAR) — The filter to use for resampling.
- data_format (ChannelDimension, optional) — The channel dimension format of the output image. If None, will use the inferred format from the input.
- return_numpy (bool, optional, defaults to True) — Whether or not to return the resized image as a numpy array. If False, a PIL.Image.Image object is returned.
Returns
np.ndarray
The resized image.
Resizes image to (h, w) specified by size using the PIL library.
transformers.image_transforms.to_pil_image
( image: typing.Union[numpy.ndarray, PIL.Image.Image, torch.Tensor, tf.Tensor, jnp.Tensor], do_rescale: typing.Optional[bool] = None ) → PIL.Image.Image
Parameters
- image (PIL.Image.Image or numpy.ndarray or torch.Tensor or tf.Tensor) — The image to convert to the PIL.Image format.
- do_rescale (bool, optional) — Whether or not to apply the scaling factor (to make pixel values integers between 0 and 255). Will default to True if the image type is a floating type, False otherwise.
Returns
PIL.Image.Image
The converted image.
Converts image to a PIL Image. Optionally rescales it and puts the channel dimension back as the last axis if needed.
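The two preparation steps named above (optional rescaling to 0-255 and moving the channel axis last) can be sketched without PIL. prepare_for_pil_sketch is a hypothetical helper, and its channels-first test (a leading axis of size 1 or 3) is a simplifying assumption rather than the library's inference logic.

```python
import numpy as np


def prepare_for_pil_sketch(image, do_rescale=None):
    """Return a uint8, channels-last array of the kind PIL can consume."""
    # Assumption: a leading axis of size 1 or 3 marks a channels-first image.
    if image.ndim == 3 and image.shape[0] in (1, 3) and image.shape[-1] not in (1, 3):
        image = np.transpose(image, (1, 2, 0))

    # Default documented above: rescale only when the input is floating point.
    if do_rescale is None:
        do_rescale = np.issubdtype(image.dtype, np.floating)
    if do_rescale:
        image = image * 255

    return image.astype(np.uint8)


float_chw = np.full((3, 2, 4), 0.5, dtype=np.float32)
from_float = prepare_for_pil_sketch(float_chw)

uint8_hwc = np.full((2, 4, 3), 200, dtype=np.uint8)
from_uint8 = prepare_for_pil_sketch(uint8_hwc)
```

An integer image passes through unchanged, while a float image is scaled up and its channel axis is moved to the end.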
ImageProcessorMixin
This is a feature extraction mixin used to provide saving/loading functionality for sequential and image feature extractors.
from_dict
( feature_extractor_dict: typing.Dict[str, typing.Any], **kwargs ) → FeatureExtractionMixin
Parameters
- feature_extractor_dict (Dict[str, Any]) — Dictionary that will be used to instantiate the feature extractor object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the to_dict() method.
- kwargs (Dict[str, Any]) — Additional parameters from which to initialize the feature extractor object.
Returns
The feature extractor object instantiated from those parameters.
Instantiates a type of FeatureExtractionMixin from a Python dictionary of parameters.
from_json_file
( json_file: typing.Union[str, os.PathLike] ) → A feature extractor of type FeatureExtractionMixin
Parameters
- json_file (str or os.PathLike) — Path to the JSON file containing the parameters.
Returns
A feature extractor of type FeatureExtractionMixin
The feature_extractor object instantiated from that JSON file.
Instantiates a feature extractor of type FeatureExtractionMixin from the path to a JSON file of parameters.
from_pretrained
( pretrained_model_name_or_path: typing.Union[str, os.PathLike], **kwargs )
Parameters
- pretrained_model_name_or_path (str or os.PathLike) — This can be either:
  - a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.
  - a path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
  - a path or url to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
- cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
- force_download (bool, optional, defaults to False) — Whether or not to force to (re-)download the feature extractor files and override the cached versions if they exist.
- resume_download (bool, optional, defaults to False) — Whether or not to delete incompletely received file. Attempts to resume the download if such a file exists.
- proxies (Dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, or not specified, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
- revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
Instantiate a type of FeatureExtractionMixin from a feature extractor, e.g. a derived class of SequenceFeatureExtractor.
Examples:
from transformers import Wav2Vec2FeatureExtractor

# The base classes FeatureExtractionMixin and SequenceFeatureExtractor cannot be instantiated directly,
# so the examples use a derived class: Wav2Vec2FeatureExtractor
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(
"facebook/wav2vec2-base-960h"
) # Download feature_extraction_config from huggingface.co and cache.
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(
"./test/saved_model/"
) # E.g. feature_extractor (or model) was saved using *save_pretrained('./test/saved_model/')*
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("./test/saved_model/preprocessor_config.json")
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(
"facebook/wav2vec2-base-960h", return_attention_mask=False, foo=False
)
assert feature_extractor.return_attention_mask is False
feature_extractor, unused_kwargs = Wav2Vec2FeatureExtractor.from_pretrained(
"facebook/wav2vec2-base-960h", return_attention_mask=False, foo=False, return_unused_kwargs=True
)
assert feature_extractor.return_attention_mask is False
assert unused_kwargs == {"foo": False}
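The last two calls depend on how extra keyword arguments are split: a keyword that names an existing attribute overrides it, and the rest can be handed back when return_unused_kwargs=True. A minimal standard-library sketch of that behaviour follows. ToyExtractor and the sampling_rate attribute are hypothetical stand-ins, and the real from_dict may differ in detail.

```python
class ToyExtractor:
    """Hypothetical stand-in for a feature extractor; not a transformers class."""

    def __init__(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    @classmethod
    def from_dict(cls, params, return_unused_kwargs=False, **kwargs):
        extractor = cls(**params)
        unused = {}
        for name, value in kwargs.items():
            if hasattr(extractor, name):
                # A keyword that names an existing attribute overrides the stored value.
                setattr(extractor, name, value)
            else:
                # Anything else is not consumed and is reported back to the caller.
                unused[name] = value
        return (extractor, unused) if return_unused_kwargs else extractor


stored = {"sampling_rate": 16000, "return_attention_mask": True}
extractor, unused_kwargs = ToyExtractor.from_dict(
    stored, return_attention_mask=False, foo=False, return_unused_kwargs=True
)
```

Here return_attention_mask is overridden because it exists in the stored parameters, while foo is returned untouched in unused_kwargs.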
get_feature_extractor_dict
( pretrained_model_name_or_path: typing.Union[str, os.PathLike], **kwargs ) → Tuple[Dict, Dict]
From a pretrained_model_name_or_path, resolve to a dictionary of parameters, to be used for instantiating a feature extractor of type FeatureExtractionMixin using from_dict.
push_to_hub
( repo_id: str, use_temp_dir: typing.Optional[bool] = None, commit_message: typing.Optional[str] = None, private: typing.Optional[bool] = None, use_auth_token: typing.Union[bool, str, NoneType] = None, max_shard_size: typing.Union[int, str, NoneType] = '10GB', create_pr: bool = False, **deprecated_kwargs )
Parameters
- repo_id (str) — The name of the repository you want to push your feature extractor to. It should contain your organization name when pushing to a given organization.
- use_temp_dir (bool, optional) — Whether or not to use a temporary directory to store the files saved before they are pushed to the Hub. Will default to True if there is no directory named like repo_id, False otherwise.
- commit_message (str, optional) — Message to commit while pushing. Will default to "Upload feature extractor".
- private (bool, optional) — Whether or not the repository created should be private (requires a paying subscription).
- use_auth_token (bool or str, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface). Will default to True if repo_url is not specified.
- max_shard_size (int or str, optional, defaults to "10GB") — Only applicable for models. The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller than this size. If expressed as a string, needs to be digits followed by a unit (like "5MB").
- create_pr (bool, optional, defaults to False) — Whether or not to create a PR with the uploaded files or directly commit.
Upload the feature extractor file to the 🤗 Model Hub while synchronizing a local clone of the repo in repo_path_or_name.
Examples:
from transformers import AutoFeatureExtractor

feature_extractor = AutoFeatureExtractor.from_pretrained("bert-base-cased")
# Push the feature extractor to your namespace with the name "my-finetuned-bert".
feature_extractor.push_to_hub("my-finetuned-bert")
# Push the feature extractor to an organization with the name "my-finetuned-bert".
feature_extractor.push_to_hub("huggingface/my-finetuned-bert")
register_for_auto_class
( auto_class = 'AutoFeatureExtractor' )
Register this class with a given auto class. This should only be used for custom feature extractors as the ones in the library are already mapped with AutoFeatureExtractor.
This API is experimental and may have some slight breaking changes in the next releases.
save_pretrained
( save_directory: typing.Union[str, os.PathLike], push_to_hub: bool = False, **kwargs )
Parameters
- save_directory (str or os.PathLike) — Directory where the feature extractor JSON file will be saved (will be created if it does not exist).
- push_to_hub (bool, optional, defaults to False) — Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with repo_id (will default to the name of save_directory in your namespace).
- kwargs — Additional key word arguments passed along to the push_to_hub() method.
Save a feature_extractor object to the directory save_directory, so that it can be re-loaded using the from_pretrained() class method.
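The save and reload cycle can be pictured with the standard library alone. The sketch below is not the library's implementation: save_config and load_config are hypothetical helpers, and the only detail taken from this page is the preprocessor_config.json file name and the fact that loading accepts either a directory or the JSON file itself.

```python
import json
import os
import tempfile

CONFIG_NAME = "preprocessor_config.json"  # file name mentioned in the from_pretrained description


def save_config(params, save_directory):
    """Write params as JSON inside save_directory, creating the directory if needed."""
    os.makedirs(save_directory, exist_ok=True)
    path = os.path.join(save_directory, CONFIG_NAME)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(params, handle, indent=2, sort_keys=True)
    return path


def load_config(path_or_directory):
    """Accept either the directory or the path of the JSON file."""
    path = path_or_directory
    if os.path.isdir(path):
        path = os.path.join(path, CONFIG_NAME)
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


params = {"do_normalize": True, "image_mean": [0.5, 0.5, 0.5]}
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "my_model_directory")
    saved_path = save_config(params, target)
    from_directory = load_config(target)
    from_file = load_config(saved_path)
```

Both load paths return the same parameters, mirroring the directory and JSON-file options listed for pretrained_model_name_or_path.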
to_dict
( ) → Dict[str, Any]
Returns
Dict[str, Any]
Dictionary of all the attributes that make up this feature extractor instance.
Serializes this instance to a Python dictionary.
to_json_file
( json_file_path: typing.Union[str, os.PathLike] )
Save this instance to a JSON file.
to_json_string
( ) → str
Returns
str
String containing all the attributes that make up this feature_extractor instance in JSON format.
Serializes this instance to a JSON string.
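Taken together, to_dict, to_json_string and from_dict form a round trip: attributes to dictionary, dictionary to JSON text, and back to an instance. The sketch below illustrates that round trip with a hypothetical ToyProcessor class; the indent and sort_keys settings are assumptions chosen for stable output, not a statement about the library's formatting.

```python
import json


class ToyProcessor:
    """Hypothetical processor holding plain attributes; not a transformers class."""

    def __init__(self, size=224, do_rescale=True):
        self.size = size
        self.do_rescale = do_rescale

    def to_dict(self):
        # Copy the attributes so callers cannot mutate the instance through the result.
        return dict(vars(self))

    def to_json_string(self):
        # Assumption: indented, key-sorted JSON with a trailing newline.
        return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n"

    @classmethod
    def from_dict(cls, params):
        return cls(**params)


original = ToyProcessor(size=256, do_rescale=False)
json_text = original.to_json_string()
restored = ToyProcessor.from_dict(json.loads(json_text))
```

Because every attribute is JSON-serializable, the restored instance carries the same attributes as the original.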