TimmWrapper
Overview
A helper class that enables timm models to be loaded and used with the transformers library and its autoclasses.
>>> import torch
>>> from PIL import Image
>>> from urllib.request import urlopen
>>> from transformers import AutoModelForImageClassification, AutoImageProcessor
>>> # Load image
>>> image = Image.open(urlopen(
... 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))
>>> # Load model and image processor
>>> checkpoint = "timm/resnet50.a1_in1k"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModelForImageClassification.from_pretrained(checkpoint).eval()
>>> # Preprocess image
>>> inputs = image_processor(image)
>>> # Forward pass
>>> with torch.no_grad():
... logits = model(**inputs).logits
>>> # Get top 5 predictions
>>> top5_probabilities, top5_class_indices = torch.topk(logits.softmax(dim=1) * 100, k=5)
TimmWrapperConfig
class transformers.TimmWrapperConfig
( initializer_range: float = 0.02, do_pooling: bool = True, **kwargs )
This is the configuration class to store the configuration for a timm backbone TimmWrapper. It is used to instantiate a timm model according to the specified arguments, defining the model.
Configuration objects inherit from PretrainedConfig and can be used to control the model outputs. Read the documentation from PretrainedConfig for more information.
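For illustration, a minimal sketch of loading the configuration from a Hub checkpoint and building a model with freshly initialized weights; the checkpoint name is reused from the examples below, and the printed do_pooling value assumes the checkpoint does not override the default:
>>> from transformers import TimmWrapperConfig, TimmWrapperModel
>>> # Load the configuration of a timm checkpoint from the Hub
>>> config = TimmWrapperConfig.from_pretrained("timm/resnet50.a1_in1k")
>>> # do_pooling controls whether the last hidden state is pooled in TimmWrapperModel
>>> config.do_pooling
True
>>> # Instantiate a model with random weights from the configuration
>>> model = TimmWrapperModel(config)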
TimmWrapperImageProcessor
class transformers.TimmWrapperImageProcessor
( pretrained_cfg: typing.Dict[str, typing.Any], architecture: typing.Optional[str] = None, **kwargs )
Wrapper class for timm models to be used within transformers.
preprocess
( images: typing.Union['PIL.Image.Image', numpy.ndarray, 'torch.Tensor', typing.List['PIL.Image.Image'], typing.List[numpy.ndarray], typing.List['torch.Tensor']], return_tensors: typing.Union[str, transformers.utils.generic.TensorType, NoneType] = 'pt' )
Preprocess an image or batch of images.
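As a brief illustration, the processor can be run on its own; the checkpoint and image URL match the examples in this document, and the exact tensor shape depends on the checkpoint's preprocessing config:
>>> from urllib.request import urlopen
>>> from PIL import Image
>>> from transformers import AutoImageProcessor
>>> image = Image.open(urlopen(
... 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))
>>> image_processor = AutoImageProcessor.from_pretrained("timm/resnet50.a1_in1k")
>>> inputs = image_processor(image)  # returns PyTorch tensors by default (return_tensors='pt')
>>> print(inputs["pixel_values"].shape)  # (batch_size, num_channels, height, width)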
TimmWrapperModel
class transformers.TimmWrapperModel
Wrapper class for timm models to be used in transformers.
forward
( pixel_values: FloatTensor, output_attentions: typing.Optional[bool] = None, output_hidden_states: typing.Union[bool, typing.List[int], NoneType] = None, return_dict: typing.Optional[bool] = None, do_pooling: typing.Optional[bool] = None, **kwargs ) → transformers.models.timm_wrapper.modeling_timm_wrapper.TimmWrapperModelOutput or tuple(torch.FloatTensor)
Parameters
- pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) — Pixel values. Pixel values can be obtained using AutoImageProcessor. See TimmWrapperImageProcessor.preprocess() for details.
- output_attentions (bool, optional) — Whether or not to return the attentions tensors of all attention layers. Not compatible with timm wrapped models.
- output_hidden_states (bool or list[int], optional) — Whether or not to return the hidden states of all layers. Can also be a list of layer indices, in which case only the hidden states of those layers are returned.
- return_dict (bool, optional) — Whether or not to return a ModelOutput instead of a plain tuple.
- do_pooling (bool, optional) — Whether to do pooling for the last_hidden_state in TimmWrapperModel or not. If None is passed, the do_pooling value from the config is used.
- **kwargs — Additional keyword arguments passed along to the timm model forward.
Returns
transformers.models.timm_wrapper.modeling_timm_wrapper.TimmWrapperModelOutput or tuple(torch.FloatTensor)
A transformers.models.timm_wrapper.modeling_timm_wrapper.TimmWrapperModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (TimmWrapperConfig) and inputs.
- last_hidden_state (torch.FloatTensor) — The last hidden state of the model, output before applying the classification head.
- pooler_output (torch.FloatTensor, optional) — The pooled output derived from the last hidden state, if applicable.
- hidden_states (tuple(torch.FloatTensor), optional) — A tuple containing the intermediate hidden states of the model at the output of each layer or specified layers. Returned if output_hidden_states=True is set or if config.output_hidden_states=True.
- attentions (tuple(torch.FloatTensor), optional) — A tuple containing the intermediate attention weights of the model at the output of each layer. Returned if output_attentions=True is set or if config.output_attentions=True. Note: Currently, timm models do not support attentions output.
The TimmWrapperModel forward method overrides the __call__ special method.
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Examples:
>>> import torch
>>> from PIL import Image
>>> from urllib.request import urlopen
>>> from transformers import AutoModel, AutoImageProcessor
>>> # Load image
>>> image = Image.open(urlopen(
... 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))
>>> # Load model and image processor
>>> checkpoint = "timm/resnet50.a1_in1k"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModel.from_pretrained(checkpoint).eval()
>>> # Preprocess image
>>> inputs = image_processor(image)
>>> # Forward pass
>>> with torch.no_grad():
... outputs = model(**inputs)
>>> # Get pooled output
>>> pooled_output = outputs.pooler_output
>>> # Get last hidden state
>>> last_hidden_state = outputs.last_hidden_state
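Building on the example above, the signature also allows output_hidden_states to be a list of layer indices, and do_pooling can be overridden per call; a hedged sketch, with illustrative indices:
>>> # Request hidden states only for selected layers (indices are illustrative)
>>> with torch.no_grad():
...     outputs = model(**inputs, output_hidden_states=[1, 2, 3])
>>> hidden_states = outputs.hidden_states  # one tensor per requested layer
>>> # Override the config's do_pooling for this call only
>>> with torch.no_grad():
...     outputs = model(**inputs, do_pooling=False)
>>> pooled = outputs.pooler_output  # None when pooling is skipped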
TimmWrapperForImageClassification
class transformers.TimmWrapperForImageClassification
Wrapper class for timm models to be used in transformers for image classification.
forward
( pixel_values: FloatTensor, labels: typing.Optional[torch.LongTensor] = None, output_attentions: typing.Optional[bool] = None, output_hidden_states: typing.Union[bool, typing.List[int], NoneType] = None, return_dict: typing.Optional[bool] = None, **kwargs ) → transformers.modeling_outputs.ImageClassifierOutput or tuple(torch.FloatTensor)
Parameters
- pixel_values (torch.FloatTensor of shape (batch_size, num_channels, height, width)) — Pixel values. Pixel values can be obtained using AutoImageProcessor. See TimmWrapperImageProcessor.preprocess() for details.
- labels (torch.LongTensor of shape (batch_size,), optional) — Labels for computing the image classification/regression loss. Indices should be in [0, ..., config.num_labels - 1]. If config.num_labels == 1 a regression loss is computed (Mean-Square loss); if config.num_labels > 1 a classification loss is computed (Cross-Entropy).
- output_attentions (bool, optional) — Whether or not to return the attentions tensors of all attention layers. Not compatible with timm wrapped models.
- output_hidden_states (bool or list[int], optional) — Whether or not to return the hidden states of all layers. Can also be a list of layer indices, in which case only the hidden states of those layers are returned.
- return_dict (bool, optional) — Whether or not to return a ModelOutput instead of a plain tuple.
- **kwargs — Additional keyword arguments passed along to the timm model forward.
Returns
transformers.modeling_outputs.ImageClassifierOutput or tuple(torch.FloatTensor)
A transformers.modeling_outputs.ImageClassifierOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (TimmWrapperConfig) and inputs.
- loss (torch.FloatTensor of shape (1,), optional, returned when labels is provided) — Classification (or regression if config.num_labels==1) loss.
- logits (torch.FloatTensor of shape (batch_size, config.num_labels)) — Classification (or regression if config.num_labels==1) scores (before SoftMax).
- hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) — Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each stage) of shape (batch_size, sequence_length, hidden_size). Hidden-states (also called feature maps) of the model at the output of each stage.
- attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) — Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, patch_size, sequence_length). Attention weights after the attention softmax, used to compute the weighted average in the self-attention heads.
The TimmWrapperForImageClassification forward method overrides the __call__ special method.
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the pre and post processing steps while the latter silently ignores them.
Examples:
>>> import torch
>>> from PIL import Image
>>> from urllib.request import urlopen
>>> from transformers import AutoModelForImageClassification, AutoImageProcessor
>>> # Load image
>>> image = Image.open(urlopen(
... 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
... ))
>>> # Load model and image processor
>>> checkpoint = "timm/resnet50.a1_in1k"
>>> image_processor = AutoImageProcessor.from_pretrained(checkpoint)
>>> model = AutoModelForImageClassification.from_pretrained(checkpoint).eval()
>>> # Preprocess image
>>> inputs = image_processor(image)
>>> # Forward pass
>>> with torch.no_grad():
... logits = model(**inputs).logits
>>> # Get top 5 predictions
>>> top5_probabilities, top5_class_indices = torch.topk(logits.softmax(dim=1) * 100, k=5)
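To turn the predicted indices into class names, the label mapping on the config can be used; id2label is a standard PretrainedConfig attribute, populated from the checkpoint's label metadata when available:
>>> # Map class indices to label names (assumes the checkpoint ships label metadata)
>>> top5_labels = [model.config.id2label[i] for i in top5_class_indices[0].tolist()]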