Transformers documentation

Load pretrained instances with an AutoClass

Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started

Load pretrained instances with an AutoClass

With so many different Transformer architectures, it can be challenging to create one for your checkpoint. As a part of 🤗 Transformers core philosophy to make the library easy, simple and flexible to use, an AutoClass automatically infer and load the correct architecture from a given checkpoint. The from_pretrained method lets you quickly load a pretrained model for any architecture so you don’t have to devote time and resources to train a model from scratch. Producing this type of checkpoint-agnostic code means if your code works for one checkpoint, it will work with another checkpoint - as long as it was trained for a similar task - even if the architecture is different.

Remember, architecture refers to the skeleton of the model and checkpoints are the weights for a given architecture. For example, BERT is an architecture, while bert-base-uncased is a checkpoint. Model is a general term that can mean either architecture or checkpoint.

In this tutorial, learn to:

  • Load a pretrained tokenizer.
  • Load a pretrained feature extractor.
  • Load a pretrained processor.
  • Load a pretrained model.


Nearly every NLP task begins with a tokenizer. A tokenizer converts your input into a format that can be processed by the model.

Load a tokenizer with AutoTokenizer.from_pretrained():

>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

Then tokenize your input as shown below:

>>> sequence = "In a hole in the ground there lived a hobbit."
>>> print(tokenizer(sequence))
{'input_ids': [101, 1999, 1037, 4920, 1999, 1996, 2598, 2045, 2973, 1037, 7570, 10322, 4183, 1012, 102], 
 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}


For audio and vision tasks, a feature extractor processes the audio signal or image into the correct input format.

Load a feature extractor with AutoFeatureExtractor.from_pretrained():

>>> from transformers import AutoFeatureExtractor

>>> feature_extractor = AutoFeatureExtractor.from_pretrained(
...     "ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"
... )


Multimodal tasks require a processor that combines two types of preprocessing tools. For example, the LayoutLMV2 model requires a feature extractor to handle images and a tokenizer to handle text; a processor combines both of them.

Load a processor with AutoProcessor.from_pretrained():

>>> from transformers import AutoProcessor

>>> processor = AutoProcessor.from_pretrained("microsoft/layoutlmv2-base-uncased")


Finally, the AutoModelFor classes let you load a pretrained model for a given task (see here for a complete list of available tasks). For example, load a model for sequence classification with AutoModelForSequenceClassification.from_pretrained()

>>> from transformers import AutoModelForSequenceClassification >>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

Easily reuse the same checkpoint to load an architecture for a different task:

>>> from transformers import AutoModelForTokenClassification >>> model = AutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")

Generally, we recommend using the AutoTokenizer class and the AutoModelFor class to load pretrained instances of models. This will ensure you load the correct architecture every time. In the next tutorial, learn how to use your newly loaded tokenizer, feature extractor and processor to preprocess a dataset for fine-tuning.