Transformers documentation

Quick tour

Hugging Face's logo
Join the Hugging Face community

and get access to the augmented documentation experience

to get started

Quick tour

Get up and running with 🤗 Transformers! Start using the pipeline() for rapid inference, and quickly load a pretrained model and tokenizer with an AutoClass to solve your text, vision or audio task.

All code examples presented in the documentation have a toggle on the top left for PyTorch and TensorFlow. If not, the code is expected to work for both backends without any change.

Pipeline

pipeline() is the easiest way to use a pretrained model for a given task.

The pipeline() supports many common tasks out-of-the-box:

Text:

  • Sentiment analysis: classify the polarity of a given text.
  • Text generation (in English): generate text from a given input.
  • Name entity recognition (NER): label each word with the entity it represents (person, date, location, etc.).
  • Question answering: extract the answer from the context, given some context and a question.
  • Fill-mask: fill in the blank given a text with masked words.
  • Summarization: generate a summary of a long sequence of text or document.
  • Translation: translate text into another language.
  • Feature extraction: create a tensor representation of the text.

Image:

  • Image classification: classify an image.
  • Image segmentation: classify every pixel in an image.
  • Object detection: detect objects within an image.

Audio:

  • Audio classification: assign a label to a given segment of audio.
  • Automatic speech recognition (ASR): transcribe audio data into text.

For more details about the pipeline() and associated tasks, refer to the documentation here.

Pipeline usage

In the following example, you will use the pipeline() for sentiment analysis.

Install the following dependencies if you haven’t already:

Pytorch
Hide Pytorch content
pip install torch
TensorFlow
Hide TensorFlow content
pip install tensorflow

Import pipeline() and specify the task you want to complete:

>>> from transformers import pipeline

>>> classifier = pipeline("sentiment-analysis")

The pipeline downloads and caches a default pretrained model and tokenizer for sentiment analysis. Now you can use the classifier on your target text:

>>> classifier("We are very happy to show you the 🤗 Transformers library.")
[{'label': 'POSITIVE', 'score': 0.9998}]

For more than one sentence, pass a list of sentences to the pipeline() which returns a list of dictionaries:

>>> results = classifier(["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."])
>>> for result in results:
...     print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.5309

The pipeline() can also iterate over an entire dataset. Start by installing the 🤗 Datasets library:

pip install datasets 

Create a pipeline() with the task you want to solve for and the model you want to use.

>>> import torch
>>> from transformers import pipeline

>>> speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

Next, load a dataset (see the 🤗 Datasets Quick Start for more details) you’d like to iterate over. For example, let’s load the MInDS-14 dataset:

>>> from datasets import load_dataset, Audio

>>> dataset = load_dataset("PolyAI/minds14", name="en-US", split="train")

We need to make sure that the sampling rate of the dataset matches the sampling rate facebook/wav2vec2-base-960h was trained on.

>>> dataset = dataset.cast_column("audio", Audio(sampling_rate=speech_recognizer.feature_extractor.sampling_rate))

Audio files are automatically loaded and resampled when calling the "audio" column. Let’s extract the raw waveform arrays of the first 4 samples and pass it as a list to the pipeline:

>>> result = speech_recognizer(dataset[:4]["audio"])
>>> print([d["text"] for d in result])
['I WOULD LIKE TO SET UP A JOINT ACCOUNT WITH MY PARTNER HOW DO I PROCEED WITH DOING THAT', "FONDERING HOW I'D SET UP A JOIN TO HET WITH MY WIFE AND WHERE THE AP MIGHT BE", "I I'D LIKE TOY SET UP A JOINT ACCOUNT WITH MY PARTNER I'M NOT SEEING THE OPTION TO DO IT ON THE APSO I CALLED IN TO GET SOME HELP CAN I JUST DO IT OVER THE PHONE WITH YOU AND GIVE YOU THE INFORMATION OR SHOULD I DO IT IN THE AP AND I'M MISSING SOMETHING UQUETTE HAD PREFERRED TO JUST DO IT OVER THE PHONE OF POSSIBLE THINGS", 'HOW DO I TURN A JOIN A COUNT']

For a larger dataset where the inputs are big (like in speech or vision), you will want to pass along a generator instead of a list that loads all the inputs in memory. See the pipeline documentation for more information.

Use another model and tokenizer in the pipeline

The pipeline() can accommodate any model from the Model Hub, making it easy to adapt the pipeline() for other use-cases. For example, if you’d like a model capable of handling French text, use the tags on the Model Hub to filter for an appropriate model. The top filtered result returns a multilingual BERT model fine-tuned for sentiment analysis. Great, let’s use this model!

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
Pytorch
Hide Pytorch content

Use the AutoModelForSequenceClassification and AutoTokenizer to load the pretrained model and it’s associated tokenizer (more on an AutoClass below):

>>> from transformers import AutoTokenizer, AutoModelForSequenceClassification

>>> model = AutoModelForSequenceClassification.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
TensorFlow
Hide TensorFlow content

Use the TFAutoModelForSequenceClassification and AutoTokenizer to load the pretrained model and it’s associated tokenizer (more on an TFAutoClass below):

>>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

>>> model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)

Then you can specify the model and tokenizer in the pipeline(), and apply the classifier on your target text:

>>> classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
>>> classifier("Nous sommes très heureux de vous présenter la bibliothèque 🤗 Transformers.")
[{'label': '5 stars', 'score': 0.7273}]

If you can’t find a model for your use-case, you will need to fine-tune a pretrained model on your data. Take a look at our fine-tuning tutorial to learn how. Finally, after you’ve fine-tuned your pretrained model, please consider sharing it (see tutorial here) with the community on the Model Hub to democratize NLP for everyone! 🤗

AutoClass

Under the hood, the AutoModelForSequenceClassification and AutoTokenizer classes work together to power the pipeline(). An AutoClass is a shortcut that automatically retrieves the architecture of a pretrained model from it’s name or path. You only need to select the appropriate AutoClass for your task and it’s associated tokenizer with AutoTokenizer.

Let’s return to our example and see how you can use the AutoClass to replicate the results of the pipeline().

AutoTokenizer

A tokenizer is responsible for preprocessing text into a format that is understandable to the model. First, the tokenizer will split the text into words called tokens. There are multiple rules that govern the tokenization process, including how to split a word and at what level (learn more about tokenization here). The most important thing to remember though is you need to instantiate the tokenizer with the same model name to ensure you’re using the same tokenization rules a model was pretrained with.

Load a tokenizer with AutoTokenizer:

>>> from transformers import AutoTokenizer

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)

Next, the tokenizer converts the tokens into numbers in order to construct a tensor as input to the model. This is known as the model’s vocabulary.

Pass your text to the tokenizer:

>>> encoding = tokenizer("We are very happy to show you the 🤗 Transformers library.")
>>> print(encoding)
{'input_ids': [101, 11312, 10320, 12495, 19308, 10114, 11391, 10855, 10103, 100, 58263, 13299, 119, 102],
 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

The tokenizer will return a dictionary containing:

Just like the pipeline(), the tokenizer will accept a list of inputs. In addition, the tokenizer can also pad and truncate the text to return a batch with uniform length:

Pytorch
Hide Pytorch content
>>> pt_batch = tokenizer(
...     ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
...     padding=True,
...     truncation=True,
...     max_length=512,
...     return_tensors="pt",
... )
TensorFlow
Hide TensorFlow content
>>> tf_batch = tokenizer(
...     ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
...     padding=True,
...     truncation=True,
...     max_length=512,
...     return_tensors="tf",
... )

Read the preprocessing tutorial for more details about tokenization.

AutoModel

Pytorch
Hide Pytorch content

🤗 Transformers provides a simple and unified way to load pretrained instances. This means you can load an AutoModel like you would load an AutoTokenizer. The only difference is selecting the correct AutoModel for the task. Since you are doing text - or sequence - classification, load AutoModelForSequenceClassification:

>>> from transformers import AutoModelForSequenceClassification

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(model_name)

See the task summary for which AutoModel class to use for which task.

Now you can pass your preprocessed batch of inputs directly to the model. You just have to unpack the dictionary by adding **:

>>> pt_outputs = pt_model(**pt_batch)

The model outputs the final activations in the logits attribute. Apply the softmax function to the logits to retrieve the probabilities:

>>> from torch import nn

>>> pt_predictions = nn.functional.softmax(pt_outputs.logits, dim=-1)
>>> print(pt_predictions)
tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
        [0.2084, 0.1826, 0.1969, 0.1755, 0.2365]], grad_fn=<SoftmaxBackward0>)
TensorFlow
Hide TensorFlow content

🤗 Transformers provides a simple and unified way to load pretrained instances. This means you can load an TFAutoModel like you would load an AutoTokenizer. The only difference is selecting the correct TFAutoModel for the task. Since you are doing text - or sequence - classification, load TFAutoModelForSequenceClassification:

>>> from transformers import TFAutoModelForSequenceClassification

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

See the task summary for which AutoModel class to use for which task.

Now you can pass your preprocessed batch of inputs directly to the model by passing the dictionary keys directly to the tensors:

>>> tf_outputs = tf_model(tf_batch)

The model outputs the final activations in the logits attribute. Apply the softmax function to the logits to retrieve the probabilities:

>>> import tensorflow as tf

>>> tf_predictions = tf.nn.softmax(tf_outputs.logits, axis=-1)
>>> tf_predictions

All 🤗 Transformers models (PyTorch or TensorFlow) outputs the tensors before the final activation function (like softmax) because the final activation function is often fused with the loss.

Models are a standard torch.nn.Module or a tf.keras.Model so you can use them in your usual training loop. However, to make things easier, 🤗 Transformers provides a Trainer class for PyTorch that adds functionality for distributed training, mixed precision, and more. For TensorFlow, you can use the fit method from Keras. Refer to the training tutorial for more details.

🤗 Transformers model outputs are special dataclasses so their attributes are autocompleted in an IDE. The model outputs also behave like a tuple or a dictionary (e.g., you can index with an integer, a slice or a string) in which case the attributes that are None are ignored.

Save a model

Pytorch
Hide Pytorch content

Once your model is fine-tuned, you can save it with its tokenizer using PreTrainedModel.save_pretrained():

>>> pt_save_directory = "./pt_save_pretrained"
>>> tokenizer.save_pretrained(pt_save_directory)
>>> pt_model.save_pretrained(pt_save_directory)

When you are ready to use the model again, reload it with PreTrainedModel.from_pretrained():

>>> pt_model = AutoModelForSequenceClassification.from_pretrained("./pt_save_pretrained")
TensorFlow
Hide TensorFlow content

Once your model is fine-tuned, you can save it with its tokenizer using TFPreTrainedModel.save_pretrained():

>>> tf_save_directory = "./tf_save_pretrained"
>>> tokenizer.save_pretrained(tf_save_directory)
>>> tf_model.save_pretrained(tf_save_directory)

When you are ready to use the model again, reload it with TFPreTrainedModel.from_pretrained():

>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained("./tf_save_pretrained")

One particularly cool 🤗 Transformers feature is the ability to save a model and reload it as either a PyTorch or TensorFlow model. The from_pt or from_tf parameter can convert the model from one framework to the other:

Pytorch
Hide Pytorch content
>>> from transformers import AutoModel

>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
TensorFlow
Hide TensorFlow content
>>> from transformers import TFAutoModel

>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)

Custom model builds

You can modify the model’s configuration class to change how a model is built. The configuration specifies a model’s attributes, such as the number of hidden layers or attention heads. You start from scratch when you initialize a model from a custom configuration class. The model attributes are randomly initialized, and you’ll need to train the model before you can use it to get meaningful results.

Start by importing AutoConfig, and then load the pretrained model you want to modify. Within AutoConfig.from_pretrained(), you can specify the attribute you want to change, such as the number of attention heads:

>>> from transformers import AutoConfig

>>> my_config = AutoConfig.from_pretrained("distilbert-base-uncased", n_heads=12)
Pytorch
Hide Pytorch content

Create a model from your custom configuration with AutoModel.from_config():

>>> from transformers import AutoModel

>>> my_model = AutoModel.from_config(my_config)
TensorFlow
Hide TensorFlow content

Create a model from your custom configuration with TFAutoModel.from_config():

>>> from transformers import TFAutoModel

>>> my_model = TFAutoModel.from_config(my_config)

Take a look at the Create a custom architecture guide for more information about building custom configurations.

What's next?

Now that you’ve completed the 🤗 Transformers quick tour, check out our guides and learn how to do more specific things like writing a custom model, fine-tuning a model for a task, and how to train a model with a script. If you’re interested in learning more about 🤗 Transformers core concepts, grab a cup of coffee and take a look at our Conceptual Guides!