Tiktoken and interaction with Transformers

You are viewing version v4.46.0. A newer version, v4.47.1, is available.

🤗 transformers supports tiktoken model files. When you load a model with from_pretrained and its Hub repository contains a tokenizer.model file in tiktoken format, that file is automatically converted into our fast tokenizer.

Known models that were released with a tiktoken.model file:

  • gpt2
  • llama3

Example usage

To load a tiktoken file in transformers, make sure the tokenizer.model file is in tiktoken format; from_pretrained then loads it automatically. Here is how one would load a tokenizer and a model, which can be loaded from the exact same file:

from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
# The tiktoken-format tokenizer.model sits in the repository's "original" subfolder.
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="original")
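
Since the automatic loading depends on tokenizer.model actually being a tiktoken file, it can help to check a local copy before calling from_pretrained. The following is a minimal sketch, not part of transformers: parse_tiktoken_lines and looks_like_tiktoken are hypothetical helper names, and the sketch assumes the usual tiktoken BPE layout of one entry per line, a base64-encoded token followed by its integer rank.

```python
import base64


def parse_tiktoken_lines(lines):
    """Parse tiktoken-style BPE lines (bytes) into a {token_bytes: rank} dict.

    Assumed layout: each non-empty line is `<base64 token> <integer rank>`.
    Raises ValueError if a line does not match that layout.
    """
    ranks = {}
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 2 fields, got {len(parts)}")
        token_b64, rank_text = parts
        try:
            # binascii.Error (bad base64) is a subclass of ValueError.
            token = base64.b64decode(token_b64, validate=True)
            rank = int(rank_text)
        except ValueError as exc:
            raise ValueError(f"line {line_no}: not a tiktoken entry") from exc
        ranks[token] = rank
    return ranks


def looks_like_tiktoken(path):
    """Return True if the file at `path` parses as a non-empty tiktoken BPE file."""
    try:
        # Binary mode: a SentencePiece tokenizer.model is a binary protobuf
        # and would not decode cleanly as text.
        with open(path, "rb") as handle:
            ranks = parse_tiktoken_lines(handle)
    except (OSError, ValueError):
        return False
    return len(ranks) > 0
```

A call such as looks_like_tiktoken("original/tokenizer.model") on a downloaded copy (the path here is only illustrative) returns False for a SentencePiece file of the same name, which is a quick way to tell the two formats apart.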