# Tiktoken and interaction with Transformers
Support for tiktoken model files is seamlessly integrated in 🤗 transformers when loading models with `from_pretrained` from a repo on the Hub that contains a `tokenizer.model` tiktoken file, which is automatically converted into our fast tokenizer.
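To see what such a file contains: a tiktoken vocabulary file stores one token per line as base64-encoded token bytes followed by an integer merge rank. The sketch below parses that format with only the standard library; the inline `sample` data and the `load_tiktoken_bpe` helper name are illustrative, not part of the transformers API.

```python
import base64

# Each line of a tiktoken vocabulary file is:
#   base64(token_bytes) <space> rank
# This tiny inline sample encodes the tokens '!', '"', and '#'.
sample = b"IQ== 0\nIg== 1\nIw== 2\n"

def load_tiktoken_bpe(data: bytes) -> dict:
    """Parse tiktoken-style vocabulary data into a bytes -> rank mapping."""
    ranks = {}
    for line in data.splitlines():
        if not line:
            continue
        token_b64, rank = line.split()
        ranks[base64.b64decode(token_b64)] = int(rank)
    return ranks

ranks = load_tiktoken_bpe(sample)
print(ranks)  # {b'!': 0, b'"': 1, b'#': 2}
```

The bytes-to-rank mapping is what the automatic conversion turns into a fast tokenizer's BPE vocabulary and merges.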
Known models that were released with a tiktoken `tokenizer.model` file:
- gpt2
- llama3
## Example usage
In order to load tiktoken files in transformers, ensure that the `tokenizer.model` file is a tiktoken file; it will then be loaded automatically by `from_pretrained`. Here is how one would load a tokenizer and a model, which can be loaded from the exact same file:
```python
from transformers import AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="original")
```