
Model info

This is a BPE tokenizer retrained from scratch on the Wikitext-103 train, validation, and test sets. The vocabulary has 28,439 entries.
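
For reference, a tokenizer like this could be retrained with the tokenizers library roughly as follows. This is a sketch, not the author's actual training script; the file paths and the use of ByteLevelBPETokenizer are assumptions, and only the vocabulary size comes from this card.

import os

from tokenizers import ByteLevelBPETokenizer

# Assumption: local copies of the Wikitext-103 split files at these paths.
files = [
    "wikitext-103/wiki.train.tokens",
    "wikitext-103/wiki.valid.tokens",
    "wikitext-103/wiki.test.tokens",
]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(files=files, vocab_size=28439, special_tokens=["<|endoftext|>"])

# save_model writes vocab.json and merges.txt, which GPT2TokenizerFast can load.
os.makedirs("wikitext-103-tokenizer", exist_ok=True)
tokenizer.save_model("wikitext-103-tokenizer")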

This tokenizer was used to tokenize text for the GPT-2-like transformer language model trained on Wikitext-103.

Usage

You can download the tokenizer directly from the Hugging Face Hub as follows:

from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("Kristijan/wikitext-103-tokenizer_v2")

After cloning or downloading the files, you can load the tokenizer with the from_pretrained() method by pointing it at the folder containing the merges and vocab files:

from transformers import GPT2TokenizerFast

# Folder must contain the tokenizer files (vocab.json and merges.txt).
tokenizer = GPT2TokenizerFast.from_pretrained(path_to_folder_with_merges_and_vocab_files)
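
If you prefer to do the download step in code rather than with git, one option (a sketch using huggingface_hub, which this card does not itself describe) is:

from huggingface_hub import snapshot_download
from transformers import GPT2TokenizerFast

# Downloads the repo files (vocab.json, merges.txt, ...) to a local cache folder.
local_path = snapshot_download("Kristijan/wikitext-103-tokenizer_v2")
tokenizer = GPT2TokenizerFast.from_pretrained(local_path)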