You need a custom version of the tokenizers library to use this tokenizer.

To install this custom version, run:

pip install transformers
git clone https://github.com/huggingface/tokenizers.git
cd tokenizers
git checkout bigscience_fork
cd bindings/python
pip install setuptools_rust
pip install -e .
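
After the editable install, it can help to confirm which packages the current Python environment actually sees. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of `tokenizers` or `transformers`.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report what the current environment would import.
for name in ("tokenizers", "transformers"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

The version string alone does not prove that the fork is in use. Running `pip show tokenizers` lists the install location, which for an editable install should point at the cloned checkout.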

Then load the tokenizer:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience-catalogue-data-dev/byte-level-bpe-tokenizer-no-norm-250k-whitespace-and-eos-regex-alpha-v3-dedup-lines-articles")
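
Once the tokenizer is loaded, a quick encode/decode round trip is one way to sanity-check it. The sketch below is an illustration only: `round_trip` is a hypothetical helper, and the expectation that the text comes back unchanged is an assumption based on the repository name (byte-level BPE, no normalization), not something this card states.

```python
def round_trip(tokenizer, text: str) -> str:
    """Encode `text` to token ids and decode them back to a string."""
    # Skip special tokens so only the text itself is encoded.
    ids = tokenizer.encode(text, add_special_tokens=False)
    # Disable space clean-up so decoding does not alter whitespace.
    return tokenizer.decode(ids, clean_up_tokenization_spaces=False)
```

With the `tokenizer` object loaded as above, `round_trip(tokenizer, "Hello  world")` can be compared against the input string; any difference would point to normalization or decoding behaviour worth inspecting.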