Fix the tokenizer class

#53
by sgugger - opened

The tokenizer class is the wrong one, so the tokenizer is instantiated as PreTrainedTokenizerFast and returns token_type_ids when used on an input. The model cannot handle those, so the pipeline crashes for this model.

BigScience Workshop org

Thanks for fixing it!

Note for other readers: @ybelkada also did this change in smaller checkpoint repos (https://huggingface.co/bigscience/bloom-1b3/discussions/8)

lysandre changed pull request status to merged

Sign up or log in to comment