Having trouble loading this with transformers

#8
by codelion - opened

I am getting the following exception when trying to load the tokenizer:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id)
```

```
Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 12564 column 3
```

I can't reproduce this. Try upgrading transformers and tokenizers.

My environment: transformers==4.38.2 and tokenizers==0.15.2.

Using transformers==4.38.2 worked, but it seems to be broken with transformers@main, which is what I was using. Thanks.
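Since this error depends on the installed package versions, a quick sanity check is to compare your environment against the pair reported to work above (transformers==4.38.2, tokenizers==0.15.2). This is a hedged sketch, not an official diagnostic: the `KNOWN_GOOD` pins and helper names come from this thread, not from the libraries themselves.

```python
import importlib.metadata

# Versions reported to work in this thread; a reference point, not a guarantee.
KNOWN_GOOD = {"transformers": "4.38.2", "tokenizers": "0.15.2"}

def parse_version(v: str) -> tuple:
    """Turn 'X.Y.Z' (ignoring suffixes like '.dev0') into a comparable tuple."""
    parts = []
    for p in v.split("."):
        if p.isdigit():
            parts.append(int(p))
        else:
            break
    return tuple(parts)

def check_environment(known_good=KNOWN_GOOD):
    """Map each package to (installed version or None, known-good version)."""
    report = {}
    for pkg, good in known_good.items():
        try:
            installed = importlib.metadata.version(pkg)
        except importlib.metadata.PackageNotFoundError:
            installed = None
        report[pkg] = (installed, good)
    return report
```

If the installed `tokenizers` is older than the one the model's `tokenizer.json` was written for, this kind of untagged-enum deserialization error is a plausible result, so upgrading (or pinning to the known-good pair) is the first thing to try.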

codelion changed discussion status to closed

I hit the exact same exception with transformers==4.40.1 and tokenizers==0.19.1, but there is no such problem when running other Qwen models (Qwen1.5-0.5B-Chat, Qwen-7B-Chat).


After this commit, it should now work!
