ValueError: Non-consecutive added token '<mask>' found. Should have index 32005 but has index 32004 in saved vocabulary.
#4 by azeemarshad - opened
The following code, among others on the page, gives this error. I changed the checkpoint name to "almanach/camembert-large" and yet I get the same issue. Any idea how to fix this? Thank you.
from transformers import CamembertModel, CamembertTokenizer
# You can replace "camembert-base" with any other model from the table, e.g. "camembert/camembert-large".
tokenizer = CamembertTokenizer.from_pretrained("camembert/camembert-large")
camembert = CamembertModel.from_pretrained("camembert/camembert-large")
camembert.eval() # disable dropout (or leave in train mode to finetune)
I think it might be an issue with older versions of transformers; I just tested a few versions and it starts to break at v4.34. My quick advice is to upgrade, and if you can't, maybe download the model locally and try deleting the added_tokens.json file; it should work then.
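Here is a minimal sketch of that second workaround, assuming huggingface_hub is installed and that removing added_tokens.json is indeed what clears the error, as suggested above; the local directory name is just an example. If you can upgrade instead, pip install -U transformers should be the simpler path.
# Sketch of the local workaround; assumes huggingface_hub is installed
# and that deleting added_tokens.json is enough, as suggested above.
import os
from huggingface_hub import snapshot_download
from transformers import CamembertModel, CamembertTokenizer

# Download the repo files into a local directory (the path is arbitrary).
local_dir = snapshot_download("almanach/camembert-large", local_dir="camembert-large-local")

# Delete the added_tokens.json file that triggers the non-consecutive index check.
added_tokens_path = os.path.join(local_dir, "added_tokens.json")
if os.path.exists(added_tokens_path):
    os.remove(added_tokens_path)

# Load tokenizer and model from the cleaned local copy.
tokenizer = CamembertTokenizer.from_pretrained(local_dir)
camembert = CamembertModel.from_pretrained(local_dir)
camembert.eval()  # disable dropout (or leave in train mode to finetune)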