Tokenizer shape does not match model shape

#13
by lemousehunter - opened

Hi, when I try to shard the model using utils/checkpoint_utils.py, I get a shape mismatch error. Specifically, the model shape (32017) does not match the tokenizer shape (32000). I noticed that tokenizer.json contains 32000 tokens in total, and added_tokens.json specifies 17 extra tokens, so the two files together account for 32000 + 17 = 32017. How can I resolve this issue?
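To double-check the arithmetic behind the mismatch, a minimal sketch like the one below can count the tokens in both files. It assumes tokenizer.json uses the Hugging Face `tokenizers` layout with the base vocabulary stored as a dict under `model.vocab`, and that added_tokens.json maps each token string to its id. The helper names `load_json` and `count_vocab` are hypothetical and are not part of utils/checkpoint_utils.py.

```python
import json
from pathlib import Path


def load_json(path: str) -> dict:
    """Read a JSON file from disk (hypothetical helper for this sketch)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_vocab(tokenizer_json: dict, added_tokens: dict) -> dict:
    """Count base tokens, extra added tokens, and their sum."""
    # Assumption: the base vocabulary is a dict of token -> id under model.vocab.
    base_vocab = tokenizer_json["model"]["vocab"]
    # Assumption: added_tokens.json is a dict of token -> id.
    # Only count added tokens that are not already in the base vocabulary.
    extra = [tok for tok in added_tokens if tok not in base_vocab]
    return {
        "base": len(base_vocab),
        "added": len(extra),
        "total": len(base_vocab) + len(extra),
    }


# Example usage (paths are placeholders for the files in the model repo):
# counts = count_vocab(load_json("tokenizer.json"), load_json("added_tokens.json"))
# print(counts)
```

If `total` comes out as 32017, the model's embedding matrix was most likely sized for the base vocabulary plus the added tokens. In `transformers`, `tokenizer.vocab_size` reports only the base vocabulary, while `len(tokenizer)` includes added tokens, so comparing those two numbers against the model shape may show which one the sharding script is using.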
