Missing ChatML special tokens

#5
by brlambert - opened

Hey, thanks for this model, the performance is great!

I noticed that the tokenizer config files are missing the ChatML special tokens such as <|im_end|>. Is this intentional?

Owner

Hi @brlambert ,

If you want to use ChatML, you can just change the related tokenizer files, and it should work fine.

Nice, I was hoping you'd say that! Thanks!

Weyaxi changed discussion status to closed

So I modified the files, but I'm getting this error because of the added <|im_start|> and <|im_end|> tokens:
ValueError: Parameter model.embed_tokens.q_weight has shape (32000, 512), but expected (32002, 512)
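
For anyone reading along, the numbers in that error can be reproduced with a small sketch. It is plain Python, not the loader's actual code. The helper names `expected_embedding_rows` and `check_embedding_shape` are made up for illustration, and it assumes the base vocabulary has 32000 entries and that each new token is appended as one extra id.

```python
def expected_embedding_rows(base_vocab_size, added_tokens):
    """Rows the embedding matrix needs once new tokens are appended (hypothetical helper)."""
    # Assumption: every distinct added token receives exactly one new id.
    return base_vocab_size + len(set(added_tokens))


def check_embedding_shape(checkpoint_shape, base_vocab_size, added_tokens):
    """Return a mismatch message, or None when the shapes agree (hypothetical helper)."""
    _, width = checkpoint_shape
    expected = (expected_embedding_rows(base_vocab_size, added_tokens), width)
    if checkpoint_shape != expected:
        return f"has shape {checkpoint_shape}, but expected {expected}"
    return None


# Shapes taken from the error message above; 32000 is the assumed base vocabulary size.
message = check_embedding_shape((32000, 512), 32000, ["<|im_start|>", "<|im_end|>"])
print(message)  # has shape (32000, 512), but expected (32002, 512)
```

In other words, the edited tokenizer files now describe 32002 tokens, while the stored weights still hold only 32000 rows.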

Owner

Have you changed all tokenizer-related files to match ChatML?

Yes, I believe so, but adding these new special tokens seems to be incompatible with the model. I tried to fix this by resizing the problematic layer with model.resize_token_embeddings(32002). I can now generate outputs with the resized model, but it's all gibberish.
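
A rough sketch of what resizing an embedding table involves may help here. It is plain Python on a toy table, not the transformers implementation. The function name `resize_embeddings` and the mean initialization of new rows are assumptions chosen for illustration, and `model.resize_token_embeddings` may initialize new rows differently.

```python
def resize_embeddings(table, new_size):
    """Return a copy of `table` grown or truncated to `new_size` rows (hypothetical helper).

    Assumption: new rows are set to the column-wise mean of the existing rows.
    """
    old_size = len(table)
    if new_size <= old_size:
        # Shrinking just drops the trailing rows.
        return [row[:] for row in table[:new_size]]
    width = len(table[0])
    mean_row = [sum(row[j] for row in table) / old_size for j in range(width)]
    # Existing rows are copied unchanged; only the appended rows are new.
    return [row[:] for row in table] + [mean_row[:] for _ in range(new_size - old_size)]


# Toy table: 2 tokens with 2-dimensional embeddings, grown to 4 tokens.
table = [[1.0, 2.0], [3.0, 4.0]]
resized = resize_embeddings(table, 4)
print(resized)  # [[1.0, 2.0], [3.0, 4.0], [2.0, 3.0], [2.0, 3.0]]
```

Whatever the initialization, the appended rows carry no trained information, so the model has never learned what <|im_start|> or <|im_end|> mean without further fine-tuning. Whether that alone explains the gibberish here is not established in this thread. The `q_weight` name in the error may also point to quantized weights, which might not resize the same way as a plain float embedding.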
