What are tokens 32002:32031?

#1
by alicecomfy - opened

I've been unable to run inference with this for days. The merge doesn't work because of the added tokens; I tried using this script to fix that, but there still seems to be a mismatch.

Did you add other tokens besides the chatml ones?
https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py

I merged it by hardcoding, but the result seems a bit off; I'm not sure whether that's expected.
https://github.com/alice-comfy/miqu_hermes_merge/blob/main/merge.py

I am planning on training Senku-70B again with the mistral prompt format, as it also has similar (although less pronounced) prompting weirdness.

Owner

They are blank tokens used for training efficiency. You can nuke them from the embedding/lm_head if you want.
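A minimal sketch of what "nuking" those rows could look like, using toy tensors rather than the real checkpoint. The sizes are assumptions inferred from the thread: a 32000-entry base Llama vocab padded out to 32032 (a multiple of 32) for training efficiency, with a toy hidden size for illustration. For a real `transformers` model, `model.resize_token_embeddings(32000)` achieves the same trim.

```python
import torch
import torch.nn as nn

BASE_VOCAB = 32000    # assumed: base Llama vocab size before added tokens
PADDED_VOCAB = 32032  # assumed: vocab padded to a multiple of 32 for training
HIDDEN = 64           # toy hidden size, purely for illustration

# Toy stand-ins for the model's input embedding and output head.
embed = nn.Embedding(PADDED_VOCAB, HIDDEN)
lm_head = nn.Linear(HIDDEN, PADDED_VOCAB, bias=False)

# Drop the trailing blank rows so the shapes match the base tokenizer again.
with torch.no_grad():
    trimmed_embed = nn.Embedding(BASE_VOCAB, HIDDEN)
    trimmed_embed.weight.copy_(embed.weight[:BASE_VOCAB])

    trimmed_head = nn.Linear(HIDDEN, BASE_VOCAB, bias=False)
    trimmed_head.weight.copy_(lm_head.weight[:BASE_VOCAB])

print(trimmed_embed.weight.shape)  # torch.Size([32000, 64])
print(trimmed_head.weight.shape)   # torch.Size([32000, 64])
```

The surviving rows are copied unchanged, so token IDs below 32000 behave exactly as before; only the blank padding rows are discarded.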

152334H changed discussion status to closed
