What are tokens 32002:32031?

#1
by alicecomfy - opened

I've been unable to run inference with this for days. The merge doesn't work because of the added tokens; I tried using this script to fix that, but there still seems to be a mismatch.

Did you add other tokens besides the chatml ones?
https://github.com/ymcui/Chinese-LLaMA-Alpaca/blob/main/scripts/merge_llama_with_chinese_lora.py

I merged it by hardcoding, but the result seems a bit off; I'm not sure whether that's expected.
https://github.com/alice-comfy/miqu_hermes_merge/blob/main/merge.py

I am planning on training Senku-70B again with the mistral prompt format, as it also has similar (although less pronounced) prompting weirdness.

Owner

They are blank tokens used for training efficiency. You can nuke them from the embedding/lm_head if you want.
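A minimal sketch of what "nuking" those rows could look like, using toy tensors rather than the real checkpoint. The sizes are assumptions inferred from the thread: a 32000-entry base Llama vocab padded out to 32032 (a multiple of 32) for training efficiency, with a toy hidden size for illustration. For a real `transformers` model, `model.resize_token_embeddings(32000)` achieves the same trim.

```python
import torch
import torch.nn as nn

BASE_VOCAB = 32000    # assumed: base Llama vocab size before added tokens
PADDED_VOCAB = 32032  # assumed: vocab padded to a multiple of 32 for training
HIDDEN = 64           # toy hidden size, purely for illustration

# Toy stand-ins for the model's input embedding and output head.
embed = nn.Embedding(PADDED_VOCAB, HIDDEN)
lm_head = nn.Linear(HIDDEN, PADDED_VOCAB, bias=False)

# Drop the trailing blank rows so the shapes match the base tokenizer again.
with torch.no_grad():
    trimmed_embed = nn.Embedding(BASE_VOCAB, HIDDEN)
    trimmed_embed.weight.copy_(embed.weight[:BASE_VOCAB])

    trimmed_head = nn.Linear(HIDDEN, BASE_VOCAB, bias=False)
    trimmed_head.weight.copy_(lm_head.weight[:BASE_VOCAB])

print(trimmed_embed.weight.shape)  # torch.Size([32000, 64])
print(trimmed_head.weight.shape)   # torch.Size([32000, 64])
```

The surviving rows are copied unchanged, so token IDs below 32000 behave exactly as before; only the blank padding rows are discarded.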

152334H changed discussion status to closed
