Wrong prompt format in tokenizer_config.json?

#7 · opened by wolfram

The chat_template specified in tokenizer_config.json is ChatML, but apparently this model uses the (weird) GPT4 Correct prompt format. Please clarify which prompt format/chat template is correct, state it on the model card, and make sure tokenizer_config.json contains the proper template. Thank you!
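
For reference, the two formats render roughly like this for a single user turn (the exact special tokens can vary per model, so treat this as illustrative):

```python
# ChatML, which the current chat_template in tokenizer_config.json produces:
chatml_prompt = (
    "<|im_start|>user\n"
    "Hello!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# OpenChat-style "GPT4 Correct" format used by the source models:
gpt4_correct_prompt = "GPT4 Correct User: Hello!<|end_of_turn|>GPT4 Correct Assistant:"
```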

Hi @wolfram, thanks for testing this model. I think you used an old GGUF version from TheBloke that still contains the previous, incorrect tokenizer_config.json. I've added an "added_tokens.json" file; maybe that helps.

If not, can you detail what changes I should make? Thanks.
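
In the meantime, you can check whether the added tokens are picked up with something along these lines (the repo id below is a placeholder):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mlabonne/<model>")  # placeholder repo id

for t in ["<|im_start|>", "<|im_end|>", "<|end_of_turn|>"]:
    ids = tok.encode(t, add_special_tokens=False)
    # A single id means the string is registered as an added/special token;
    # several ids mean it is still being split into regular subword pieces.
    print(t, ids)
```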

@mlabonne What's the actual chat template? In your tokenizer_config.json, the chat_template is set to ChatML, but the models your merge is built from use the GPT4 Correct prompt format. How do you prompt it properly?
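
(For reference, I'm inspecting what the shipped chat_template produces with something like this; the repo id is replaced with a placeholder here:)

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mlabonne/<model>")  # placeholder repo id
messages = [{"role": "user", "content": "Hello!"}]

# Render the prompt exactly as the chat_template in tokenizer_config.json defines it
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```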

I used TheBloke's GGUF because the HF version crashed with the error message "RuntimeError: CUDA error: device-side assert triggered". Is that a known issue or just a problem on my end?

Yeah, I managed to make it work with ChatML without any issues, but it looks like this depends on your config. There's no pre-defined chat template: as you said, this is a merge of several models that use the GPT4 Correct prompt format, but those special tokens are not implemented here. I tried a few configurations and I'm opting for a modified GPT4 Correct prompt format with a different eos token. I believe it's the best solution, but I haven't tested it thoroughly. The CUDA error is also fixed.
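
For anyone who wants to experiment before the new config lands, here is a rough sketch of what a GPT4 Correct-style template with a swapped end-of-turn/eos token can look like; the actual template and token choice in the repo may differ, and the repo id is a placeholder:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mlabonne/<model>")  # placeholder repo id

# Illustrative GPT4 Correct-style Jinja template; the template and end-of-turn
# token actually shipped in the repo may differ.
tok.chat_template = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{{ 'GPT4 Correct ' + message['role'].title() + ': ' + message['content'] + '<|im_end|>' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ 'GPT4 Correct Assistant:' }}{% endif %}"
)
tok.eos_token = "<|im_end|>"  # illustrative choice of a different eos token

messages = [{"role": "user", "content": "Hello!"}]
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```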
