Update chat template

#4 opened by CISCai

I know it's a bit of a pain, but could you update the chat template metadata to the latest chat templates now that llama.cpp supports them?

At least you won't have to requantize everything, as I made a handy script that lets you create a new GGUF using the updated tokenizer_config.json file; see the details in the PR. :)

PS: You only have to update the first file in a split GGUF.

Yes, that would be awesome.
@CISCai do you have an example of exactly how to call your script?

@James3 With the new script you can create a new GGUF, after you've downloaded the latest tokenizer_config.json, like this:

python gguf-new-metadata.py input.gguf output.gguf --chat-template-config tokenizer_config.json
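
If you don't have the updated tokenizer_config.json yet, you can fetch it from the original model repo first, and for a split GGUF you only need to rewrite the first shard. A quick sketch (the repo id and shard filenames here are placeholders, not from this thread):

# fetch the updated config from the source model repo (placeholder repo id)
huggingface-cli download <org>/<model> tokenizer_config.json --local-dir .

# rewrite only the first shard of a split GGUF (placeholder filenames)
python gguf-new-metadata.py model-00001-of-00005.gguf model-new-00001-of-00005.gguf --chat-template-config tokenizer_config.json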

OK, I did that. The chat template is now:

llama-cpp-server-1 | {"tid":"134473023250432","timestamp":1714571741,"level":"INFO","function":"main","line":3033,"msg":"chat template","chat_example":"<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>You are a helpful assistant<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>Hi there<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>How are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>","built_in":false}

It looks similar to the previous one, if I am not mistaken. Is there any way to make sure the update was successful?

@James3 That's expected; the chat_example in the log is built from the default template, which hasn't changed, and the new templates are stored under separate metadata keys. Right now clients only support the default template, but there are a couple of PRs in progress:
llama.cpp: Refactor chat template API
llama-cpp-python: Support multiple chat templates - step 1

In the meantime you can check the metadata using HF's built-in GGUF inspector or the gguf-dump.py script:

python3 gguf-dump.py input.gguf
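
If the dump is long, you can filter for the template entries with plain shell tools, e.g.:

python3 gguf-dump.py input.gguf | grep chat_template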

You should see a number of new metadata entries: tokenizer.chat_templates (containing tool_use and rag), tokenizer.chat_template.tool_use, and tokenizer.chat_template.rag.
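
Alternatively, here is a minimal Python sketch that reads those keys directly with the gguf-py package that ships with llama.cpp (pip install gguf); the parts/data decoding below assumes string metadata is stored as UTF-8 bytes, which matches current gguf-py behaviour but is not a stable API:

from gguf import GGUFReader

reader = GGUFReader("output.gguf")

# The list key is an array of strings; just check that it exists.
print("tokenizer.chat_templates present:",
      "tokenizer.chat_templates" in reader.fields)

# Decode the individual template strings.
for key in ("tokenizer.chat_template",
            "tokenizer.chat_template.tool_use",
            "tokenizer.chat_template.rag"):
    field = reader.fields.get(key)
    if field is None:
        print(f"{key}: MISSING")
        continue
    # Assumption: the string bytes live in the part referenced by data[0].
    text = field.parts[field.data[0]].tobytes().decode("utf-8")
    print(f"{key}: {text[:60]!r}...")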
