
Llama 3 chat template not working?

#1 opened by Jdods84

When I use the Llama 3 chat template for my prompt, Lumimaid repeats responses and loses its mind. When I remove that formatting and use the Alpaca format instead, it RPs and works fine. I am using KoboldCPP to run this LLM locally. Could someone with more experience help me format my prompt correctly?

NeverSleep org

It works for me on Kobold; I use the one I put on the model card (Llama-3-Instruct).
It also works unquantized. The prompt format is in tokenizer_config.json.
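
For reference, the Llama 3 Instruct layout looks like this (the special tokens come from Meta's Llama 3 release; the placeholder text in braces is mine):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
```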

Can you check that and make sure you're using the right one? Have you updated all your tools?
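
Updating matters here: older KoboldCPP builds didn't know Llama 3's <|eot_id|> stop token, which can cause exactly this kind of runaway repetition. If you want to double-check the exact template the model ships with, here's a minimal sketch using the transformers library. The repo id below is my assumption; substitute whichever Lumimaid repo you actually downloaded.

```python
# Minimal sketch: render a sample conversation with the model's own chat
# template, as stored in tokenizer_config.json.
from transformers import AutoTokenizer

# Assumed repo id; replace with the model you are actually using.
tokenizer = AutoTokenizer.from_pretrained("NeverSleep/Llama-3-Lumimaid-8B-v0.1")

messages = [
    {"role": "system", "content": "You are a roleplay assistant."},
    {"role": "user", "content": "Hello!"},
]

# tokenize=False returns the formatted prompt string instead of token ids;
# add_generation_prompt=True appends the assistant header the model completes.
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```

Comparing that output against what KoboldCPP actually sends is a quick way to spot a mismatched template.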
