Can't follow instructions

#8
by pcomte - opened

I'm running the BF16 GGUF with a BF16 KV cache and a 131,072-token context on an AMD 7900 XTX (24 GB) with the latest llama.cpp. I get about 5,500 tok/s prefill and about 150 tok/s decode. The problem is that it can't follow instructions: the output is all over the place, switching inconsistently between plain text, Markdown, and code blocks. I'm very disappointed and can't recommend this.

Try the latest chat template from the main model.
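
A rough sketch of one way to try that with llama.cpp's `llama-server`, assuming you have saved the updated template to a local file. The file names `model-bf16.gguf` and `chat_template.jinja` are placeholders, not names from this repo, and the helper `build_server_command` is made up for illustration. The sketch only builds and prints the command; it does not start the server.

```python
import shlex


def build_server_command(model_path: str, template_path: str, ctx_size: int) -> list[str]:
    """Build a llama-server command that overrides the GGUF's embedded chat template.

    All paths are supplied by the caller; nothing here is specific to this model.
    """
    return [
        "llama-server",
        "-m", model_path,                        # GGUF file to load
        "-c", str(ctx_size),                     # context size in tokens
        "--jinja",                               # use the Jinja chat-template engine
        "--chat-template-file", template_path,   # template file to use instead of the embedded one
    ]


if __name__ == "__main__":
    # Placeholder file names (assumptions); 131072 is the context size from the post above.
    cmd = build_server_command("model-bf16.gguf", "chat_template.jinja", 131072)
    print(shlex.join(cmd))
```

If the formatting settles down after that, the template embedded in the GGUF was probably the cause rather than the weights themselves.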
