Even this excellent high-end model doesn't follow my instructions

#8
by alexcardo - opened

Can anyone suggest how to properly instruct the model so that it follows my instructions correctly?

I ask the model to write an outline for an article titled "Here is the title".

Next, I ask the model to write a detailed article using this outline. Not a single model has managed to cope with this task, not even this one. Every model omits the subheadings it created itself. Why is it so difficult for a modern model to follow this simple instruction?

Also, when I ask the model to format the article in markdown, this one, for example, produces Setext-style heading underlines (`=====` and `--------`) instead of ATX-style `#` and `##`. Even a small 4K Mistral copes with this.
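To illustrate, this is what I get versus what I want (the example text is made up):

```
What I get (Setext style):

Title of the Article
====================

Subheading
----------

What I want (ATX style):

# Title of the Article

## Subheading
```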

I'm completely frustrated. Every model either omits subheadings from the outline or rewrites them, and my instructions to strictly follow the subheading structure get no results.

I can't believe that such a high-end model can't do such a simple thing.

I would appreciate any suggestions.

Thank you!

Edited: false alarm with Q6

I'm running the Q4_K_M quant on CPU on my dedicated server; Q6 is too big for the server's 32 GB of RAM, and my MacBook M1 has only 8 GB. So Q4_K_M is the only way for me to run the model.

The only thing I want is to force the model to follow my instructions as written. If I ask it to write something using a provided outline, I don't want it to omit subheadings. People claim this model surpasses GPT-3.5 and even GPT-4. OK, I understand that a quantized model loses some quality, but for me this is the key thing a model should adhere to: following the instructions.

Meanwhile, GPT-3.5 follows these instructions strictly.

P.S.: I haven't used ChatGPT since the release of llama.cpp... but to clarify, ChatGPT really does follow the instructions provided (at least when it comes to subheadings).

P.P.S.: By the way, the model is excellent. Once I can force it to do what I want, I'll forget about ChatGPT even for urgent purposes.

@alexcardo there are a few things you could do.
I'm not sure it will work exactly, but you can use grammars in llama.cpp to force the model to output in a specific format, like JSON or chess notation. You can write your own custom grammar file so the output follows whatever format you need.
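For instance, here is a minimal GBNF sketch (the rule names and structure are my own invention, so treat it as a starting point rather than a tested grammar) that only accepts one ATX `#` title followed by `##` sections:

```
# Hypothetical sketch: one "# " title, then one or more "## " sections.
root      ::= h1 section+
h1        ::= "# " line "\n\n"
section   ::= "## " line "\n\n" paragraph+
line      ::= [^\n]+
paragraph ::= [^\n#] [^\n]* "\n\n"
```

A grammar like this guarantees the heading syntax, but note that it can't stop the model from changing the heading text itself.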

Making the grammar file might be a bit tricky and might not even work, so you could also try few-shot prompting:
just provide a few examples in the prompt (even one will work) and it should perform much better. A sketch follows below.
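For example, a one-shot prompt could look like this (the outline topics are made-up placeholders):

```
Write the article using the outline. Keep every subheading exactly as
written, in markdown (## format).

Example outline:
## Why Sleep Matters
## Common Sleep Myths

Example article:
## Why Sleep Matters
Sleep affects memory, mood, ...

## Common Sleep Myths
One persistent myth is that ...

Now write the article for this outline:
## <your first subheading>
## <your second subheading>
```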

Thank you so much for this advice. I knew nothing about grammar usage; I'll Google the subject. Unfortunately, there is too little information out there about local LLMs and llama.cpp. I'd appreciate it if you could expand on the subject a bit more!

Thank you!

Basically, a grammar is a notation that describes the valid syntax of a piece of text.

llama.cpp itself ships example grammars you can check out; in short, you can force the model to output only text that matches a certain format.
Here are some examples, such as JSON and chess moves, along with much more information:
https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
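Once you have a grammar file, you can pass it to llama.cpp's main example with the `--grammar-file` flag (the model and grammar file paths below are placeholders):

```
./main -m ./models/your-model.Q4_K_M.gguf \
  --grammar-file ./grammars/markdown-headings.gbnf \
  -p "Write a detailed article using this outline, keeping every ## subheading exactly as written: ..."
```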
