Help with llama
When I'm launching the model the way you suggested, `llama-cli -cnv -m ./EXAONE-3.5-7.8B-Instruct-BF16.gguf -p "You are EXAONE model from LG AI Research, a helpful assistant."`, the performance is very POOR. I compared it to https://huggingface.co/spaces/LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct-Demo, and the demo is much better: it answers all my test questions correctly, but running it locally, all the answers are wrong.
So, what am I doing wrong?
(I'm actually running EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf, as I always do, but I know it can't be that bad.)
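For reference, the command as described, substituting the Q4_K_M file I'm actually using, would be something like this:

```sh
llama-cli -cnv -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  -p "You are EXAONE model from LG AI Research, a helpful assistant."
```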
Hello @urtuuuu, thank you for using EXAONE 3.5!
As I understand it, you got poor generation results from `EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf` with the example scripts in the README.
To reproduce your problem, we need to know your generation configuration.
Can you share one of your prompts and sampling configuration? The sampling configuration can be retrieved from the log when you run the example script.
Additionally, please share your wrong answer from `Q4_K_M.gguf` and the correct answer from the Demo.
> To reproduce your problem, we need to know your generation configuration.
Well, I kind of fixed this problem. The generation configuration is just what I saw in the demo (--temp 0.7 --top_p 0.9 --top_k 1); everything else is at the defaults.
What fixed the problem was that instead of `-p "You are EXAONE model from LG AI Research, a helpful assistant."` I changed it to `-p "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Hello!\n[|assistant|]"`.
No idea why llama.cpp wants me to type the correct prompt format inside `-p "..."`... But after I did this, the model started to answer my test questions correctly (reasoning questions).
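Put together, the working invocation looks roughly like this (a sketch based on what I described above; exact flag spellings may differ between llama.cpp builds):

```sh
# Sampling settings copied from the demo; full EXAONE chat format written by hand in -p.
llama-cli -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  --temp 0.7 --top_p 0.9 --top_k 1 \
  -p "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Hello!\n[|assistant|]"
```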
@urtuuuu, thank you for sharing!
I guess you missed the `-cnv` option, which runs llama.cpp in conversational mode. In conversational mode, the prompt passed with `-p` becomes the system prompt. However, if you do the same thing without the `-cnv` option, the prompt after `-p` becomes the input as well.
Please refer to the llama-cli section of the llama.cpp README for more details.
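To make the difference concrete, the two invocations from this thread would look roughly like this (a sketch; behavior may vary between llama.cpp versions):

```sh
# Conversational mode: -p supplies only the system prompt, and llama-cli
# applies the model's chat template to each turn automatically.
llama-cli -cnv -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  -p "You are EXAONE model from LG AI Research, a helpful assistant."

# Without -cnv: the -p text is fed to the model as raw input, so the EXAONE
# chat-format tokens have to be written out by hand, as in the workaround above.
llama-cli -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  -p "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Hello!\n[|assistant|]"
```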