Help with llama
When I'm launching the model the way you suggested, `llama-cli -cnv -m ./EXAONE-3.5-7.8B-Instruct-BF16.gguf -p "You are EXAONE model from LG AI Research, a helpful assistant."`, the performance is very POOR. I compared it to https://huggingface.co/spaces/LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct-Demo, and the demo is much better: it answers all my test questions correctly, but running it locally, all the answers are wrong.
So, what am I doing wrong?
(I'm actually running EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf, as I always do, but I know it can't be that bad.)
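For reference, the command as described, substituting the Q4_K_M file I'm actually using, would be something like this:

```sh
llama-cli -cnv -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  -p "You are EXAONE model from LG AI Research, a helpful assistant."
```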
Hello @urtuuuu, thank you for using EXAONE 3.5!
As I understand it, you got poor generation results from `EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf` with the example scripts in the README.
To reproduce your problem, we need to know your generation configuration.
Can you share one of your prompts and sampling configuration? The sampling configuration can be retrieved from the log when you run the example script.
Additionally, please share your wrong answer from `Q4_K_M.gguf` and the correct answer from the Demo.
> To reproduce your problem, we need to know your generation configuration.
Well, I kind of fixed this problem. The generation configuration is just what I saw in the demo (--temp 0.7 --top_p 0.9 --top_k 1); everything else is at the defaults.
What fixed the problem was that instead of `-p "You are EXAONE model from LG AI Research, a helpful assistant."` I changed it to `-p "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Hello!\n[|assistant|]"`.
No idea why llama.cpp wants me to type the correct prompt format inside `-p "..."`... But after I did this, the model started to answer my test questions correctly (reasoning questions).
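Put together, the working invocation looks roughly like this (a sketch based on what I described above; exact flag spellings may differ between llama.cpp builds):

```sh
# Sampling settings copied from the demo; full EXAONE chat format written by hand in -p.
llama-cli -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  --temp 0.7 --top_p 0.9 --top_k 1 \
  -p "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Hello!\n[|assistant|]"
```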
@urtuuuu, thank you for sharing!
I guess you missed the `-cnv` option, which runs llama.cpp in conversational mode. In conversational mode, the prompt passed with `-p` becomes the system prompt. However, if you do the same thing without the `-cnv` option, the prompt after `-p` becomes the input as well.
Please refer to the llama-cli section of the llama.cpp README for more details.
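To make the difference concrete, the two invocations from this thread would look roughly like this (a sketch; behavior may vary between llama.cpp versions):

```sh
# Conversational mode: -p supplies only the system prompt, and llama-cli
# applies the model's chat template to each turn automatically.
llama-cli -cnv -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  -p "You are EXAONE model from LG AI Research, a helpful assistant."

# Without -cnv: the -p text is fed to the model as raw input, so the EXAONE
# chat-format tokens have to be written out by hand, as in the workaround above.
llama-cli -m ./EXAONE-3.5-7.8B-Instruct-Q4_K_M.gguf \
  -p "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Hello!\n[|assistant|]"
```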