llama.cpp-b2234 generates poor output

#1
by wyklq - opened

It looks like the GGUF file is not compatible with the llama.cpp b2234 release.
I tried "gemma-7b-it-Q4_K_M.gguf" with the prompt "write a python program to calculate pi with monte carlo method".
Its output is worse than that of "gemma-2b-it-Q4_K_M.gguf" from another repository.
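For reference, a correct answer to that prompt only needs a few lines; here is a minimal sketch of the standard Monte Carlo estimate, which is roughly what a well-behaved model should produce:

```python
import random

def estimate_pi(samples: int = 1_000_000) -> float:
    """Estimate pi by sampling points in the unit square and
    counting how many fall inside the quarter circle of radius 1."""
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # Area ratio is (pi/4) / 1, so pi is approximately 4 * inside / samples.
    return 4.0 * inside / samples

print(estimate_pi())
```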

Second State org

@wyklq The GGUF models were generated with b2230 and were also tested against b2230. Some changes were introduced into llama.cpp after b2230, so we are not sure whether the models are compatible with b2234. In any case, we'll track the changes in llama.cpp and update the models in the near future.
In addition, in my personal experience, 2b-it-Q8_0 performs better. You can try it.

OK, it turns out to be an issue with the original model.
I found the discussion https://huggingface.co/google/gemma-7b-it/discussions/38
The workaround described there works, i.e. set "Presence penalty" to 1.
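For anyone applying the same workaround programmatically, here is a minimal sketch assuming an OpenAI-compatible chat endpoint that accepts a `presence_penalty` field (llama.cpp's built-in server exposes one; the URL and model name below are placeholders for your own setup):

```python
import requests

# Placeholder endpoint; adjust to wherever your llama.cpp server is listening.
url = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "gemma-7b-it-Q4_K_M",  # placeholder model name
    "messages": [
        {"role": "user",
         "content": "write a python program to calculate pi with monte carlo method"}
    ],
    # The workaround from the linked discussion: penalize repeated tokens.
    "presence_penalty": 1.0,
}

response = requests.post(url, json=payload, timeout=120)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```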

wyklq changed discussion status to closed
