A lot of "G"s

#1
by tastypear - opened

I'm using the latest llama.cpp server to load this model, and it outputs "GGGGGGGGG..." (the same happens with the Very_Berry_Qwen2_7B GGUF).

Maybe it's a LoRA-related bug; I'm not sure.

I am running it fine in Llama.RN on mobile. Are you sure your llama.cpp is up to date?

@tastypear you could also try the official quants from mradermacher. The ones on my page are just meant for debugging and are generated automatically.

https://huggingface.co/mradermacher/Very_Berry_Qwen2_7B-i1-GGUF

Berry v2 is in the queue now.

Oh, I figured out the issue; it's not a problem with the model.

When loading the model with the llama.cpp server, the -fa argument (flash attention) is required.
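
For anyone else hitting this, here is a rough sketch of how the launch command might be assembled from a script. The binary name `llama-server`, the `-m` model flag, and the `.gguf` file name are assumptions for illustration (older builds name the binary differently, and newer builds may expect a value after `-fa`), so check `--help` for your build.

```python
import shlex


def build_server_command(model_path, flash_attention=True, binary="llama-server"):
    """Build the argv list for starting the llama.cpp server.

    `binary` is an assumed executable name; adjust it to match your build.
    """
    cmd = [binary, "-m", model_path]
    if flash_attention:
        # -fa enables flash attention, which fixed the "GGGG..." output here
        cmd.append("-fa")
    return cmd


if __name__ == "__main__":
    # Hypothetical model file name; replace with the path to your own GGUF file
    print(shlex.join(build_server_command("Very_Berry_Qwen2_7B.gguf")))
```

The function only builds the argument list; pass it to `subprocess.run` if you want to actually start the server.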

Anyway, thank you for your response🤗
