How did you make these quants?

#1 by rombodawg - opened

Did you use llama.cpp's convert.py to generate these GGUF model files? The community and I have been struggling to figure out why some of these Gemma models, fine-tuned or merged, simply are not working during inference or loading after converting to GGUF. Can you share the code you used to convert from HF tensors to GGUF, if it wasn't llama.cpp?

Or, if it was llama.cpp, was it a new branch or the main branch? A custom method? Please share.

@rombodawg What do you mean? Are my GGUF files working? I'm not doing anything special, but if you are having issues with this model in particular you have to make sure the repeat penalty is disabled (i.e. set it to 1.0), otherwise the model will produce incoherent output.
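
For example, when running inference with llama.cpp's CLI, something along these lines should work (the model path and prompt are placeholders):

./main -m gemma.gguf -p "Why is the sky blue?" --repeat-penalty 1.0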

@dranger003 I'm not having issues with your GGUF model files. I'm having issues with every other GGUF model file made from Gemma that exists; yours seem to be the only ones that are working. The main thing is I'm trying to make new Gemma models with MergeKit, and those resulting models aren't working after quantization. You can see the issue I have opened below, with multiple threads linked at the bottom of that issue.

https://github.com/ggerganov/llama.cpp/issues/5706#issuecomment-1963015755

@rombodawg You are not using llama.cpp directly, that's why. Just use the file from the repo.
https://github.com/ggerganov/llama.cpp/blob/cbbd1efa06f8c09f9dff58ff9d9af509cc4c152b/convert-hf-to-gguf.py#L221
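
For example, from the llama.cpp repo root, an invocation roughly like this (the output name and dtype are placeholders; check the script's --help for the exact options in your checkout):

python convert-hf-to-gguf.py /path/to/gemma-hf-model --outfile gemma-f16.gguf --outtype f16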

@dranger003 I just downloaded that file. I got this error when using it:

NotImplementedError: Architecture "GemmaForCausalLM" not supported!

I don't know what's wrong with that script; I even tried converting a Llama 2 model and got:

NotImplementedError: Architecture "LlamaForCausalLM" not supported!

Only convert.py works for me, which is what the official documentation says to use when converting to GGUF.
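
That is, an invocation roughly like this works for me (paths are placeholders):

python convert.py /path/to/hf-model --outfile model-f16.gguf --outtype f16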
