Safetensors model file name

#1
by afrideva - opened

Many thanks for the merge.

Working on quanting to GGUF; it seems that for single-file models only "model.safetensors" is currently supported: https://github.com/ggerganov/llama.cpp/blob/fbbc42827b2949b95bcde23ce47bb47d006c895d/convert-hf-to-gguf.py#L180 .

Commenting out lines 180-181 works for now; the initial output looks good.
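For anyone hitting the same thing, a gentler workaround than patching the script is to rename the lone .safetensors file to the name the converter expects. A minimal sketch (my own, not from llama.cpp; the model directory path is a made-up example):

```python
# Sketch: rename the single *.safetensors file to "model.safetensors"
# so convert-hf-to-gguf.py picks it up, instead of commenting out the
# check at lines 180-181. The directory is a hypothetical example.
from pathlib import Path

model_dir = Path("./Echo-3B")  # hypothetical local model directory

parts = sorted(model_dir.glob("*.safetensors"))
expected = model_dir / "model.safetensors"

if len(parts) == 1 and parts[0] != expected:
    parts[0].rename(expected)
    print(f"renamed {parts[0].name} -> model.safetensors")
```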

🙏 thanks
OK, I will rename it.

Hey, can you quantize this model? https://huggingface.co/NousResearch/Obsidian-3B-V0.5 I haven't seen any quants of it yet. The projector too, if possible.

I have used https://huggingface.co/nisten/obsidian-3b-multimodal-q6-gguf before.

Ran into issues the last time I tried quanting it myself; I'll try with the latest llama.cpp commit tomorrow.
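For reference, the flow I'd run is roughly the following. This is only a sketch; the file names and the Q6_K choice are assumptions on my part, not something from this thread.

```python
# Rough sketch of the two-step llama.cpp GGUF quant flow (assumed
# file names; run from a llama.cpp checkout with the tools built).
import subprocess

model_dir = "Obsidian-3B-V0.5"            # hypothetical local clone
f16_gguf = "obsidian-3b-v0.5.fp16.gguf"   # intermediate f16 GGUF
q6_gguf = "obsidian-3b-v0.5.q6_k.gguf"    # final quantized file

# 1) Convert the HF checkpoint to an f16 GGUF.
subprocess.run(
    ["python", "convert-hf-to-gguf.py", model_dir, "--outfile", f16_gguf],
    check=True,
)

# 2) Quantize the f16 GGUF down to Q6_K.
subprocess.run(["./quantize", f16_gguf, q6_gguf, "Q6_K"], check=True)
```

Note this only covers the base LM; as far as I know, the multimodal projector is a separate artifact with its own conversion path.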

Quants are up here: https://huggingface.co/afrideva/Echo-3B-GGUF

Thank you. I'm trying to make a 5B StableLM model here: https://huggingface.co/Aryanne/testing-only. I quantized it locally, and at inference I got an error about the graph. I don't know if it's a problem with the .json files, my quantization, or the model itself. Can you take a look?

This is the error:

GGML_ASSERT: ggml.c:15158: cgraph->n_nodes < cgraph->size
Aborted
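For context, that assert means the preallocated compute graph ran out of node slots: ggml fixes the graph's capacity up front, and every op the model emits has to fit under it, so a model with many layers can overflow it. A toy Python sketch of the invariant (my illustration, not ggml's actual code):

```python
# Toy illustration of the failing invariant: the graph's capacity
# ("size") is fixed when it is created, and each op appended bumps
# n_nodes. A deep model (many layers => many ops) overflows it.
class GraphSketch:
    def __init__(self, size: int):
        self.size = size     # fixed capacity, chosen at creation
        self.n_nodes = 0     # ops appended so far

    def add_node(self) -> None:
        # This is the check that fires as GGML_ASSERT and aborts.
        assert self.n_nodes < self.size, "cgraph->n_nodes < cgraph->size"
        self.n_nodes += 1
```

If I remember right, bumping the graph-size constant in llama.cpp (LLAMA_MAX_NODES, if memory serves) was the usual workaround, but I may be misremembering the name.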

Maybe 58 layers was too many 😅

Apparently so; I just tried it with the latest llama.cpp and got the same error... Your Zephyr-3.43B looks good though, quanting now.

afrideva changed discussion status to closed

https://huggingface.co/acrastt/Marx-3B-V2
It seems there are no new quants of this model on HF.
