Please update the files to enable support for BitnetForCausalLM (this model type was added to llama.cpp a few hours ago).

#94 opened by NikolayKozloff
ggml.ai org

This is supported. For context, every 6 hours, the space restarts and pulls from the latest llama.cpp (I can bring it down to 3 hours, if you think it's useful)

I'm sorry, but there seems to be some mistake here. Look at the dates when the files in your repo were last updated (visible on the "Files" tab):
[screenshot of the "Files" tab showing last-update dates]
So app.py was last updated 9 days ago.

Also, one of the latest llama.cpp updates added the ability to create GGUFs for Viking models: https://github.com/ggerganov/llama.cpp/releases/tag/b3248. But when I run your repo trying to create a Q8_0 GGUF for this Viking model: https://huggingface.co/LumiOpen/Viking-7B, I get this error: "WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!" How can that be, if you say that your repo is always updated to the latest version of llama.cpp?
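(For reference, a minimal sketch of how this HF-to-GGUF conversion step can be reproduced locally against a llama.cpp checkout. The checkout path, output filename, and converter script name are assumptions and may differ by llama.cpp revision; the BPE pre-tokenizer warning quoted above is the message the converter prints when it does not recognize the model's pre-tokenizer.)

```python
# Hypothetical local reproduction of the conversion step, assuming a llama.cpp
# checkout at ./llama.cpp and network access to download the model.
import subprocess
from huggingface_hub import snapshot_download

# Download the model weights and tokenizer files from the Hub.
model_dir = snapshot_download(repo_id="LumiOpen/Viking-7B")

# Run llama.cpp's HF -> GGUF converter; --outtype q8_0 writes a Q8_0 GGUF directly.
# If the tokenizer's pre-tokenizer is unknown to this llama.cpp revision, the script
# emits the "** WARNING: The BPE pre-tokenizer was not recognized!" warning.
subprocess.run(
    [
        "python", "llama.cpp/convert_hf_to_gguf.py",
        model_dir,
        "--outtype", "q8_0",
        "--outfile", "Viking-7B-Q8_0.gguf",
    ],
    check=True,
)
```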

ggml.ai org

Good question: we don't need to update the main directory manually; it updates automatically via this snippet: https://huggingface.co/spaces/ggml-org/gguf-my-repo/blob/main/app.py#L377
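For context, the auto-update pattern described above can be implemented as a periodic job that factory-restarts the Space so it rebuilds against the latest llama.cpp. This is a minimal sketch, not the exact contents of app.py; the token handling and the 6-hour interval are assumptions based on this thread:

```python
# Hypothetical sketch of a scheduled factory restart for a Hugging Face Space,
# assuming an HF_TOKEN environment variable with write access to the Space.
import os
from apscheduler.schedulers.background import BackgroundScheduler
from huggingface_hub import HfApi

def restart_space():
    # factory_reboot=True rebuilds the Space image, which pulls the latest llama.cpp.
    HfApi(token=os.environ["HF_TOKEN"]).restart_space(
        repo_id="ggml-org/gguf-my-repo", factory_reboot=True
    )

scheduler = BackgroundScheduler()
scheduler.add_job(restart_space, "interval", hours=6)  # the 6-hour cadence mentioned above
scheduler.start()
```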

Are you sure that this model can be quantised in the first place? I'll factory reset the space now.

The factory reset helped. Thank you very much. I finally created a GGUF for Viking-7B: https://huggingface.co/NikolayKozloff/Viking-7B-Q8_0-GGUF
