onekq posted an update 3 days ago
Heard good things about this model, but no inference providers support it ...

THUDM/GLM-4-9B-0414

It works on llama.cpp.

Here is how you can run it:

llama-server -ngl 999 --host 192.168.1.68 \
  --override-kv glm4.rope.dimension_count=int:64 \
  --override-kv tokenizer.ggml.eos_token_id=int:151336 \
  -m /mnt/nvme0n1/LLM/quantized/GLM-4-9B-0414-Q8_0.gguf

Read here for why the --override-kv flags are needed:

Eval bug: GLM-Z1-9B-0414 · Issue #12946 · ggml-org/llama.cpp:
https://github.com/ggml-org/llama.cpp/issues/12946#issuecomment-2803564782
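Once the server is up, you can talk to it over llama.cpp's OpenAI-compatible HTTP API. A minimal sketch, assuming the host from the flags above and llama-server's default port 8080 (the model field is largely ignored by llama-server, which serves whatever GGUF it was started with):

```shell
# Build a chat-completion request payload for the llama-server HTTP API.
cat > /tmp/glm4_request.json <<'EOF'
{
  "model": "GLM-4-9B-0414",
  "messages": [{"role": "user", "content": "Hello"}]
}
EOF

# Send it to the server started above (host/port are assumptions; adjust to yours):
# curl http://192.168.1.68:8080/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d @/tmp/glm4_request.json
```

The curl line is left commented out since it only works once your llama-server instance is actually running at that address.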

·

Ah, I see. They have their own architecture.

https://github.com/huggingface/transformers/pull/37388

This will be hard.
