It works on llama.cpp.
Here is how you can run it:
llama-server -ngl 999 --host 192.168.1.68 --override-kv glm4.rope.dimension_count=int:64 --override-kv tokenizer.ggml.eos_token_id=int:151336 -m /mnt/nvme0n1/LLM/quantized/GLM-4-9B-0414-Q8_0.gguf
Read here why the overrides are needed:
Eval bug: GLM-Z1-9B-0414 · Issue #12946 · ggml-org/llama.cpp:
https://github.com/ggml-org/llama.cpp/issues/12946#issuecomment-2803564782
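For anyone adapting the command to their own setup, here is a rough sketch that rebuilds it piece by piece. The variable names (NGL, HOST, ROPE_DIMS, EOS_ID, MODEL, CMD) are made up for illustration and are not part of llama.cpp. The comments are my reading of the flags and of the linked issue, so treat them as assumptions. The script only prints the command and does not start the server.

```shell
#!/bin/sh
# Hypothetical variable names; only the flags and values come from the post.

NGL=999              # -ngl: number of layers to offload to the GPU (999 = effectively all)
HOST=192.168.1.68    # --host: address the server listens on (use your own)
ROPE_DIMS=64         # override for the glm4.rope.dimension_count metadata key
EOS_ID=151336        # override for the tokenizer.ggml.eos_token_id metadata key
MODEL=/mnt/nvme0n1/LLM/quantized/GLM-4-9B-0414-Q8_0.gguf

# --override-kv takes KEY=TYPE:VALUE and replaces a metadata value stored
# in the GGUF file at load time, without re-converting the model.
CMD="llama-server -ngl $NGL --host $HOST"
CMD="$CMD --override-kv glm4.rope.dimension_count=int:$ROPE_DIMS"
CMD="$CMD --override-kv tokenizer.ggml.eos_token_id=int:$EOS_ID"
CMD="$CMD -m $MODEL"

# Print the assembled command so it can be checked before running it.
echo "$CMD"
```

Paste the printed line into a terminal to start the server. If your model path contains spaces, quote it by hand.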
Ah, I see. They have their own architecture.
https://github.com/huggingface/transformers/pull/37388
This will be hard.