MiniCPM5-1B-GGUF (Q4_K_M)

Mirror of the MiniCPM5-1B-Q4_K_M.gguf file from openbmb/MiniCPM5-1B-GGUF. The IceSpiritAI_Chat Android app uses it as its MiniCPM5-1B GGUF backend, served through llama.cpp, as an alternative to the default Qwen3.5-2B-MNN LLM.

Identity

| Field | Value |
| --- | --- |
| Source | huggingface.co/openbmb/MiniCPM5-1B-GGUF (official) |
| File | MiniCPM5-1B-Q4_K_M.gguf |
| Size | 688,065,920 bytes (656.19 MiB) |
| SHA-256 | 81b64d05a23b17b34c475f42b3e72fbde62d4b92cc34541f7a8031d0752deafa |
| Architecture | Standard LlamaForCausalLM (per OpenBMB model card) |
| Params | 1.08B (24 layers, GQA 16+2, ctx 131072) |
| Tokenizer | gpt2 (llama-bpe pre-tokenizer) |
| Uploaded | 2026-06-25 |
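
The size and SHA-256 values above can be used to check a downloaded copy against the official file. The following is a minimal sketch, not part of the app or of any official tooling: the function names `sha256_of` and `verify_model` are hypothetical, and the local path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path

# Values copied from the Identity table above.
EXPECTED_SIZE = 688_065_920
EXPECTED_SHA256 = "81b64d05a23b17b34c475f42b3e72fbde62d4b92cc34541f7a8031d0752deafa"


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in 1 MiB chunks so the whole model is never held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, expected_size=EXPECTED_SIZE, expected_sha256=EXPECTED_SHA256):
    """Return True only if both the byte size and the SHA-256 digest match."""
    p = Path(path)
    # Cheap check first: a size mismatch makes hashing unnecessary.
    if p.stat().st_size != expected_size:
        return False
    return sha256_of(p) == expected_sha256.lower()
```

For example, `verify_model("MiniCPM5-1B-Q4_K_M.gguf")` should return `True` for an intact copy saved under that (assumed) local file name, and `False` for a truncated or altered download.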

Why this mirror exists

IceSpiritAI_Chat is a dual-LLM Android app. The default LLM is Qwen3.5-2B-MNN (small, fast, on-device MNN); the alternative is MiniCPM5-1B-GGUF (slightly larger, higher-quality generations, served by a llama.cpp native pipeline). Users in mainland China without reliable access to huggingface.co can use this mirror or the ModelScope mirror AlexZh/MiniCPM5-1B-GGUF.
