Kimi K2.75 Code - GGUF

GGUF conversions and quantizations of the experimental merged checkpoint freakyskittle/kimi-k2.75-code (a shard-wise SLERP merge of moonshotai/Kimi-K2.7-Code and moonshotai/Kimi-K2.6, DeepSeek-V3 style MoE).

See the base repository for the full merge recipe, pruning details, and license.

Parameter count: ~720 B. The BF16 and Q8_0 builds contain 1,096 tensors summing to 719.8 B parameters. Hugging Face's auto-detected badge reads ~61 B because it parses the deep55/ file first. That build stores weights in a packed-int4 layout (int32 weight_packed tensors holding 8 weights each, plus weight_scale/weight_shape), so summing the stored tensor shapes does not give the logical weight count. The badge therefore understates the model: the full model has ~720 B parameters, and only the deep-pruned deep55/ variant has fewer.
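To illustrate why a naive element count undershoots on a packed build, here is a minimal Python sketch. Only the suffixes weight_packed, weight_scale and weight_shape and the factor of 8 weights per int32 come from the description above. The tensor names and shapes in the example are invented for illustration, and leaving scale tensors out of the logical count is an assumption rather than a statement of how the badge or the author counts.

```python
from math import prod

WEIGHTS_PER_INT32 = 8  # eight 4-bit weights per int32, as described above


def naive_count(tensors):
    """Sum the stored elements of every tensor, as a generic parser might."""
    return sum(prod(shape) for shape in tensors.values())


def logical_count(tensors):
    """Count logical weights: expand packed tensors, skip quantization metadata."""
    total = 0
    for name, shape in tensors.items():
        if name.endswith("weight_packed"):
            total += prod(shape) * WEIGHTS_PER_INT32
        elif name.endswith(("weight_scale", "weight_shape")):
            continue  # treated as metadata here (assumption)
        else:
            total += prod(shape)
    return total


# Hypothetical tensors, not taken from the real checkpoint.
example = {
    "layer.0.proj.weight_packed": (1024, 128),  # 1024 x 1024 logical weights
    "layer.0.proj.weight_scale": (1024, 32),
    "layer.0.proj.weight_shape": (2,),
    "layer.0.norm.weight": (1024,),
}

print(naive_count(example))    # 164866
print(logical_count(example))  # 1049600
```

On this toy input the naive sum is roughly six times smaller than the logical count, which is the same kind of gap the badge shows.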

Files

Each variant lives in its own folder. Files over Hugging Face's 500 GB per-file limit are split into GGUF shards (-0000N-of-0000M.gguf); point your loader at the first shard and the rest are picked up automatically.
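The shard naming can be sketched in Python as follows. This reads the -0000N-of-0000M.gguf pattern as five-digit, zero-padded, 1-based indices, which is an assumption; the prefix "model-q8_0" is a placeholder, not a file name from this repository.

```python
def shard_names(prefix, total):
    """Return the file names for a model split into `total` GGUF shards."""
    if total < 1:
        raise ValueError("total must be at least 1")
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def first_shard(prefix, total):
    """The file to hand to a loader; the other shards are located from it."""
    return shard_names(prefix, total)[0]


# "model-q8_0" is a hypothetical prefix used only for illustration.
print(first_shard("model-q8_0", 2))  # model-q8_0-00001-of-00002.gguf
```

Check the actual file listing in the folder you download before relying on the generated names.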

| Folder | Variant | Approx. size | Notes |
| --- | --- | --- | --- |
| deep55/ | Deep-pruned, full precision | ~203 GB | Prune ratio 0.55, deepseek_v3 arch override |
| pruned-compact-oxidize-q4/ | Q4_K_M (oxidize) | ~406 GB | Quantized from the compacted pruned checkpoint |
| llamacpp-q4-partial/ | Q4_K_M (llama.cpp, partial) | ~435 GB | Partial llama.cpp quantization |
| unpruned-q8/ | Q8_0 (unpruned) | ~765 GB | Split into 2 shards |
| pruned-bf16/ | BF16 (pruned) | ~1.44 TB | Split into 4 shards |
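As a rough sanity check, the BF16 and Q8_0 sizes can be reproduced from the 719.8 B parameter count. The sketch below assumes decimal gigabytes, ignores file metadata, and treats every tensor as stored in the named type. The Q8_0 figure of 8.5 bits per weight follows from its block layout (32 int8 weights plus one fp16 scale, 34 bytes per block). The Q4_K_M and deep55/ rows are left out because they mix tensor types or are pruned.

```python
PARAMS = 719.8e9  # parameter count quoted for the BF16 and Q8_0 builds

# Bits per weight for the two uniform formats.
BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "Q8_0": 34 * 8 / 32,  # 34 bytes per block of 32 weights = 8.5 bits
}


def approx_size_gb(params, bits_per_weight):
    """Estimate file size in decimal GB, ignoring metadata overhead."""
    return params * bits_per_weight / 8 / 1e9


for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{approx_size_gb(PARAMS, bpw):.0f} GB")
# BF16: ~1440 GB
# Q8_0: ~765 GB
```

Both estimates agree with the approximate sizes listed for pruned-bf16/ and unpruned-q8/.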

Provenance

  • Base merge + pruning: see freakyskittle/kimi-k2.75-code.
  • GGUF conversion via oxidize-convert with --arch deepseek_v3.
  • Sharding via llama-gguf-split (--split-max-size 450G).
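The sharding step can be sketched in Python as below. The shard-count function gives only a lower bound, since the splitter cuts at tensor boundaries and the real count may be higher; it also assumes 450G means 450 decimal GB. The command builder shows one plausible invocation with hypothetical input and output paths, and does not run anything. Confirm the argument order against llama-gguf-split --help for your llama.cpp build.

```python
from math import ceil


def min_shards(total_gb, max_shard_gb=450):
    """Lower bound on the number of shards for a given per-shard size cap."""
    return ceil(total_gb / max_shard_gb)


def split_command(src, out_prefix, max_size="450G"):
    """Build (but do not execute) a llama-gguf-split argument list."""
    return ["llama-gguf-split", "--split", "--split-max-size", max_size,
            src, out_prefix]


print(min_shards(765))   # 2
print(min_shards(1440))  # 4

# Hypothetical paths, for illustration only.
print(" ".join(split_command("model-q8_0.gguf", "unpruned-q8/model-q8_0")))
```

With the approximate sizes from the file listing, this gives 2 shards for the Q8_0 build and 4 for the BF16 build, while both Q4_K_M builds fit in a single file.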

Status

Experimental research artifact, not fully evaluated. Validate quality before any production use. Use at your own risk.

License

Follows the Modified MIT License from Moonshot AI (see the base repository). Commercial attribution requirement applies: products/services exceeding 100M monthly active users or US$20M monthly revenue must prominently display Kimi K2.7 Code in the UI.
