How to use giaki3003/Ornith-1.0-9B-4bit-MTP-MLX-Serve with MLX:
# Download the model from the Hub (the extra is quoted so zsh does not glob the brackets)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir Ornith-1.0-9B-4bit-MTP-MLX-Serve giaki3003/Ornith-1.0-9B-4bit-MTP-MLX-Serve
Ornith-1.0-9B 4-bit (MLX) + MTP head (trained)
A 4-bit MLX build of Ornith-1.0-9B with a
Multi-Token-Prediction head as an mtp/weights.safetensors sidecar, for native MTP speculative
decoding in mlx-serve on Apple Silicon.
Use (mlx-serve, macOS / Apple Silicon)
Point mlx-serve at this folder. MTP is enabled automatically when the sidecar is present (no config needed):
mlx-serve --model ./Ornith-1.0-9B-4bit-MTP-MLX-Serve
# or download via the model browser in MLX Core.app
Opt out with --no-mtp at launch or enable_mtp:false per request; increase the draft depth with --mtp-depth.
The base model verifies every drafted token (exact rejection sampling), so the output distribution is unchanged and only decoding gets faster.
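Why verification leaves the output distribution unchanged can be shown with a toy version of the standard speculative-sampling accept/reject rule. This is a minimal sketch in plain Python, not mlx-serve's implementation: the function name verify_draft_token and the three-token distributions are made up for illustration.

```python
import random

def verify_draft_token(p, q, drafted, rng):
    """Accept or replace one drafted token (hypothetical helper).

    p: base-model probabilities over the vocabulary
    q: draft-head probabilities over the same vocabulary
    drafted: token index that was sampled from q
    Returns the token index to emit.
    """
    # Accept the drafted token with probability min(1, p/q).
    if rng.random() < min(1.0, p[drafted] / q[drafted]):
        return drafted
    # On rejection, resample from the residual max(0, p - q).
    # rng.choices normalizes the weights itself.
    residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
    return rng.choices(range(len(p)), weights=residual, k=1)[0]

# Toy distributions (illustrative values only).
p = [0.6, 0.3, 0.1]   # base model
q = [0.2, 0.5, 0.3]   # draft head, deliberately mismatched

rng = random.Random(0)
n = 100_000
counts = [0] * len(p)
for _ in range(n):
    drafted = rng.choices(range(len(q)), weights=q, k=1)[0]
    counts[verify_draft_token(p, q, drafted, rng)] += 1

freqs = [c / n for c in counts]
print(freqs)  # empirical frequencies should track p, not q
```

Even with a poorly matched draft distribution, the emitted tokens follow p; a worse draft only lowers the acceptance rate and therefore the speedup.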
What's inside
- Base: 4-bit MLX Ornith-9B (qwen3_5, hidden 4096, g64), repackaged from pavantippannagari/Ornith-1.0-9B-mlx-4Bit.
- mtp/weights.safetensors: KL-distilled head re-aligned to Ornith (from protoLabsAI/Ornith-1.0-9B-MTP); 15 tensors, bf16 (mlx-serve's loadLinear accepts plain bf16 linears).
Validation
Tensor names / shapes / dtype statically verified against mlx-serve's src/mtp.zig loader
(fc [H,2H]→[2H,H]=[8192,4096], all 15 names present, bf16 linears). Base hidden matches head; fc geometry passes validateGeometry.
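A rough local version of the count and dtype part of this check can be done by reading only the safetensors header, which is an 8-byte little-endian length followed by a JSON table of tensors. The sketch below is not the src/mtp.zig loader and does not check tensor names or fc geometry; read_safetensors_header and check_mtp_sidecar are hypothetical helper names, and the expected values (15 tensors, BF16) come from the description above.

```python
import json
import struct
from pathlib import Path

def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # Optional free-form metadata entry is not a tensor.
    header.pop("__metadata__", None)
    return {name: (info["dtype"], tuple(info["shape"]))
            for name, info in header.items()}

def check_mtp_sidecar(model_dir, expected_tensors=15, expected_dtype="BF16"):
    """Return a list of problems found; an empty list means the checks passed."""
    sidecar = Path(model_dir) / "mtp" / "weights.safetensors"
    if not sidecar.is_file():
        return [f"missing sidecar: {sidecar}"]
    tensors = read_safetensors_header(sidecar)
    problems = []
    if len(tensors) != expected_tensors:
        problems.append(f"expected {expected_tensors} tensors, found {len(tensors)}")
    for name, (dtype, shape) in sorted(tensors.items()):
        if dtype != expected_dtype:
            problems.append(f"{name}: dtype {dtype}, expected {expected_dtype}")
    return problems

if __name__ == "__main__":
    # Folder name taken from the download command; adjust to your path.
    for problem in check_mtp_sidecar("./Ornith-1.0-9B-4bit-MTP-MLX-Serve"):
        print(problem)
```

Printing the shapes returned by read_safetensors_header is a quick way to eyeball the fc dimensions against the hidden size of 4096.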
Not run on-device here (built on a Linux/CUDA box; mlx-serve is Apple-Silicon only) — confirm
acceptance rate on your Mac. A GGUF sibling (smoke-tested, ~0.81–0.83 draft acceptance) is at
giaki3003/Ornith-1.0-9B-MTP-GGUF.
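To put an acceptance rate in perspective, the sketch below estimates how many tokens one base-model verification pass yields. It rests on two assumptions that are not stated in this card: that the reported figure is a per-token acceptance probability, and that drafted positions are accepted independently. The function name expected_tokens_per_step is hypothetical, and the result is tokens per pass, not a wall-clock speedup.

```python
def expected_tokens_per_step(acceptance, depth):
    """Expected tokens emitted per base-model verification pass.

    Assumes each of `depth` drafted tokens is accepted independently with
    probability `acceptance`, and that drafting stops at the first rejection.
    Every pass also yields one token from the base model itself, giving
    1 + a + a^2 + ... + a^depth.
    """
    return sum(acceptance ** i for i in range(depth + 1))

# GGUF sibling's reported range, used here only as example inputs.
for a in (0.81, 0.83):
    for depth in (1, 2, 3):
        print(f"acceptance={a} depth={depth}: "
              f"{expected_tokens_per_step(a, depth):.3f} tokens/pass")
```

Under these assumptions the gain from a larger --mtp-depth shrinks with each extra position, so measuring the acceptance rate on your own Mac is the better guide.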
License: MIT (derivative of MIT-licensed Ornith-1.0-9B).
Model tree for giaki3003/Ornith-1.0-9B-4bit-MTP-MLX-Serve
Base model
deepreinforce-ai/Ornith-1.0-9B