Ornith-1.0-35B-bf16

Full-precision (bfloat16) MLX build of deepreinforce-ai/Ornith-1.0-35B, produced with mlx-vlm 0.6.3. The build is fully multimodal (vision encoder plus language model) and is not quantized, so there is no quantization loss. It targets Apple Silicon and runs in mlx-vlm or any MLX-based app.

≈70 GB on disk; fits in 128 GB unified memory. Use a quantized sibling (3-, 4-, 5-, 6- or 8-bit) on smaller machines.
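The ≈70 GB figure is consistent with simple arithmetic: roughly 35 billion parameters at 16 bits each. The sketch below applies the same estimate to the quantized bit widths. It assumes the nominal 35B count taken from the model name, and it ignores the per-group scales and biases that quantized formats store, so real quantized checkpoints will be somewhat larger than these numbers. The function name `weight_gb` is illustrative only.

```python
# Rough weight-size estimate; PARAMS is the nominal count from the model name.
PARAMS = 35e9


def weight_gb(bits_per_param, params=PARAMS):
    """Approximate weight size in decimal gigabytes (weights only)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # 16-bit is this build; the others are the quantized siblings.
    for bits in (16, 8, 6, 5, 4, 3):
        print(f"{bits:>2}-bit: ~{weight_gb(bits):.1f} GB")
```

Peak memory during generation is higher than the weight size alone, since activations and the KV cache also need room.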

Conversion note (MoE expert fusion)

Ornith stores its 256 MoE experts unfused (per-expert), but mlx-vlm's qwen3_5_moe loader expects them fused/batched. A sanitize monkeypatch was required to stack the experts before conversion; without it the conversion failed.
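As a rough illustration of what "stacking the experts" means, the sketch below groups per-expert tensors by layer and projection and fuses each group into one batched entry. It is not the actual monkeypatch: the key pattern (`...experts.<index>.<proj>.weight`), the fused key name, and the function `stack_experts` are all assumptions made for the example, and the names the real loader expects may differ. The stacking function is passed in so that `mx.stack` or `np.stack` can be used on real arrays; the default simply returns the experts as an ordered list.

```python
import re
from collections import defaultdict

# Hypothetical per-expert key layout; the real checkpoint's names may differ.
EXPERT_KEY = re.compile(
    r"^(?P<prefix>.*\.experts)\.(?P<idx>\d+)\.(?P<proj>\w+)\.weight$"
)


def stack_experts(weights, stack_fn=list):
    """Fuse per-expert tensors into one batched entry per projection.

    weights:  dict mapping parameter names to tensors.
    stack_fn: callable that stacks a list of tensors along a new leading
              axis (for example mx.stack or np.stack).
    """
    fused = {}
    grouped = defaultdict(dict)
    for key, value in weights.items():
        match = EXPERT_KEY.match(key)
        if match is None:
            fused[key] = value  # non-expert tensors pass through unchanged
            continue
        group = (match["prefix"], match["proj"])
        grouped[group][int(match["idx"])] = value
    for (prefix, proj), experts in grouped.items():
        # Order by expert index so entry i of the stack is expert i.
        # A missing index raises KeyError instead of silently misaligning.
        ordered = [experts[i] for i in range(len(experts))]
        fused[f"{prefix}.{proj}.weight"] = stack_fn(ordered)
    return fused
```

With array inputs of shape `(out, in)` and a real stacking function, each fused entry would have shape `(num_experts, out, in)`, which is the batched layout a fused-expert loader generally expects.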

Usage

uvx --from mlx-vlm mlx_vlm.generate \
  --model mlx-community/Ornith-1.0-35B-bf16 --image image.png \
  --prompt "Describe this image." --max-tokens 512

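from mlx_vlm import load, generate
model, processor = load("mlx-community/Ornith-1.0-35B-bf16")
# The prompt is passed raw here; applying the model's chat template first usually gives better results.
print(generate(model, processor, "Describe this image.", ["image.png"], max_tokens=512))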

Conversion check

Smoke-tested after conversion: output was coherent on both an image prompt (it correctly read an evaluation bar chart) and a text reasoning prompt (17 * 24 solved as 408), with no repetition loop. Generation ran at 69 tok/s with a 72 GB memory peak on a MacBook Pro (M5 Max, 128 GB, 40-core GPU).

Refer to the original model card for architecture, benchmarks, license, and intended use.

