Nex-N2-mini, OptiQ 4-bit MLX

This is nex-agi/Nex-N2-mini converted to MLX format and quantized with oMLX's oQ4 mixed-precision scheme (4-bit base, sensitivity-driven bit allocation, MSE-optimal clipping). The result is about 19 GB on disk, roughly 4.7 bits per weight effective, with sensitive tensors such as the linear-attention projections kept at higher precision.

Nex-N2-mini is an agentic model built around what its authors call Agentic Thinking: it interleaves reasoning, tool use, and environment feedback rather than treating them as separate stages. The architecture is a hybrid MoE (qwen3_5_moe): 40 layers alternating linear attention with full attention every fourth layer, 256 experts with 8 active per token, and a 262k-token context window.

The original checkpoint includes a vision tower. MLX text inference does not use it, so the vision weights were dropped during conversion; this copy is text-only. Expect around 21 GB of memory in use during inference.

Usage

With mlx-lm, either directly:

mlx_lm.generate --model Nex-N2-mini-OptiQ-4bit --prompt "Hello"

or as an OpenAI-compatible server:

mlx_lm.server --model Nex-N2-mini-OptiQ-4bit

It also works out of the box with oMLX.

Tool calling works without any extra configuration. The chat template uses the Qwen3-Coder XML style, which mlx-lm and oMLX both detect automatically, so servers return proper structured tool_calls, and thinking ends up in the reasoning field instead of leaking into the response content. Tested end to end with Swival as the harness, including multi-step tasks that exercise file edits, search, and shell commands while the model is thinking.

An 8-bit companion is available at jedisct1/Nex-N2-mini-mlx-8bit.

Downloads last month
76
Safetensors
Model size
6B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for jedisct1/Nex-N2-mini-mlx-OptiQ-4bit

Quantized
(45)
this model

Collection including jedisct1/Nex-N2-mini-mlx-OptiQ-4bit