Nex-N2-Pro-GGUF

Overview

This repository contains quantized GGUF files for nex-agi/Nex-N2-Pro.


An agentic model with Agentic Thinking.

Today, we are officially releasing and open-sourcing our next-generation model, Nex-N2 — an agent model built for real-world productivity scenarios. With first-tier coding and agentic capabilities, Nex-N2 keeps driving complex, long-horizon tasks forward in real environments to deliver stable, end-to-end results.

Over the past year, a paradigm shift led by Vibe Coding and Harness Engineering has been redefining the limits of LLM agents. From dialogue, to reasoning, to agents that execute long-horizon tasks with environmental feedback, the tasks models must handle keep growing harder, the contexts longer, and the environments more realistic. The core of next-generation model competition is no longer whether a model can think, but whether it can reliably and efficiently turn thinking into actions that are executable, verifiable, and iterable.

Rather than treating reasoning, tool use, and environment execution as separate capabilities, Nex-N2 unifies them through an Agentic Thinking framework that connects requirement understanding, task planning, code implementation, environmental feedback, evaluation and debugging, and continuous iteration into a single closed loop. The framework has two parts:

  • Adaptive Thinking lets the model decide on its own when to think and how deeply — executing simple actions quickly while reasoning thoroughly on critical decisions.
  • Coherent Thinking applies one reasoning paradigm to both general reasoning and diverse agentic tasks, keeping behavior consistent across tasks and modalities so that capabilities transfer stably.

Across real agentic workflows — agentic coding, deep research, tool calling, and terminal execution — Nex-N2 reaches first-tier performance, with substantial gains over the previous-generation Nex-N1 on multiple authoritative benchmarks. In real productivity scenarios such as OpenClaw one-person-company workflows, end-to-end game development, and web and multimodal generation, it likewise demonstrates outstanding usability, robustness, and stability.


Performance

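| Benchmark | Nex-N2-mini | Nex-N2-Pro | GPT-5.5 | Opus 4.7 | Kimi-K2.6 | GLM-5.1 | MiniMax M3 | DeepSeek-V4-Pro |
|---|---|---|---|---|---|---|---|---|
| Agent | | | | | | | | |
| BrowseComp | 74.1 | 83.7 | 84.4 | 79.8 | 83.2 | 79.3 | 83.5 | 83.4 |
| GDPval | 1402 | 1585 | 1769 | 1753 | 1481 | 1535 | - | 1554 |
| Toolathlon | 33.3 | 51.9 | 55.6 | 52.8 | 50.0 | 40.7 | - | 51.8 |
| WildClawBench | 47.7 | 53.5 | 58.2 | 62.2 | - | 48.2 | - | 43.7 |
| WideSearch | 62.0 | 75.6 | - | - | 80.8 | - | - | - |
| TAU3 | 65.9 | 71.1 | - | - | - | 70.6 | - | - |
| Coding & SWE | | | | | | | | |
| SWE-Bench Pro | 50.2 | 58.8 | 58.6 | 64.3 | 58.6 | 58.4 | 59.0 | 55.4 |
| Terminal-Bench 2.1 | 60.7 | 75.3 | 83.4 | 69.7 | - | 58.7 | 66.0 | 72.0 |
| DeepSWE | 8.0 | 33.6 | 70 | 54 | 24 | 18 | - | 8 |
| SWE-Bench Verified | 74.4 | 80.8 | 82.9 | 87.6 | 80.2 | - | 80.5 | 80.6 |
| SWE Atlas QnA | 31.5 | 37.9 | 45.4 | 45.2 | - | - | 37.9 | - |
| SWE Atlas RF | 30.0 | 32.9 | 44.8 | 48.6 | - | - | - | - |
| SWE Atlas TW | 23.3 | 40.0 | 42.6 | 38.2 | - | - | 30.8 | - |
| General & Reasoning | | | | | | | | |
| GPQA Diamond | 82.6 | 90.7 | 93.6 | 94.2 | 90.5 | 86.2 | - | 90.1 |
| IFEval | 89.1 | 94.0 | - | - | 94.5 | 94.5 | - | 91.9 |
| Apex | 9.4 | 36.5 | - | - | 24.0 | 11.5 | - | 38.3 |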

Nex-N2 Benchmark Overview


How to Use

These GGUF files are compatible with llama.cpp and with popular tools built on it, such as LM Studio and Ollama.

Example using llama.cpp CLI:

./llama-cli -m nex-n2-pro-Q2_K-00001-of-00023.gguf \
  -p "Hello, how are you?" \
  -sys "You are a helpful AI" \
  -n 4096 \
  -c 8192
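
The file name in the command above ends in `-00001-of-00023`, which suggests the Q2_K weights are split into 23 shards. llama.cpp normally picks up the remaining shards on its own when given the first one, provided they all sit in the same directory. The following is a minimal sketch, not part of the original instructions, that lists any shard missing from the current directory before you launch. The helper names `shard_names` and `missing_shards` are hypothetical, and the prefix and shard count are assumptions taken from the example file name.

```shell
#!/bin/sh
# Print the expected shard file names for a split GGUF model.
# Assumes the naming pattern <prefix>-NNNNN-of-NNNNN.gguf seen in the example above.
shard_names() {
  prefix=$1   # e.g. nex-n2-pro-Q2_K (assumed from the example file name)
  total=$2    # e.g. 23
  i=1
  while [ "$i" -le "$total" ]; do
    printf '%s-%05d-of-%05d.gguf\n' "$prefix" "$i" "$total"
    i=$((i + 1))
  done
}

# Print the name of every expected shard that is not present in a directory.
missing_shards() {
  dir=$1
  prefix=$2
  total=$3
  shard_names "$prefix" "$total" | while IFS= read -r name; do
    [ -f "$dir/$name" ] || echo "$name"
  done
}

# Lists missing shards in the current directory; prints nothing if all are present.
missing_shards . nex-n2-pro-Q2_K 23
```

If the script prints nothing, every expected shard is in place and the `llama-cli` command can be pointed at the first file as shown.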

Model Details

Format: GGUF
Model size: 403B params
Architecture: qwen35moe
Available bit widths: 2-bit, 16-bit
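
For a rough sense of the storage and memory involved, the 403B parameter count and the 2-bit and 16-bit variants listed for this repository can be combined into a back-of-the-envelope estimate: parameters times bits per weight, divided by 8 bits per byte. The sketch below assumes a uniform bit width, which real GGUF quantizations do not use (they mix precisions and add metadata), so actual file sizes will differ. It also ignores the KV cache and runtime overhead. The helper name `estimate_gb` is hypothetical.

```shell
#!/bin/sh
# Nominal weight size in decimal GB: (billions of parameters) * (bits per weight) / 8.
# Integer arithmetic, so the result is truncated to a whole number of GB.
estimate_gb() {
  params_billions=$1
  bits=$2
  echo $(( params_billions * bits / 8 ))
}

echo "2-bit:  about $(estimate_gb 403 2) GB"
echo "16-bit: about $(estimate_gb 403 16) GB"
```

Under these assumptions the weights come to roughly 100 GB at 2 bits and 806 GB at 16 bits. Treat both figures as lower bounds.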
