Llama-3.2-1B-RYS-10-13

Llama-3.2-1B-Instruct with layers 10-12 duplicated. The late-stack reasoning circuit runs twice on every forward pass.

16 base layers โ†’ 19 after duplication. No training, no merging, no weight changes.

Reasoning 0.00% โ†’ 64.71% (+64.71). EQ 27.11 โ†’ 90.12 (+63.01). Math 0.536 โ†’ 0.7112 (+17.52). Peak reasoning ฮ” across the full 22-config sweep is +76.47%; (10,13) is the best-combined pick.

Results

Metric Baseline RYS (10,13) Delta
Math 0.536 0.7112 +17.52
EQ 27.11 90.12 +63.01
Reasoning 0.00% 64.71% +64.71

The reasoning unlock. Llama-3.2-1B-Instruct has the lowest baseline reasoning in the v2 corpus (0.00% โ€” the model fails every reasoning probe). Duplicating layers 10-12 lifts reasoning by 64.71 points at the best-combined config, and 76.47 points at the peak-reasoning config โ€” the most dramatic RYS result in the entire 21-model corpus. The mechanism is under-training scale: latent reasoning circuitry that did not reach reliable behavior during base training, surfaced by giving the activations a second pass through the same circuit.

All 22 swept configurations boost reasoning >5%. Pick this when you want a tiny model that thinks.

Usage

llama-server -m Llama-3.2-1B-RYS-10-13-Q4_K_M.gguf -ngl 99

Full sweep data

22 configurations tested. (10,13) block-3 is the best-combined pick. Full per-config sweep + cross-architecture analysis: v2 dataset.

Part of the RYS Sovereign Collection v2.


Where this sits in the Sovereign Collection

v1 โ€” Qwen2.5 cross-scale + Qwen3-32B headline crossover. 5 model repos: 0.5B EQ specialist / 1.5B daily driver / 7B math specialist (+ AWQ) / Qwen3-32B "Big Boy."

v2 โ€” cross-architecture corpus. 21 model variants across 10 architecture families. Inverse correlation (r = โˆ’0.726): weak baselines lift more, in their weakest dimension. Three distinct mechanisms identified: under-training scale (this model), MoE routing inefficiency (Granite-3.1-1B-A400M), specialization training trade-off (Qwen2.5-Coder-1.5B). Plus EQ-amplifier extreme (TinyLlama-1.1B) and a first published negative result (SmolLM2-1.7B). 13 deployable RYS-applied weight repos covering every non-zero-lift variant.

Within-family sibling: john-broadway/Llama-3.2-3B-RYS-21-24-GGUF โ€” the math-amplifier case at the high-baseline end. Together with this model, the Llama-3.2 family spans the entire baseline-vs-magnitude curve in the v2 corpus.

Credit

John Broadway, with collaboration from Claude (Opus 4.6 in April 2026 sweep generation and build pipeline; Opus 4.7 in May 2026 cross-architecture analysis and publication). Original RYS method by David Ng on Qwen2-72B; sweep + probe toolkit by alainnothere.

Downloads last month
184
GGUF
Model size
1B params
Architecture
llama
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for john-broadway/Llama-3.2-1B-RYS-10-13-GGUF

Quantized
(375)
this model

Collection including john-broadway/Llama-3.2-1B-RYS-10-13-GGUF