📟 Qwen3.6-35B-MoE-Imatrix-IQ3_M.gguf (2026 Edition)

"Local intelligence... to the max."

This is a custom-quantized version of Qwen3.6-35B-A3B, tuned to deliver the most capability per byte on consumer laptops and desktops with 24 GB or more of RAM.

🧠 Why this model is different

Unlike a standard quant, this model was quantized using a custom importance matrix (imatrix). The calibration data for the imatrix was hand-curated to preserve:

  • Reasoning: Custom coding examples built with frontier models are included in the calibration data to help retain specific, sharp architectural reasoning skills.
  • Logical Flow: llama.cpp source code, logic puzzles, and historical writing are included in the calibration data to help the model stay coherent at low bitrates.
  • High Speed: Built using llama.cpp specifically for local-first AI and edge computing setups such as Apple Silicon with at least 24 GB of RAM.
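
As a rough illustration of why calibration data matters, the toy sketch below picks a quantization scale by minimizing an importance-weighted squared error. It is not llama.cpp's IQ3_M algorithm: the integer grid, the candidate scales, and the importance values are all made up for the example, and the function names are hypothetical.

```python
# Toy illustration only: NOT the IQ3_M algorithm used by llama.cpp.
# It shows how per-weight importance can change which scale a quantizer prefers.

def quantize_block(weights, scale, levels=3):
    """Round each weight to the nearest multiple of `scale`, clamped to [-levels, levels] steps."""
    out = []
    for w in weights:
        q = max(-levels, min(levels, round(w / scale)))
        out.append(q * scale)
    return out

def weighted_error(weights, dequantized, importance):
    """Importance-weighted squared error: sum_i imp_i * (w_i - q_i)^2."""
    return sum(imp * (w - q) ** 2
               for w, q, imp in zip(weights, dequantized, importance))

def best_scale(weights, importance, candidates):
    """Return the candidate scale with the lowest importance-weighted error."""
    return min(
        candidates,
        key=lambda s: weighted_error(weights, quantize_block(weights, s), importance),
    )

# Hypothetical block: one large weight and several small ones.
weights = [0.5, 0.11, -0.12, 0.10, -0.09, 0.13]
candidates = [0.1, 0.25]

uniform = [1.0] * len(weights)             # no imatrix: every weight counts equally
skewed = [10.0, 1.0, 1.0, 1.0, 1.0, 1.0]   # assumed: the first weight matters most

print(best_scale(weights, uniform, candidates))  # fine scale, clips the large weight
print(best_scale(weights, skewed, candidates))   # coarse scale, keeps the large weight
```

With uniform importance the finer scale wins because it fits the many small weights; once the large weight is marked as important, the coarser scale that represents it exactly becomes the better choice.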

🛠 Quantization Details

  • Base Model: Qwen3.6-35B-A3B
  • Quantization: IQ3_M
  • Format: GGUF
  • Size: ~15.44 GB
  • Context Length: 262144 tokens
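
The file size and parameter count give a quick estimate of the average bits per weight. The sketch below assumes a nominal 35e9 parameters and shows both readings of "GB" (decimal and binary), since the card does not say which is meant; the result is an average over the whole file, including metadata and any tensors kept at higher precision.

```python
def bits_per_weight(file_size_bytes: float, n_params: float) -> float:
    """Average number of bits stored per parameter for a model file."""
    return file_size_bytes * 8 / n_params

size_gb = 15.44   # approximate size from the list above
n_params = 35e9   # assumption: nominal parameter count implied by "35B"

bpw_decimal = bits_per_weight(size_gb * 1e9, n_params)    # if GB means 10^9 bytes
bpw_binary = bits_per_weight(size_gb * 2**30, n_params)   # if GB means 2^30 bytes

print(f"{bpw_decimal:.2f} bits/weight (decimal GB)")
print(f"{bpw_binary:.2f} bits/weight (binary GiB)")
```

Either way the average lands between 3.5 and 3.8 bits per weight.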

📈 Perplexity Benchmarks

The following results were generated using llama-perplexity on the wikitext-2-raw/wiki.test.raw dataset.

| Model | Precision | Perplexity (PPL) | Δ PPL |
|---|---|---|---|
| Qwen3.6-35B-A3B (no imatrix) | IQ3_M | 8.3352 | - |
| Qwen3.6-35B-A3B (imatrix) | IQ3_M | 7.0862 | -1.249 |
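
A run like the one above can probably be approximated with a command along these lines. This is a minimal sketch: it assumes `llama-perplexity` is on your PATH and that the model and dataset sit at the paths shown, and it uses only the basic `-m` (model) and `-f` (text file) options. The exact settings behind the published numbers are not stated here, so results may differ.

```python
import shlex

def perplexity_command(model_path: str, text_path: str) -> list[str]:
    """Build an argv list for llama-perplexity with a model file and an evaluation text file."""
    return ["llama-perplexity", "-m", model_path, "-f", text_path]

# Paths are assumptions; adjust them to wherever the files actually live.
cmd = perplexity_command(
    "Qwen3.6-35B-MoE-Imatrix-IQ3_M.gguf",
    "wikitext-2-raw/wiki.test.raw",
)

# Print a copy-pasteable shell line. To execute it from Python instead,
# pass `cmd` to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```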

⚖️ Evaluation Verdict

The imatrix IQ3_M quant keeps the memory footprint of a 3-bit model while moving perplexity toward what 4-bit quants typically deliver. Perplexity (PPL) measures the model's "uncertainty" when predicting the next token, so lower is better. A delta of -1.249 (about 15% below the no-imatrix baseline) indicates that the imatrix mitigated much of the quantization noise that usually plagues sub-4-bit models.
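
For readers new to the metric, here is a small sketch of how perplexity follows from per-token log-probabilities, plus the arithmetic behind the delta in the table. The sample log-probabilities are made up for illustration; only the two PPL values come from the benchmark above.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over the evaluated tokens."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Made-up example: the model gave the correct next token probabilities 1/2, 1/4, 1/8.
sample = [math.log(0.5), math.log(0.25), math.log(0.125)]
print(perplexity(sample))  # geometric mean of 2, 4 and 8

# Values from the benchmark table.
ppl_baseline = 8.3352   # IQ3_M without imatrix
ppl_imatrix = 7.0862    # IQ3_M with imatrix

delta = ppl_imatrix - ppl_baseline
relative = delta / ppl_baseline
print(f"delta = {delta:.3f}, relative change = {relative:.1%}")
```

A perplexity of 4 in the toy case means the model was, on average, as uncertain as if it were choosing uniformly among four tokens.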

🚀 Hardware Performance (Apple M2)

Coming soon.

🌐 Links

Check out my other models!


24GB+ RAM

  • Qwen3.6-27B-SuperDense
  • Gemma4-31B-SuperDense

8GB+ RAM

  • Qwen3.5-9B-SuperDense
  • Qwen3.5-4B-SuperDense
  • Gemma4-4B-SuperDense
  • Gemma4-2B-SuperDense

4GB+ RAM

  • Smartchild


All make excellent companions to this model!

