Qwen3.5 0.8B SpectralQuant Q4_K_M

SpectralQuant Q4_K_M is a compact GGUF release of Qwen/Qwen3.5-0.8B, built with a new calibration-aware quantization approach. Instead of treating Q4 compression as simple local rounding, SpectralQuant shapes the quantized representation around behaviorally important directions, keeping the normal Q4_K_M footprint while preserving substantially more of the BF16 reference behavior.

A detailed technical blog post describing the method and research path is planned soon.

Highlights

  • 4.52 BPW fixed-footprint GGUF: 435,896,640 bytes / 415.7 MiB.
  • 96.5% heldout120 BF16-gap recovery versus llama.cpp pure Q4_K_M.
  • Lower prompt loss than tested Unsloth Q4_K_S, Q4_K_M, IQ4_NL, and IQ4_XS while using fewer bytes.
  • C4 validation improves over llama.cpp pure Q4_K_M at the same Q4_K_M footprint.
  • No FP-kept modules, no mixed-precision sidecar, and no larger dynamic quant format.

SpectralQuant release loss chart

Model Details

Item Value
Base model Qwen/Qwen3.5-0.8B
Format GGUF
Quantization SpectralQuant Q4_K_M
File qwen35-0.8b-spectralquant-calib360-q4_k_m.gguf
Size 435,896,640 bytes
SHA256 ae3c3e6dbb3d08c83d12e12c2f67bf63527f39e5090ce5c0eb12eacd5417f352
License Apache-2.0

This release is quantized from Qwen/Qwen3.5-0.8B. It is not quantized directly from Qwen/Qwen3.5-0.8B-Base; any Base -> Qwen3.5-0.8B -> Quantized lineage shown by the Hub reflects the upstream Qwen model relationship.

Method Overview

SpectralQuant is a calibration-aware Q4_K_M quantization approach. At a high level, it shapes quantization error around behaviorally important directions instead of treating every local rounding error equally. The goal is simple: keep the familiar Q4_K_M deployment footprint while retaining more of the model behavior users normally associate with larger quantizations.

A detailed technical blog post describing the method and research path is planned soon.

Evaluation

Lower loss is better. BPW is estimated from file size relative to llama.cpp pure Q4_K_M at 4.52 BPW. BF16 is included as a full-precision reference.

Model BPW est. Size MiB convergence60 ? heldout120 ?
BF16 reference 16.01 1446.5 2.2682 2.9809
Unsloth UD-Q4_K_XL 5.79 532.9 2.2833 2.9913
SpectralQuant Q4_K_M 4.52 415.7 2.2509 2.9961
Unsloth IQ4_NL 5.26 483.4 2.3289 3.0484
Unsloth Q4_K_M 5.52 507.8 2.3268 3.0510
Unsloth Q4_K_S 5.27 484.6 2.3126 3.0700
Unsloth IQ4_XS 5.11 469.8 2.3869 3.1061
llama.cpp pure Q4_K_M 4.52 415.7 2.7404 3.4135

BF16 Gap Recovery

Suite Pure Q4_K_M loss BF16 reference loss SpectralQuant loss Recovery vs BF16 gap
convergence60 2.740441 2.268226 2.250946 100.00%
heldout120 3.413494 2.980932 2.996070 96.50%

C4 Validation

Suite llama.cpp pure Q4_K_M SpectralQuant Q4_K_M Unsloth Q4_K_M
C4 validation 64x256 3.3014 3.2874 3.2574

Quickstart

llama-cli -m qwen35-0.8b-spectralquant-calib360-q4_k_m.gguf   -p "Explain quantization in two sentences."   -n 80
llama-server -m qwen35-0.8b-spectralquant-calib360-q4_k_m.gguf -c 4096

Notes

  • Release metrics are prompt-loss and C4 validation results from fixed evaluation suites.
  • The main claim is bounded to this release table and same-footprint Q4_K_M behavior.
  • Larger or dynamic quantizations can still win in some settings; evaluate on your workload before deployment.
  • The base model and referenced GGUF source list Apache-2.0 licensing.

Attribution

  • Base model: Qwen/Qwen3.5-0.8B
  • Reference GGUF source: unsloth/Qwen3.5-0.8B-GGUF
  • Quantization: SpectralQuant Q4_K_M
Downloads last month
-
GGUF
Model size
0.8B params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Spectral-Labs25/Qwen3.5-0.8B-SpectralQuant-Q4_K_M

Quantized
(157)
this model