This repository contains the model weights for an unofficial LQ8 quantization of Qwen3.5 2B.

LQ8 is an experimental quantization technique, still in early beta, designed to provide FP16-level quality with a memory footprint equal to or lower than that of Q8_0.

LQ8 is currently compatible with llama.cpp and Ollama out of the box. Please create a discussion if you find a bug.

| File Name | Quant Type | Bit Depth | Size | Download Link |
| --- | --- | --- | --- | --- |
| model-LQ8.gguf | LQ8 | ~8 bpw | 1.82 GB | 📥 Download LQ8 |
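
Before reporting a bug, it can help to rule out a truncated or corrupted download. The following is a minimal sketch, not part of the LQ8 tooling, that reads the fixed-size header at the start of a GGUF file. It assumes the little-endian GGUF version 2 or later layout (4-byte magic `GGUF`, a 32-bit version, then 64-bit tensor and metadata counts); the helper name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + two uint64 counts


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes the little-endian GGUF v2+ header layout. Raises ValueError
    if the file is too short or does not start with the GGUF magic.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<" = little-endian without padding, I = uint32, Q = uint64
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:HEADER_SIZE])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("model-LQ8.gguf")` on the downloaded file should return three integers without raising. A `ValueError` suggests the download is incomplete or is not a GGUF file at all. This only checks the header, so it does not validate the tensor data itself.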