Qwen3.5-0.8B-Instruct-NF4

An NF4 (4-bit) quantized version of Qwen/Qwen3.5-0.8B, intended for low-VRAM environments.

Quantization Details

  • Method: bitsandbytes NF4 (4-bit NormalFloat)
  • Compute dtype: float16
  • Double quantization: enabled
  • Coverage: includes vision encoder (unlike AWQ)
  • Model size: ~823 MB (original ~1.6 GB)
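
The settings listed above map onto the 4-bit options of `BitsAndBytesConfig` in the transformers library. The following is a minimal sketch of that mapping, not the exact script used to produce this checkpoint. The names `NF4_SETTINGS` and `build_quant_config` are hypothetical helpers introduced for illustration, and the code assumes `torch`, `transformers`, and `bitsandbytes` are installed when the function is called.

```python
# Settings taken from the "Quantization Details" list above.
# NF4_SETTINGS and build_quant_config are illustrative names, not part of any library.
NF4_SETTINGS = {
    "load_in_4bit": True,                 # 4-bit weights via bitsandbytes
    "bnb_4bit_quant_type": "nf4",         # 4-bit NormalFloat
    "bnb_4bit_compute_dtype": "float16",  # compute dtype, resolved to a torch dtype below
    "bnb_4bit_use_double_quant": True,    # double quantization enabled
}


def build_quant_config():
    """Return a BitsAndBytesConfig built from NF4_SETTINGS.

    Imports are deferred so the settings can be inspected without
    torch or transformers being installed.
    """
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(NF4_SETTINGS)
    # Convert the dtype name ("float16") into the torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

When quantizing the original checkpoint yourself, the resulting object is typically passed as the `quantization_config` argument of `from_pretrained`. Because this repository already stores quantized weights, that step may be unnecessary when loading it directly.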

License

Apache 2.0 — based on Qwen/Qwen3.5-0.8B by Alibaba Cloud.

Safetensors

  • Model size: 0.9B params
  • Tensor types: F32, BF16, U8