Agents-A1-IQ4_XS-imatrix-gguf-fable5-calibrated

An IQ4_XS GGUF quantization of InternScience/Agents-A1, built with an importance matrix (imatrix) calibrated on Fable-5 traces.

File

File                                                  Size       SHA-256
Agents-A1-IQ4_XS-imatrix-gguf-fable5-calibrated.gguf  17.44 GiB  67bb63a2f216f48ecbd8180510d5a930b4c7a053d195d1e798cc5fbeeb70cf18
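
The SHA-256 listed above can be checked after download. The following is a minimal sketch, assuming the file sits in the current directory and that either sha256sum (Linux) or shasum (macOS) is installed; the helper name verify_sha256 is illustrative and not part of this repo.

```shell
# Hypothetical helper: succeeds only if the file's SHA-256 matches the expected hex digest.
verify_sha256() {
  file=$1
  expected=$2
  if command -v sha256sum >/dev/null 2>&1; then
    actual=$(sha256sum "$file" | cut -d' ' -f1)
  else
    actual=$(shasum -a 256 "$file" | cut -d' ' -f1)
  fi
  [ "$actual" = "$expected" ]
}

MODEL=Agents-A1-IQ4_XS-imatrix-gguf-fable5-calibrated.gguf
EXPECTED=67bb63a2f216f48ecbd8180510d5a930b4c7a053d195d1e798cc5fbeeb70cf18

if [ -f "$MODEL" ]; then
  if verify_sha256 "$MODEL" "$EXPECTED"; then
    echo "checksum OK"
  else
    echo "checksum MISMATCH"
  fi
else
  echo "$MODEL not found; download it first"
fi
```

A mismatch usually means an incomplete or corrupted download, so re-downloading is the first thing to try.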

Quick Start

llama-cli -hf Chungulus/Agents-A1-IQ4_XS-imatrix-gguf-fable5-calibrated:Agents-A1-IQ4_XS-imatrix-gguf-fable5-calibrated.gguf -p "Write a Python sorting function" -n 160

Ollama

ollama create agents-a1-iq4_xs-imatrix -f Modelfile
ollama run agents-a1-iq4_xs-imatrix
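
The ollama create command above expects a file named Modelfile in the working directory, which this card does not show. A minimal sketch of one, assuming the GGUF has been downloaded into the same directory; the card specifies no chat template or sampling parameters, so none are set here.

```shell
# Write a minimal Modelfile that points Ollama at the local GGUF.
# Assumption: the .gguf file is in the current directory.
cat > Modelfile <<'EOF'
FROM ./Agents-A1-IQ4_XS-imatrix-gguf-fable5-calibrated.gguf
EOF
```

Run this before ollama create. If responses look malformed, a TEMPLATE or PARAMETER line matching the base model may need to be added.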

Which File Should I Download?

Use case              Recommendation
Recommended hardware  16-24 GB RAM
Best for              efficient compact 4-bit

Quality Snapshot

F16 baseline mini accuracy: 89.58%. F16 baseline PPL on KL holdout: 13.0194.

Metric            Value
Mini accuracy     87.50%
Retention vs F16  97.67%
Mean KLD vs F16   0.020185
Same top p        92.06%
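
The card does not define "Retention vs F16". A plausible reading, stated here as an assumption, is the quantized mini accuracy divided by the F16 baseline mini accuracy. The sketch below recomputes that ratio from the rounded percentages in this section.

```shell
# Assumption: retention = quantized accuracy / F16 accuracy, as a percentage.
f16_acc=89.58
quant_acc=87.50
retention=$(awk -v q="$quant_acc" -v b="$f16_acc" 'BEGIN { printf "%.2f", q / b * 100 }')
echo "Retention vs F16: ${retention}%"
```

From the rounded inputs this gives 97.68%, within 0.01 percentage points of the reported 97.67%. The small gap is consistent with the reported figure having been computed from unrounded accuracies.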

Notes

  • Calibration source: Glint-Research/Fable-5-traces
  • Calibration source revision: e05c417852fc59fd8da758e68b352732423ca0cb
  • GGUF quantization method: llama.cpp with imatrix calibration.
  • Static imatrix GGUF; not Unsloth Dynamic 2.0 / UD2.
  • MTP is not included because the downloaded checkpoint did not contain MTP tensors.
  • This repo contains local quantization artifacts only.
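
For readers who want to reproduce a static imatrix quant of this kind, the usual llama.cpp workflow is two steps: compute an importance matrix from calibration text, then pass it to the quantizer. The sketch below is an assumption about that general workflow, not the exact commands used for this repo. All file names are hypothetical, and turning the Fable-5 traces dataset into a plain-text calibration file is left out because the card does not describe it.

```shell
# Hypothetical file names; substitute your own paths.
F16_GGUF=Agents-A1-F16.gguf
CALIB_TXT=fable5-calibration.txt
IMATRIX=imatrix.dat
OUT_GGUF=Agents-A1-IQ4_XS.gguf

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Step 1: collect activation statistics over the calibration text.
run llama-imatrix -m "$F16_GGUF" -f "$CALIB_TXT" -o "$IMATRIX"

# Step 2: quantize to IQ4_XS, weighting quantization error by the imatrix.
run llama-quantize --imatrix "$IMATRIX" "$F16_GGUF" "$OUT_GGUF" IQ4_XS
```

By default the script only prints the two commands, which makes it safe to inspect before committing to a long run. Set RUN=1 once the paths are correct.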
  • Model size: 35B params.
  • Architecture: qwen35moe.