Quantization bundles
Integer-quantized weight bundles for several backbone LLMs, produced by
Quant_ADMM (private). Each leaf subdirectory holds one quantization config.
Each config directory contains:
- int_bundle.pt: packed integer weights (per-group S, Z + packed int4/int3 q)
- aux_state.pt: non-quantized state (RMSNorm weights, biases, embed,
  lm_head). Critical for AWQ-style methods that fold per-channel
  rescaling into surrounding modules.
- run_args.json: exact command-line config (bitwidth, method, ADMM
  hyperparams, etc.)
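
To make the role of the per-group S, Z and packed q concrete, here is a minimal pure-Python sketch of dequantization. It rests on two assumptions that are not confirmed by this README: that weights follow the common affine convention w ≈ S · (q − Z), and that two int4 codes are packed per byte with the low nibble first. The function names are hypothetical, and the actual tensor layout inside int_bundle.pt may differ.

```python
def unpack_int4(packed):
    """Split each byte into two 4-bit codes, low nibble first (assumed order)."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble
        codes.append(byte >> 4)    # high nibble
    return codes


def dequantize_row(codes, scales, zeros, groupsize):
    """Map integer codes back to floats using one (scale, zero) pair per group."""
    return [
        scales[i // groupsize] * (q - zeros[i // groupsize])
        for i, q in enumerate(codes)
    ]


# Toy example: four codes in two groups of size 2.
packed = bytes([0x21, 0x43])          # codes 1, 2, 3, 4
codes = unpack_int4(packed)
row = dequantize_row(codes, scales=[0.5, 2.0], zeros=[1, 3], groupsize=2)
```

With groupsize 128, as used by most configs here, each run of 128 consecutive input weights would share one scale and one zero point under this convention.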
Layout
QuantADMM-bundles/
    <backbone>/
        <backbone-lowercase>_<method>_<bits>bit_g<groupsize>_calib<dataset>_<extras>/
            int_bundle.pt
            aux_state.pt
            run_args.json
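
The config directory names listed under Contents carry the lowercased backbone name as a prefix, followed by method, bitwidth, group size, calibration set and optional extras. The sketch below splits such a name into fields. The function name and the returned keys are hypothetical, and names that do not follow the pattern (for example the `*_spinquantR_w4_external` entries) simply return `None`; run_args.json remains the authoritative record of a run.

```python
import re

# <method>_<bits>bit_g<groupsize>_calib<dataset>[_<extras>]
_CONFIG_RE = re.compile(
    r"(?P<method>.+)_(?P<wbits>\d+)bit_g(?P<groupsize>\d+)"
    r"_calib(?P<calib>[^_]+)(?:_(?P<extras>.+))?"
)


def parse_config_name(subdir, backbone):
    """Parse one config directory name; return None if it does not fit the pattern."""
    prefix = backbone.lower() + "_"
    if not subdir.startswith(prefix):
        return None
    match = _CONFIG_RE.fullmatch(subdir[len(prefix):])
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "method": fields["method"],
        "wbits": int(fields["wbits"]),
        "groupsize": int(fields["groupsize"]),
        "calib": fields["calib"],
        # Extras are kept as raw tokens, e.g. "qep", "damp1.0", "actorder".
        "extras": fields["extras"].split("_") if fields["extras"] else [],
    }


info = parse_config_name(
    "llama-2-13b-hf_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder",
    "Llama-2-13b-hf",
)
```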
Contents
Llama-2-13b-hf/
| subdir | method | wbits | groupsize | calib | extra |
|---|---|---|---|---|---|
| llama-2-13b-hf_admm_cd_4bit_g128_calibc4_noqep_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| llama-2-13b-hf_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| llama-2-13b-hf_awq_4bit_g128_calibc4_noqep | awq | 4 | 128 | c4 | |
| llama-2-13b-hf_babai_4bit_g128_calibc4_noqep_K5_actorder | babai | 4 | 128 | c4 | |
| llama-2-13b-hf_gptaq_4bit_g128_calibc4_noqep_actorder | gptaq | 4 | 128 | c4 | |
| llama-2-13b-hf_gptq_4bit_g128_calibc4_noqep_actorder | gptq | 4 | 128 | c4 | |
| llama-2-13b-hf_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | gptq | 4 | 128 | c4 | |
| llama-2-13b-hf_quarot_gptq_4bit_g128_calibc4_noqep_actorder | quarot_gptq | 4 | 128 | c4 | |
| llama-2-13b-hf_spinquantR_w4_external | spinquant_rotation_train | 4 | ? | ? | |
| llama-2-13b-hf_spinquant_gptq_4bit_g128_calibc4_noqep_actorder | spinquant_gptq | 4 | 128 | c4 | |
Llama-2-7b-hf/
| subdir | method | wbits | groupsize | calib | extra |
|---|---|---|---|---|---|
| llama-2-7b-hf_admm_cd_4bit_g128_calibc4_noqep_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| llama-2-7b-hf_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| llama-2-7b-hf_awq_4bit_g128_calibc4_noqep | awq | 4 | 128 | c4 | |
| llama-2-7b-hf_babai_4bit_g128_calibc4_noqep_K5_actorder | babai | 4 | 128 | c4 | |
| llama-2-7b-hf_gptaq_4bit_g128_calibc4_noqep_actorder | gptaq | 4 | 128 | c4 | |
| llama-2-7b-hf_gptq_4bit_g128_calibc4_noqep_actorder | gptq | 4 | 128 | c4 | |
| llama-2-7b-hf_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | gptq | 4 | 128 | c4 | |
| llama-2-7b-hf_quarot_gptq_4bit_g128_calibc4_noqep_actorder | quarot_gptq | 4 | 128 | c4 | |
| llama-2-7b-hf_spinquant_gptq_4bit_g128_calibc4_noqep_actorder | spinquant_gptq | 4 | 128 | c4 | |
Meta-Llama-3-8B/
| subdir | method | wbits | groupsize | calib | extra |
|---|---|---|---|---|---|
| meta-llama-3-8b_admm_cd_4bit_g128_calibc4_noqep_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| meta-llama-3-8b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| meta-llama-3-8b_awq_4bit_g128_calibc4_noqep | awq | 4 | 128 | c4 | |
| meta-llama-3-8b_babai_4bit_g128_calibc4_noqep_K5_actorder | babai | 4 | 128 | c4 | |
| meta-llama-3-8b_gptaq_4bit_g128_calibc4_noqep_actorder | gptaq | 4 | 128 | c4 | |
| meta-llama-3-8b_gptq_4bit_g128_calibc4_noqep_actorder | gptq | 4 | 128 | c4 | |
| meta-llama-3-8b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | gptq | 4 | 128 | c4 | |
| meta-llama-3-8b_quarot_gptq_4bit_g128_calibc4_noqep_actorder | quarot_gptq | 4 | 128 | c4 | |
| meta-llama-3-8b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder | spinquant_gptq | 4 | 128 | c4 | |
Qwen3-0.6B/
| subdir | method | wbits | groupsize | calib | extra |
|---|---|---|---|---|---|
| qwen3-0.6b_admm_cd_4bit_g128_calibc4_noqep_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| qwen3-0.6b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| qwen3-0.6b_awq_4bit_g128_calibc4_noqep | awq | 4 | 128 | c4 | |
| qwen3-0.6b_babai_4bit_g128_calibc4_noqep_K0_actorder | babai | 4 | 128 | c4 | |
| qwen3-0.6b_babai_4bit_g128_calibc4_noqep_K5_actorder | babai | 4 | 128 | c4 | |
| qwen3-0.6b_gptaq_4bit_g128_calibc4_noqep_actorder | gptaq | 4 | 128 | c4 | |
| qwen3-0.6b_gptq_4bit_g128_calibc4_noqep_actorder | gptq | 4 | 128 | c4 | |
| qwen3-0.6b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | gptq | 4 | 128 | c4 | |
| qwen3-0.6b_quarot_gptq_4bit_g128_calibc4_noqep_actorder | quarot_gptq | 4 | 128 | c4 | |
| qwen3-0.6b_spinquantR_w4_external | spinquant_rotation_train | 4 | ? | ? | |
| qwen3-0.6b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder | spinquant_gptq | 4 | 128 | c4 | |
Qwen3-4B/
| subdir | method | wbits | groupsize | calib | extra |
|---|---|---|---|---|---|
| qwen3-4b_admm_cd_4bit_g128_calibc4_noqep_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| qwen3-4b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| qwen3-4b_awq_4bit_g128_calibc4_noqep | awq | 4 | 128 | c4 | |
| qwen3-4b_babai_4bit_g128_calibc4_noqep_K5_actorder | babai | 4 | 128 | c4 | |
| qwen3-4b_gptaq_4bit_g128_calibc4_noqep_actorder | gptaq | 4 | 128 | c4 | |
| qwen3-4b_gptq_4bit_g128_calibc4_noqep_actorder | gptq | 4 | 128 | c4 | |
| qwen3-4b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | gptq | 4 | 128 | c4 | |
| qwen3-4b_quarot_gptq_4bit_g128_calibc4_noqep_actorder | quarot_gptq | 4 | 128 | c4 | |
| qwen3-4b_spinquantR_w4_external | spinquant_rotation_train | 4 | ? | ? | |
| qwen3-4b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder | spinquant_gptq | 4 | 128 | c4 | |
Qwen3-8B/
| subdir | method | wbits | groupsize | calib | extra |
|---|---|---|---|---|---|
| qwen3-8b_admm_cd_4bit_g128_calibc4_noqep_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| qwen3-8b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | admm_cd | 4 | 128 | c4 | lam=0.1, iters=200 |
| qwen3-8b_awq_4bit_g128_calibc4_noqep | awq | 4 | 128 | c4 | |
| qwen3-8b_babai_4bit_g128_calibc4_noqep_K5_actorder | babai | 4 | 128 | c4 | |
| qwen3-8b_gptaq_4bit_g128_calibc4_noqep_actorder | gptaq | 4 | 128 | c4 | |
| qwen3-8b_gptq_4bit_g128_calibc4_noqep_actorder | gptq | 4 | 128 | c4 | |
| qwen3-8b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder | gptq | 4 | 128 | c4 | |
| qwen3-8b_quarot_gptq_4bit_g128_calibc4_noqep_actorder | quarot_gptq | 4 | 128 | c4 | |
| qwen3-8b_spinquantR_w4_external | spinquant_rotation_train | 4 | ? | ? | |
| qwen3-8b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder | spinquant_gptq | 4 | 128 | c4 | |
How to load (in our repo)
    python main.py {base_model} c4 admm_chol \
        --wbits 4 --groupsize 128 \
        --save-model '' \
        --load-model <local-or-hf-snapshot-path>/<backbone>/<config>/int_bundle.pt
main.py automatically picks up the sibling aux_state.pt from the same
directory.
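
Outside the repo, the same sibling lookup can be mirrored before launching a run, to check that a downloaded config directory is complete. This is a sketch under stated assumptions, not main.py's actual logic: the helper names `resolve_bundle` and `load_run_args` are hypothetical, and nothing is assumed about the keys stored in run_args.json.

```python
import json
from pathlib import Path


def resolve_bundle(int_bundle_path):
    """Return the three files of one config directory, given the --load-model path."""
    bundle = Path(int_bundle_path)
    config_dir = bundle.parent
    files = {
        "int_bundle": bundle,
        "aux_state": config_dir / "aux_state.pt",   # sibling of int_bundle.pt
        "run_args": config_dir / "run_args.json",   # sibling of int_bundle.pt
    }
    missing = [name for name, path in files.items() if not path.is_file()]
    if missing:
        raise FileNotFoundError(f"{config_dir} is missing: {', '.join(missing)}")
    return files


def load_run_args(int_bundle_path):
    """Read the recorded command-line config that sits next to the bundle."""
    files = resolve_bundle(int_bundle_path)
    with open(files["run_args"]) as fh:
        return json.load(fh)
```

Failing early on a missing aux_state.pt is a deliberate choice here, since the non-quantized state is described above as critical for AWQ-style configs.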
Method legend
- gptq: sequential per-column rounding with Hessian-aware error compensation
- admm_chol: math-clean ADMM for per-column integer least squares (our method)
- awq: activation-aware per-channel rescaling + RTN fake-quantize
- awq_gptq: AWQ rescaling, then GPTQ as inner quantizer
- awq_admm_chol: AWQ rescaling, then ADMM-chol as inner quantizer (our method)
- quip: QuIP (random rotation + LDLQ) in the no-group setting
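
For reference, the RTN (round-to-nearest) fake-quantize step named in the awq entry can be sketched in a few lines of pure Python. This is a generic per-group asymmetric min/max quantizer written for illustration, not the code used to produce these bundles; the function name, the min/max range choice and the use of Python's built-in `round` are all assumptions.

```python
def rtn_fake_quantize(weights, wbits=4, groupsize=128):
    """Quantize each group to wbits integers, then map back to floats."""
    qmax = 2 ** wbits - 1
    out = []
    for start in range(0, len(weights), groupsize):
        group = weights[start:start + groupsize]
        lo, hi = min(group), max(group)
        # Fall back to scale 1.0 for constant groups to avoid dividing by zero.
        scale = (hi - lo) / qmax or 1.0
        zero = round(-lo / scale)
        for w in group:
            q = min(max(round(w / scale) + zero, 0), qmax)  # clamp to [0, qmax]
            out.append(scale * (q - zero))
    return out


approx = rtn_fake_quantize([0.0, 1.4, 2.0, 3.0], wbits=2, groupsize=4)
```

Unlike gptq, this rounds every weight independently and applies no error compensation across columns.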