Quantization bundles

Integer-quantized weight bundles for several backbone LLMs, produced by Quant_ADMM (private). Each leaf subdirectory holds one configuration.

Each config directory contains:

int_bundle.pt: packed integer weights (per-group scales S and zero points Z, plus the packed int4/int3 codes q)
aux_state.pt: non-quantized state (RMSNorm weights, biases, embeddings, lm_head). Critical for AWQ-style methods, which fold per-channel rescaling into the surrounding modules.
run_args.json: the exact command-line configuration (bit width, method, ADMM hyperparameters, etc.)
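
To make the per-group S, Z and packed q description concrete, here is a minimal pure-Python sketch of group-wise affine dequantization. The actual tensor names and packing layout inside int_bundle.pt are not documented here, so the nibble order (two 4-bit codes per byte, low nibble first) and the helper names `unpack_int4` and `dequantize` are assumptions for illustration only.

```python
def unpack_int4(packed: bytes) -> list:
    """Split each byte into two 4-bit codes, low nibble first (assumed order)."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)  # low nibble
        codes.append(byte >> 4)    # high nibble
    return codes


def dequantize(codes, scales, zeros, groupsize):
    """Group-wise affine dequantization: w[i] = S[g] * (q[i] - Z[g]), g = i // groupsize."""
    return [
        scales[i // groupsize] * (q - zeros[i // groupsize])
        for i, q in enumerate(codes)
    ]


# Toy example: four codes in two bytes, two groups of two weights each.
packed = bytes([0x21, 0x43])  # unpacks to the codes 1, 2, 3, 4
codes = unpack_int4(packed)
weights = dequantize(codes, scales=[0.5, 2.0], zeros=[1, 3], groupsize=2)
```

With groupsize 128, as in the configs listed here, each run of 128 consecutive weights would share one scale and one zero point in the same way.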

Layout

QuantADMM-bundles/
  <backbone>/
    <backbone-lowercase>_<method>_<bits>bit_g<groupsize>_calib<dataset>_<extras>/
      int_bundle.pt
      aux_state.pt
      run_args.json
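
The directory names in the per-backbone tables start with the lowercased backbone name, followed by the method, bit width, group size, calibration set and optional extras. A rough sketch of how such a name could be split into fields follows; the regular expression is inferred from the names in this README and `parse_config_name` is a hypothetical helper, not part of the repo. Names that do not fit the pattern (such as the `*_spinquantR_w4_external` directories) yield `None`.

```python
import re

# Assumed pattern, inferred from the directory names listed in this README.
_CONFIG_RE = re.compile(
    r"^(?P<method>.+)_(?P<bits>\d+)bit_g(?P<groupsize>\d+)"
    r"_calib(?P<calib>[^_]+)(?:_(?P<extras>.+))?$"
)


def parse_config_name(backbone: str, name: str):
    """Return the fields encoded in a config directory name, or None if it does not fit."""
    prefix = backbone.lower() + "_"
    if not name.startswith(prefix):
        return None
    match = _CONFIG_RE.match(name[len(prefix):])
    if match is None:
        return None  # e.g. the *_spinquantR_w4_external directories
    fields = match.groupdict()
    fields["bits"] = int(fields["bits"])
    fields["groupsize"] = int(fields["groupsize"])
    # Extras are underscore-separated flags such as "qep" or "damp1.0".
    fields["extras"] = fields["extras"].split("_") if fields["extras"] else []
    return fields


info = parse_config_name(
    "Qwen3-8B", "qwen3-8b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder"
)
```

run_args.json remains the authoritative record of a config; the directory name is only a convenient summary.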

`Llama-2-13b-hf/`

subdir	method	wbits	groupsize	calib	extra
`llama-2-13b-hf_admm_cd_4bit_g128_calibc4_noqep_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`llama-2-13b-hf_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`llama-2-13b-hf_awq_4bit_g128_calibc4_noqep`	`awq`	4	128	c4
`llama-2-13b-hf_babai_4bit_g128_calibc4_noqep_K5_actorder`	`babai`	4	128	c4
`llama-2-13b-hf_gptaq_4bit_g128_calibc4_noqep_actorder`	`gptaq`	4	128	c4
`llama-2-13b-hf_gptq_4bit_g128_calibc4_noqep_actorder`	`gptq`	4	128	c4
`llama-2-13b-hf_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`gptq`	4	128	c4
`llama-2-13b-hf_quarot_gptq_4bit_g128_calibc4_noqep_actorder`	`quarot_gptq`	4	128	c4
`llama-2-13b-hf_spinquantR_w4_external`	`spinquant_rotation_train`	4	?	?
`llama-2-13b-hf_spinquant_gptq_4bit_g128_calibc4_noqep_actorder`	`spinquant_gptq`	4	128	c4

`Llama-2-7b-hf/`

subdir	method	wbits	groupsize	calib	extra
`llama-2-7b-hf_admm_cd_4bit_g128_calibc4_noqep_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`llama-2-7b-hf_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`llama-2-7b-hf_awq_4bit_g128_calibc4_noqep`	`awq`	4	128	c4
`llama-2-7b-hf_babai_4bit_g128_calibc4_noqep_K5_actorder`	`babai`	4	128	c4
`llama-2-7b-hf_gptaq_4bit_g128_calibc4_noqep_actorder`	`gptaq`	4	128	c4
`llama-2-7b-hf_gptq_4bit_g128_calibc4_noqep_actorder`	`gptq`	4	128	c4
`llama-2-7b-hf_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`gptq`	4	128	c4
`llama-2-7b-hf_quarot_gptq_4bit_g128_calibc4_noqep_actorder`	`quarot_gptq`	4	128	c4
`llama-2-7b-hf_spinquant_gptq_4bit_g128_calibc4_noqep_actorder`	`spinquant_gptq`	4	128	c4

`Meta-Llama-3-8B/`

subdir	method	wbits	groupsize	calib	extra
`meta-llama-3-8b_admm_cd_4bit_g128_calibc4_noqep_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`meta-llama-3-8b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`meta-llama-3-8b_awq_4bit_g128_calibc4_noqep`	`awq`	4	128	c4
`meta-llama-3-8b_babai_4bit_g128_calibc4_noqep_K5_actorder`	`babai`	4	128	c4
`meta-llama-3-8b_gptaq_4bit_g128_calibc4_noqep_actorder`	`gptaq`	4	128	c4
`meta-llama-3-8b_gptq_4bit_g128_calibc4_noqep_actorder`	`gptq`	4	128	c4
`meta-llama-3-8b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`gptq`	4	128	c4
`meta-llama-3-8b_quarot_gptq_4bit_g128_calibc4_noqep_actorder`	`quarot_gptq`	4	128	c4
`meta-llama-3-8b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder`	`spinquant_gptq`	4	128	c4

`Qwen3-0.6B/`

subdir	method	wbits	groupsize	calib	extra
`qwen3-0.6b_admm_cd_4bit_g128_calibc4_noqep_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`qwen3-0.6b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`qwen3-0.6b_awq_4bit_g128_calibc4_noqep`	`awq`	4	128	c4
`qwen3-0.6b_babai_4bit_g128_calibc4_noqep_K0_actorder`	`babai`	4	128	c4
`qwen3-0.6b_babai_4bit_g128_calibc4_noqep_K5_actorder`	`babai`	4	128	c4
`qwen3-0.6b_gptaq_4bit_g128_calibc4_noqep_actorder`	`gptaq`	4	128	c4
`qwen3-0.6b_gptq_4bit_g128_calibc4_noqep_actorder`	`gptq`	4	128	c4
`qwen3-0.6b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`gptq`	4	128	c4
`qwen3-0.6b_quarot_gptq_4bit_g128_calibc4_noqep_actorder`	`quarot_gptq`	4	128	c4
`qwen3-0.6b_spinquantR_w4_external`	`spinquant_rotation_train`	4	?	?
`qwen3-0.6b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder`	`spinquant_gptq`	4	128	c4

`Qwen3-4B/`

subdir	method	wbits	groupsize	calib	extra
`qwen3-4b_admm_cd_4bit_g128_calibc4_noqep_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`qwen3-4b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`qwen3-4b_awq_4bit_g128_calibc4_noqep`	`awq`	4	128	c4
`qwen3-4b_babai_4bit_g128_calibc4_noqep_K5_actorder`	`babai`	4	128	c4
`qwen3-4b_gptaq_4bit_g128_calibc4_noqep_actorder`	`gptaq`	4	128	c4
`qwen3-4b_gptq_4bit_g128_calibc4_noqep_actorder`	`gptq`	4	128	c4
`qwen3-4b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`gptq`	4	128	c4
`qwen3-4b_quarot_gptq_4bit_g128_calibc4_noqep_actorder`	`quarot_gptq`	4	128	c4
`qwen3-4b_spinquantR_w4_external`	`spinquant_rotation_train`	4	?	?
`qwen3-4b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder`	`spinquant_gptq`	4	128	c4

`Qwen3-8B/`

subdir	method	wbits	groupsize	calib	extra
`qwen3-8b_admm_cd_4bit_g128_calibc4_noqep_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`qwen3-8b_admm_cd_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`admm_cd`	4	128	c4	lam=0.1, iters=200
`qwen3-8b_awq_4bit_g128_calibc4_noqep`	`awq`	4	128	c4
`qwen3-8b_babai_4bit_g128_calibc4_noqep_K5_actorder`	`babai`	4	128	c4
`qwen3-8b_gptaq_4bit_g128_calibc4_noqep_actorder`	`gptaq`	4	128	c4
`qwen3-8b_gptq_4bit_g128_calibc4_noqep_actorder`	`gptq`	4	128	c4
`qwen3-8b_gptq_4bit_g128_calibc4_qep_damp1.0_corr0.5_actorder`	`gptq`	4	128	c4
`qwen3-8b_quarot_gptq_4bit_g128_calibc4_noqep_actorder`	`quarot_gptq`	4	128	c4
`qwen3-8b_spinquantR_w4_external`	`spinquant_rotation_train`	4	?	?
`qwen3-8b_spinquant_gptq_4bit_g128_calibc4_noqep_actorder`	`spinquant_gptq`	4	128	c4

How to load (in our repo)

python main.py {base_model} c4 admm_chol \
    --wbits 4 --groupsize 128 \
    --save-model '' \
    --load-model <local-or-hf-snapshot-path>/<backbone>/<config>/int_bundle.pt

main.py automatically picks up the sibling aux_state.pt from the same directory.
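
Outside the repo, the same sibling lookup can be approximated with the standard library. This is only a sketch of the idea, assuming the three file names listed above; `sibling_files` is a hypothetical helper and does not reflect how main.py is actually implemented.

```python
from pathlib import Path


def sibling_files(bundle_path):
    """Map each expected file name to its path next to int_bundle.pt, or None if absent."""
    config_dir = Path(bundle_path).parent
    found = {}
    for name in ("int_bundle.pt", "aux_state.pt", "run_args.json"):
        candidate = config_dir / name
        found[name] = candidate if candidate.is_file() else None
    return found
```

Checking for a `None` entry before loading gives an early, explicit error when a config directory was only partially downloaded.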

Method legend

gptq: sequential per-column rounding with Hessian-aware error compensation
admm_chol: math-clean ADMM for per-column integer least squares (our method)
awq: activation-aware per-channel rescaling + RTN fake-quantize
awq_gptq: AWQ rescaling, then GPTQ as inner quantizer
awq_admm_chol: AWQ rescaling, then ADMM-chol as inner quantizer (our method)
quip: QuIP (random rotation + LDLQ), run without weight grouping
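
For readers unfamiliar with the "RTN fake-quantize" step named in the awq entry, the following is a generic sketch of asymmetric min-max round-to-nearest over a single group of weights. It illustrates the general technique only, under the assumption of a min-max grid; it is not taken from the repo, and `rtn_fake_quantize` is a hypothetical name.

```python
def rtn_fake_quantize(weights, bits=4):
    """Asymmetric min-max round-to-nearest over one group; returns dequantized values."""
    qmax = 2 ** bits - 1
    lo, hi = min(weights), max(weights)
    # Guard against a constant group, where the range would be zero.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero = round(-lo / scale)
    out = []
    for w in weights:
        # Round to the nearest grid point, then clamp to the representable range.
        q = min(max(round(w / scale) + zero, 0), qmax)
        out.append(scale * (q - zero))
    return out


# 2-bit toy example: the grid is {-1.0, 0.0, 1.0, 2.0}.
approx = rtn_fake_quantize([-1.0, 0.2, 2.0], bits=2)
```

"Fake" here means the result is stored back as floating-point values on the quantization grid rather than as packed integers.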
