Image-Text-to-Text
Transformers
Safetensors
English
qwen3_5
qwen3.5
reasoning
quantized
auto_round
conversational
4-bit precision
auto-round

Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled INT4 (AutoRound)

INT4 quantized version of:

Model Quantization Details

Generate the Model

Generated with Intel AutoRound using the distillation source datasets:

auto-round --model_name Jackrong/Qwopus3.5-27B-v3-int4-AutoRound \
--bits 4 --iters 500 --nsamples 512 --enable_torch_compile \
--output_dir Qwopus3.5-27B-v3-int4-AutoRound

How to Use

vLLM

vllm serve salbeal/Qwopus3.5-27B-v3-int4-AutoRound \
    --trust-remote-code \
    --dtype bfloat16

Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "salbeal/Qwopus3.5-27B-v3-int4-AutoRound"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
        trust_remote_code=True,
)

Limitations

  • This model can generate incorrect or misleading outputs.
  • Outputs may include biased or unsafe content depending on prompt and use case.
  • Evaluate thoroughly for safety and task quality before production use.

Acknowledgements

License

Please follow the license terms of the original source model.

Citation

If you use this quantized model, please cite both the original distilled model by Jackrong and AutoRound:

@misc{jackrong_qwen35_27b_v3
    title        = {Jackrong/Qwopus3.5-27B-v3},
    author       = {Jackrong},
    year         = {2026},
    publisher    = {Hugging Face},
    howpublished = {\url{https://huggingface.co/Jackrong/Qwopus3.5-27B-v3}}
}
@article{cheng2023optimize,
    title={Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs},
    author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
    journal={arXiv preprint arXiv:2309.05516},
    year={2023}
}
Downloads last month
2
Safetensors
Model size
7B params
Tensor type
I32
·
BF16
·
F16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for salbeal/Qwopus3.5-27B-v3-int4-AutoRound

Base model

Qwen/Qwen3.5-27B
Quantized
(29)
this model

Datasets used to train salbeal/Qwopus3.5-27B-v3-int4-AutoRound

Paper for salbeal/Qwopus3.5-27B-v3-int4-AutoRound