Instructions for using salbeal/Qwopus3.5-27B-v3-int4-AutoRound with libraries and local apps.

Libraries

How to use salbeal/Qwopus3.5-27B-v3-int4-AutoRound with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="salbeal/Qwopus3.5-27B-v3-int4-AutoRound")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("salbeal/Qwopus3.5-27B-v3-int4-AutoRound")
model = AutoModelForMultimodalLM.from_pretrained("salbeal/Qwopus3.5-27B-v3-int4-AutoRound")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use salbeal/Qwopus3.5-27B-v3-int4-AutoRound with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "salbeal/Qwopus3.5-27B-v3-int4-AutoRound"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "salbeal/Qwopus3.5-27B-v3-int4-AutoRound",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/salbeal/Qwopus3.5-27B-v3-int4-AutoRound

SGLang

How to use salbeal/Qwopus3.5-27B-v3-int4-AutoRound with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "salbeal/Qwopus3.5-27B-v3-int4-AutoRound" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "salbeal/Qwopus3.5-27B-v3-int4-AutoRound",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "salbeal/Qwopus3.5-27B-v3-int4-AutoRound" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use salbeal/Qwopus3.5-27B-v3-int4-AutoRound with Docker Model Runner:
```
docker model run hf.co/salbeal/Qwopus3.5-27B-v3-int4-AutoRound
```

Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled INT4 (AutoRound)

INT4 quantized version of:

Jackrong/Qwopus3.5-27B-v3

Model Quantization Details

Quantization tool: Intel AutoRound
Quantization type: INT4
Group size: 512
Symmetric quantization: Yes
Iterations: 500
Calibration data: NeelNanda/pile-10k
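
To make the settings above concrete, here is a minimal sketch of symmetric INT4 group quantization in pure Python. It uses plain round-to-nearest, not AutoRound's signed-gradient-descent rounding, so it illustrates the storage format rather than the tool's actual algorithm. The function names, the choice of `max_abs / 7` as the scale, and the assumption of one 16-bit scale per group are illustrative and not taken from this model card.

```python
from typing import List, Tuple

INT4_MIN, INT4_MAX = -8, 7  # range of a signed 4-bit integer


def quantize_group(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric round-to-nearest quantization of one group (no zero point)."""
    max_abs = max(abs(w) for w in weights)
    # Assumption: map the largest magnitude in the group onto code 7.
    scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
    codes = [min(INT4_MAX, max(INT4_MIN, round(w / scale))) for w in weights]
    return codes, scale


def quantize(weights: List[float], group_size: int) -> List[Tuple[List[int], float]]:
    """Split the weights into consecutive groups, each with its own scale."""
    return [
        quantize_group(weights[start:start + group_size])
        for start in range(0, len(weights), group_size)
    ]


def dequantize(groups: List[Tuple[List[int], float]]) -> List[float]:
    """Recover approximate weights as code * scale."""
    return [code * scale for codes, scale in groups for code in codes]


def effective_bits_per_weight(bits: int = 4, group_size: int = 512,
                              scale_bits: int = 16) -> float:
    """Storage cost per weight, assuming one scale of `scale_bits` per group."""
    return bits + scale_bits / group_size


# Toy example with a tiny group size so that two groups get different scales.
weights = [0.7, -0.35, 0.1, 0.0, -0.05, 0.02, 0.03, -0.01]
groups = quantize(weights, group_size=4)
restored = dequantize(groups)
```

Under these assumptions, a group size of 512 costs 4 + 16/512 = 4.03125 bits per weight. Larger groups lower the scale overhead, but every weight in a group shares one scale, so a single outlier coarsens the resolution for the whole group.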

Generate the Model

Generated with Intel AutoRound, calibrated on NeelNanda/pile-10k:

auto-round --model_name Jackrong/Qwopus3.5-27B-v3 \
    --bits 4 --iters 500 --nsamples 512 --enable_torch_compile \
    --output_dir Qwopus3.5-27B-v3-int4-AutoRound

How to Use

vLLM

vllm serve salbeal/Qwopus3.5-27B-v3-int4-AutoRound \
    --trust-remote-code \
    --dtype bfloat16
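
Once the server is running, it can be queried over its OpenAI-compatible chat completions endpoint. The following is a minimal sketch using only the Python standard library. The helper names `build_chat_payload` and `chat` are hypothetical, the default base URL assumes vLLM's port 8000 as in the curl example earlier on this page, and the `max_tokens` value is an arbitrary illustration.

```python
import json
import urllib.request
from typing import Optional

MODEL_ID = "salbeal/Qwopus3.5-27B-v3-int4-AutoRound"


def build_chat_payload(prompt: str, image_url: Optional[str] = None,
                       max_tokens: int = 128) -> dict:
    """Build a chat completions request body with optional image input."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
    }


def chat(payload: dict, base_url: str = "http://localhost:8000") -> str:
    """POST the payload to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

For example, `print(chat(build_chat_payload("Summarize INT4 quantization in one sentence.")))` sends a text-only request. For an SGLang server started as shown earlier, pass `base_url="http://localhost:30000"` instead.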

Transformers

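import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "salbeal/Qwopus3.5-27B-v3-int4-AutoRound"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Build a chat prompt, generate a reply, and decode only the new tokens.
messages = [{"role": "user", "content": "Explain INT4 quantization in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))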

Limitations

This model can generate incorrect or misleading outputs.
Outputs may include biased or unsafe content depending on prompt and use case.
Evaluate thoroughly for safety and task quality before production use.

Acknowledgements

Original distilled model author: Jackrong
Quantization tooling: Intel AutoRound

License

Please follow the license terms of the original source model.

Citation

If you use this quantized model, please cite both the original distilled model by Jackrong and AutoRound:

@misc{jackrong_qwen35_27b_v3,
    title        = {Jackrong/Qwopus3.5-27B-v3},
    author       = {Jackrong},
    year         = {2026},
    publisher    = {Hugging Face},
    howpublished = {\url{https://huggingface.co/Jackrong/Qwopus3.5-27B-v3}}
}

@article{cheng2023optimize,
    title={Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs},
    author={Cheng, Wenhua and Zhang, Weiwei and Shen, Haihao and Cai, Yiyang and He, Xin and Lv, Kaokao and Liu, Yi},
    journal={arXiv preprint arXiv:2309.05516},
    year={2023}
}