Qwen3.5-397B-A17B-EXL3
Pick the quant with the lowest KL divergence and perplexity (PPL) that fits your hardware.
| Quant | GiB | GB | bpw | PPL | KL(q‖o) | KL(o‖q) | Top-1 | Top-2 | Top-3 | Top-4 | Top-5 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MikeRoz 2.0bpw | 97 | 104 | 2.00 | 5.072 | 0.5160 | 0.8210 | 76.1% | 41.3% | 18.6% | 7.5% | 2.9% |
| MikeRoz 2.08bpw | 100 | 107 | 2.08 | 3.386 | 0.1210 | 0.1630 | 89.3% | 62.6% | 38.3% | 21.6% | 11.7% |
| cpral 2.20bpw | 104 | 112 | 2.20 | 3.381 | 0.1198 | 0.1591 | 89.4% | 62.8% | 38.5% | 21.6% | 11.6% |
| cpral 2.36bpw | 113 | 121 | 2.36 | 3.260 | 0.0819 | 0.1054 | 91.6% | 68.1% | 44.6% | 27.1% | 15.7% |
| cpral 2.64bpw | 126 | 135 | 2.64 | 3.139 | 0.0429 | 0.0490 | 94.1% | 75.5% | 54.3% | 36.5% | 23.4% |
| cpral 2.93bpw | 139 | 149 | 2.93 | 3.117 | 0.0319 | 0.0349 | 94.8% | 78.1% | 58.3% | 40.6% | 27.0% |
| NeuroSenko 3.0bpw | 142 | 153 | 3.00 | 3.220 | 0.0674 | 0.0776 | 91.9% | 68.4% | 44.5% | 26.6% | 14.8% |
| NeuroSenko 3.03bpw | 143 | 154 | 3.03 | 3.173 | 0.0474 | 0.0531 | 93.5% | 73.4% | 51.1% | 32.9% | 20.1% |
| cpral 3.11bpw | 147 | 158 | 3.11 | 3.114 | 0.0270 | 0.0296 | 95.3% | 79.8% | 60.7% | 43.3% | 29.6% |
| cpral 3.29bpw | 156 | 167 | 3.29 | 3.089 | 0.0200 | 0.0213 | 96.0% | 82.1% | 64.3% | 47.4% | 33.5% |
| cpral 3.45bpw | 163 | 175 | 3.45 | 3.081 | 0.0159 | 0.0166 | 96.4% | 83.7% | 67.3% | 51.2% | 37.3% |
| mratsim 3.47bpw | 164 | 175 | 3.47 | 3.096 | 0.0203 | 0.0216 | 96.0% | 82.2% | 64.7% | 48.1% | 34.1% |
| cpral 3.53bpw | 167 | 179 | 3.53 | 3.075 | 0.0134 | 0.0139 | 96.7% | 84.9% | 69.3% | 53.5% | 39.8% |
| cpral 3.54bpw (this repo) | 167 | 179 | 3.54 | 3.116 | 0.0286 | 0.0313 | 95.0% | 78.7% | 59.2% | 41.7% | 28.0% |
| cpral 3.57bpw | 169 | 181 | 3.57 | 3.072 | 0.0127 | 0.0130 | 96.7% | 85.2% | 69.8% | 54.2% | 40.4% |
| cpral 3.68bpw | 173 | 186 | 3.68 | 3.069 | 0.0120 | 0.0122 | 96.9% | 85.7% | 70.6% | 55.1% | 41.3% |
| NeuroSenko 4.0bpw | 188 | 202 | 4.00 | 3.101 | 0.0203 | 0.0210 | 95.7% | 81.0% | 62.3% | 44.7% | 30.5% |
| NeuroSenko 4.03bpw | 189 | 203 | 4.03 | 3.082 | 0.0149 | 0.0153 | 96.3% | 83.9% | 67.2% | 50.7% | 36.6% |
| cpral 4.61bpw | 216 | 232 | 4.61 | 3.059 | 0.0054 | 0.0054 | 97.8% | 90.0% | 78.4% | 65.4% | 52.6% |
| NeuroSenko 5.0bpw | 234 | 252 | 5.00 | 3.067 | 0.0079 | 0.0079 | 97.3% | 87.6% | 73.9% | 59.0% | 45.3% |
| mratsim 8.0bpw | 385 | 400 | 8.00 | 3.055 | 0.0025 | 0.0026 | 98.6% | 93.3% | 85.0% | 75.1% | 64.7% |
| original (bf16) | 752 | 807 | 16.00 | 3.053 | - | - | - | - | - | - | - |
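
One way to apply the "lowest KL and PPL that fits" advice is to filter the table by a memory budget and sort what remains. The sketch below is only an illustration: the `Quant` type, the `pick_quant` helper, and the 16 GiB default headroom for KV cache and activations are assumptions, not part of this repo. The rows are a subset copied from the table, and the KL value used is the first KL column.

```python
from typing import NamedTuple, Optional, Sequence


class Quant(NamedTuple):
    name: str
    gib: float     # weight size in GiB, from the table
    ppl: float     # perplexity, from the table
    kl_q_o: float  # first KL column of the table


# A subset of rows copied from the table; extend as needed.
QUANTS = [
    Quant("MikeRoz 2.08bpw", 100, 3.386, 0.1210),
    Quant("cpral 2.64bpw", 126, 3.139, 0.0429),
    Quant("cpral 3.11bpw", 147, 3.114, 0.0270),
    Quant("cpral 3.53bpw", 167, 3.075, 0.0134),
    Quant("cpral 3.54bpw (this repo)", 167, 3.116, 0.0286),
    Quant("NeuroSenko 4.0bpw", 188, 3.101, 0.0203),
    Quant("cpral 4.61bpw", 216, 3.059, 0.0054),
]


def pick_quant(
    vram_gib: float,
    headroom_gib: float = 16.0,  # assumed reserve for cache/activations
    quants: Sequence[Quant] = QUANTS,
) -> Optional[Quant]:
    """Return the fitting quant with the lowest KL, using PPL as a tie-breaker."""
    budget = vram_gib - headroom_gib
    fitting = [q for q in quants if q.gib <= budget]
    if not fitting:
        return None
    return min(fitting, key=lambda q: (q.kl_q_o, q.ppl))


if __name__ == "__main__":
    for vram in (96, 144, 192, 256):
        best = pick_quant(vram)
        print(f"{vram} GiB -> {best.name if best else 'nothing fits'}")
```

Note that a higher bpw does not always mean a better quant: in the table, some larger quants show a higher KL than smaller ones, which is why the helper sorts by the measured KL and not by size. Adjust the headroom to your context length and cache settings.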