Intel
/

Qwen3.5-35B-A3B-int4-AutoRound

4-bit precision

Model card Files Files and versions

wenhuach commited on Feb 28

Commit

7cdbcbb

·

verified ·

1 Parent(s): 1fe3137

Update README.md

Files changed (1) hide show

README.md +9 -0

README.md CHANGED Viewed

@@ -8,6 +8,15 @@ This model is a  int4 model with group_size 128 of [Qwen/Qwen3.5-35B-A3B](https:
 ## vllm Infernece Example
 ~~~bash
 vllm serve Intel/Qwen3.5-35B-A3B-int4-AutoRound  --port 8000   --tensor-parallel-size 1  --max-model-len 2048 --reasoning-parser qwen3 --served-model-name qwen
 ~~~

 ## vllm Infernece Example
+~~~bash
+pip install git+https://github.com/vllm-project/vllm.git@main
+pip install git+https://github.com/huggingface/transformers.git
+~~~
 ~~~bash
 vllm serve Intel/Qwen3.5-35B-A3B-int4-AutoRound  --port 8000   --tensor-parallel-size 1  --max-model-len 2048 --reasoning-parser qwen3 --served-model-name qwen
 ~~~