YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
vLLM Nightly Wheel for Qwen3.5
This archive contains a Linux x86_64 vLLM nightly wheel that supports the
Qwen3_5ForConditionalGeneration architecture used by Qwen/Qwen3.5-4B.
Install on the cluster
tar -xzf vllm-wheel-linux-x86_64.tar.gz
python -m pip install --no-index --no-deps --force-reinstall \
./vllm-0.23.1rc1.dev102+ga46abb7ae-cp38-abi3-manylinux_2_28_x86_64.whl
python -c "import vllm; print(vllm.__version__)"
Serve Qwen3.5
CUDA_VISIBLE_DEVICES=1,2 vllm serve /cm/archive/tue09/model_hub/Qwen/Qwen3.5-4B \
--port 8000 \
--served-model-name Qwen/Qwen3.5-4B \
--tensor-parallel-size 2 \
--max-model-len 8192 \
--reasoning-parser qwen3 \
--language-model-only
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support