YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

vLLM Nightly Wheel for Qwen3.5

This archive contains a Linux x86_64 vLLM nightly wheel that supports the Qwen3_5ForConditionalGeneration architecture used by Qwen/Qwen3.5-4B.

Install on the cluster

tar -xzf vllm-wheel-linux-x86_64.tar.gz
python -m pip install --no-index --no-deps --force-reinstall \
  ./vllm-0.23.1rc1.dev102+ga46abb7ae-cp38-abi3-manylinux_2_28_x86_64.whl
python -c "import vllm; print(vllm.__version__)"

Serve Qwen3.5

CUDA_VISIBLE_DEVICES=1,2 vllm serve /cm/archive/tue09/model_hub/Qwen/Qwen3.5-4B \
  --port 8000 \
  --served-model-name Qwen/Qwen3.5-4B \
  --tensor-parallel-size 2 \
  --max-model-len 8192 \
  --reasoning-parser qwen3 \
  --language-model-only
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support