torch RuntimeError: Shape mismatch: a.size(1) = 4096, size_k = 8192

#1
by saadsafi - opened

Using the latest vLLM (also tested vLLM with the latest transformers).
The same error as in the title occurs with Intel/gemma-4-12B-it-int8-AutoRound.
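
For readers unfamiliar with the message, here is a minimal Python sketch of what it reports, assuming it comes from a matrix-multiply shape check: the activation's inner dimension (4096) does not equal the reduction size the kernel was given (8192). `check_gemm_shapes` is a hypothetical helper written for illustration; it is not vLLM or PyTorch code.

```python
def check_gemm_shapes(a_shape, size_k):
    """Hypothetical stand-in for a kernel-side shape check (not vLLM code).

    a_shape: (rows, inner_dim) of the activation matrix.
    size_k:  reduction dimension the weight was prepared with.
    """
    inner_dim = a_shape[1]
    if inner_dim != size_k:
        raise RuntimeError(
            f"Shape mismatch: a.size(1) = {inner_dim}, size_k = {size_k}"
        )


# Values taken from the error in the title; the row count of 1 is arbitrary.
try:
    check_gemm_shapes((1, 4096), 8192)
except RuntimeError as err:
    message = str(err)
    print(message)
```

Under that assumption, the factor of two between the two numbers suggests the layer's weights were set up for a different input width than the one actually passed in, which fits a loader or configuration mismatch rather than corrupted weights.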

I will test this next week. Based on the logs, it doesn't seem to be a model issue; it is more likely a vLLM issue.

Thanks Wen,
I'm told I should try pulling the "vllm/vllm-openai:gemma4-unified" Docker image (updated yesterday); apparently it handles the new gemma4 models better.
Thanks again
