Can VLLM be used for inference acceleration?

#2
by obtion - opened


Owner

"architectures": [
"MixtralForCausalLM"
],
you need to check whether vllm support "MixtralForCausalLM"
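A quick way to do that check programmatically, assuming vLLM is installed: read the `architectures` field from the model's `config.json` and compare it against vLLM's model registry. This is a sketch; `ModelRegistry.get_supported_archs()` is the registry lookup in recent vLLM versions, and the inline JSON stands in for the real `config.json` file.

```python
import json

# Stand-in for the model's config.json (the snippet quoted above)
config = json.loads('{"architectures": ["MixtralForCausalLM"]}')
arch = config["architectures"][0]

try:
    # Recent vLLM versions expose supported architectures via ModelRegistry
    from vllm import ModelRegistry
    supported = arch in ModelRegistry.get_supported_archs()
except ImportError:
    supported = None  # vLLM not installed here; consult its docs instead

print(arch, supported)
```

If the architecture is listed, the model should load with vLLM's `LLM` class without extra work.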

Yeah, vLLM supports that.
