Marlin kernel in vLLM - new checkpoint?

#10
by zoltan-fedor - opened

Hi @casperhansen,
I saw your tweet about the new Marlin kernel in vLLM making AWQ models much faster:
https://x.com/casper_hansen_/status/1814952968174678517?t=uaKsxU_LLB5SDP4CNQyokQ&s=19

I also saw your comment on the related PR on GitHub: https://github.com/vllm-project/vllm/pull/6612

Based on this, are you planning to add a new checkpoint for this model that supports vLLM's Marlin kernel?

Thanks!
