Marlin kernel in vLLM - new checkpoint?

by zoltan-fedor - opened

Hi @casperhansen,
I saw your tweet about the new Marlin kernel in vLLM making AWQ models much faster, and also your comment on the related PR on GitHub.

Based on this, are you planning to add a new checkpoint for this model that supports vLLM's Marlin kernel?

