Ragged attention supported in vLLM

#18
Mistral AI org

Remove max_seq_len, since https://github.com/vllm-project/vllm/pull/10584 removes the need for it.

Will you add interleaved_sliding_window to the HF config.json as well? Is this the parameter we should rely on going forward?
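For context, here is a minimal sketch of how such a field could be read from a config.json. This is purely illustrative: the surrounding field names and the repeating-pattern semantics (a list cycled across layers, with null meaning full attention for that layer) are assumptions, not the actual Mistral or HF config schema.

```python
import json

# Hypothetical config.json excerpt; everything except the
# "interleaved_sliding_window" key name is an assumption for illustration.
raw = """
{
  "num_hidden_layers": 8,
  "sliding_window": 4096,
  "interleaved_sliding_window": [4096, null]
}
"""
config = json.loads(raw)

# Assumed interpretation: the list is a pattern repeated across layers,
# where null means full (non-windowed) attention for that layer.
pattern = config["interleaved_sliding_window"]
windows = [pattern[i % len(pattern)] for i in range(config["num_hidden_layers"])]
print(windows)  # [4096, None, 4096, None, 4096, None, 4096, None]
```

Under this reading, even layers use a 4096-token sliding window and odd layers attend globally; the real semantics would need to be pinned down in the config schema.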

patrickvonplaten changed pull request status to merged
