Ragged attention supported in vLLM
#18 opened by patrickvonplaten
Remove max_seq_len, since https://github.com/vllm-project/vllm/pull/10584 makes it unnecessary.
Will you add interleaved_sliding_window to the HF config.json as well? Are we going to use this parameter going forward?
patrickvonplaten changed pull request status to merged