How to enable longer sequence length?

#5
by frankxyy - opened


I got this error message with vLLM v0.2.6 when the input length is longer than 2048 tokens. How can I enable a longer sequence length?


Add the max_seq_len parameter when loading the model via AutoAWQForCausalLM.from_quantized, like this:

from awq import AutoAWQForCausalLM

model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True, max_seq_len=8192)

Now your model has a context length of 8K; 2K is the default. Hope this helps.
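Since the original error came from vLLM itself rather than from AutoAWQ, the serving engine also enforces its own limit. Here is a minimal sketch, assuming your vLLM version supports the max_model_len and quantization engine arguments; the model path is a placeholder:

from vllm import LLM, SamplingParams

# Placeholder: point this at your AWQ-quantized model directory or repo.
quant_path = "path/to/your-awq-model"

# max_model_len raises the engine's context limit (default comes from the
# model config); quantization="awq" tells vLLM to load AWQ weights.
llm = LLM(model=quant_path, quantization="awq", max_model_len=8192)

outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)

Note that vLLM may refuse a max_model_len larger than the model's native maximum position embeddings, so this only helps up to what the underlying model actually supports.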
