Seq len

#1
by Hypersniper - opened

So just to clarify: if the seq len of the quantized model shows 8K, does that mean I can't use the full 16K? What should I set my max token and truncation settings to? (text-generation-webui)

You can use it at 16K. The sequence length listed for the quantization only describes the calibration data used during quantization; it does not limit the context length of the quantized model. Please see the details under "Explanation of GPTQ parameters".
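For reference, a minimal launch sketch, assuming you are using the ExLlama loader in text-generation-webui (which exposes a `--max_seq_len` flag); the model directory name below is a placeholder, not the actual model:

```shell
# Launch text-generation-webui with a 16K context window.
# Assumes the ExLlama loader; the model name is illustrative only.
python server.py \
  --model your-16k-gptq-model \
  --loader exllama \
  --max_seq_len 16384
```

In the web UI itself, you would also raise the prompt truncation length (the "Truncate the prompt up to this length" setting on the Parameters tab) to 16384 so long prompts are not cut down to the default.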
