Can someone make EXL2 models for this, but without the max length 2048 thing? :(

#7
by NimbleDreams - opened

Can someone make EXL2 models for this, but without the max length 2048 thing? I'm using RunPod, so..

Could you explain what you mean by 'not using the max length 2048 thing'?

Edit: I just tried one of alpindale's exl2 quants and had no issues with a 10k context.

deleted

@alpindale Joe Biden, wake up. At least reinvent soft prompts or something. That Muv-Luv soft prompt has been awaited for a year and a half now. The TPUs gotta be back online

> Could you explain what you mean by 'not using the max length 2048 thing'?
>
> Edit: I just tried one of alpindale's exl2 quants and had no issues with a 10k context.

I mean the "quantization_config" block in config.json.
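For anyone checking their own download, here is a minimal sketch that reads a config.json and prints the two length-like values side by side. It assumes the "quantization_config" block has a nested "calibration" object with a "length" field, which is what ExLlamaV2's converter typically writes, and that the model's context limit lives in "max_position_embeddings". The helper names and the sample values are made up for illustration and are not taken from this repo. If the layout matches, the 2048 is most likely the length of the calibration rows used during quantization rather than a cap on context, which would fit the 10k-context report above.

```python
import json
from pathlib import Path


def summarize_lengths(config: dict) -> dict:
    """Return the length-like fields that are easy to confuse.

    Key names are assumptions about the config layout; missing keys yield None.
    """
    quant = config.get("quantization_config") or {}
    calibration = quant.get("calibration") or {}
    return {
        "quant_method": quant.get("quant_method"),
        "calibration_length": calibration.get("length"),
        "max_position_embeddings": config.get("max_position_embeddings"),
    }


def summarize_file(path: str) -> dict:
    """Load a config.json from disk and summarize it."""
    text = Path(path).read_text(encoding="utf-8")
    return summarize_lengths(json.loads(text))


# Hypothetical config contents, for illustration only.
example_config = {
    "max_position_embeddings": 32768,
    "quantization_config": {
        "quant_method": "exl2",
        "calibration": {"rows": 100, "length": 2048},
    },
}

if __name__ == "__main__":
    print(summarize_lengths(example_config))
```

To check a real download, call `summarize_file("config.json")` from the model folder and compare the two numbers.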
