"running in exui for speed at long context" in text-generation-webui

#1
by kopal37 - opened

How, exactly?

Thanks for your work on this!

Thank the trainers of the constituent models!

Exui is a text generation GUI from the exllamav2 dev. It's quite fast: https://github.com/turboderp/exui

Load the model with the 8-bit cache. This applies to ooba as well, if you use that instead. Use MinP with the other samplers disabled, keeping only temperature and repetition penalty.
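
If you'd rather script it than click through a GUI, here's a rough sketch of the same setup through the exllamav2 Python API. The class and attribute names are what I remember from recent exllamav2 releases and may differ slightly between versions; the model path and sampler values are just placeholders.

```python
# Rough sketch: 8-bit KV cache + MinP-only sampling via the exllamav2 Python API.
# Names assumed from recent exllamav2 versions; check against your installed version.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_8bit, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/exl2-quant"   # placeholder path to the quantized model
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)  # 8-bit cache cuts cache VRAM at long context
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# MinP only: turn the other samplers off, keep temperature and repetition penalty.
settings = ExLlamaV2Sampler.Settings()
settings.min_p = 0.1                       # example value
settings.top_k = 0                         # disabled
settings.top_p = 1.0                       # disabled
settings.temperature = 1.0
settings.token_repetition_penalty = 1.05

print(generator.generate_simple("Once upon a time,", settings, 200))
```

In exui or ooba this just corresponds to ticking the 8-bit cache option when loading and zeroing out the other sampler sliders.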

I use the model in notebook mode, but you may have to manually adjust the prompt template for chat mode.
