Is there a working/quantized/exl2 (etc.) version that will fit on a single 24GB video card (4090)?

#170
by cleverest - opened

...or am I just dreaming? I'm using Text-Generation-WebUI in Windows 11. Thank you.

@cleverest Yeah, just search for "Mixtral 8x7B exl2" and you should find plenty. Pick something that's below 4 bpw and it should fit. If it still somehow doesn't fit, try the 8-bit cache or even the 4-bit cache.
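For a rough sanity check on the "below 4 bpw" advice, here is a back-of-envelope sketch of the arithmetic. It assumes about 46.7B total parameters and the published Mixtral 8x7B attention layout (32 layers, 8 KV heads, head dimension 128). The 1.5 GiB overhead figure and the helper names are my own guesses for illustration, not anything from exllamav2 or Text-Generation-WebUI, and real quants keep some tensors at higher precision, so treat the numbers as estimates only.

```python
# Back-of-envelope VRAM estimate for a quantized Mixtral 8x7B.
# All constants are assumptions for illustration, not measured values.
PARAMS = 46.7e9      # approximate total parameter count
N_LAYERS = 32        # transformer layers
N_KV_HEADS = 8       # key/value heads (grouped-query attention)
HEAD_DIM = 128       # dimension per attention head
GIB = 2**30


def weights_gib(bpw: float) -> float:
    """Memory for the weights at a given average bits per weight."""
    return PARAMS * bpw / 8 / GIB


def kv_cache_gib(context_tokens: int, cache_bits: int) -> float:
    """Memory for the KV cache; the factor 2 covers keys and values."""
    bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * cache_bits / 8
    return context_tokens * bytes_per_token / GIB


def fits(bpw: float, context_tokens: int, cache_bits: int,
         vram_gib: float = 24.0, overhead_gib: float = 1.5) -> bool:
    """True if weights + cache + a guessed overhead fit in vram_gib."""
    total = weights_gib(bpw) + kv_cache_gib(context_tokens, cache_bits)
    return total + overhead_gib <= vram_gib


if __name__ == "__main__":
    for bpw in (3.0, 3.5, 4.0):
        for cache_bits in (16, 8, 4):
            ok = fits(bpw, context_tokens=8192, cache_bits=cache_bits)
            print(f"{bpw} bpw, {cache_bits}-bit cache, 8k context: "
                  f"{'fits' if ok else 'too big'}")
```

Under these assumptions a 3.5 bpw quant leaves a few GiB of headroom at 8k context, while 4.0 bpw only squeezes in once the cache is quantized, which lines up with the suggestion above. Windows and the desktop also take some VRAM, so the real margin will be smaller.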

Get yourself LM Studio. I found it the easiest way.
