5.0bpw exl2?

#2
by coffeedean - opened

Hello,

I've been using your 5.0bpw exl2 quant of sophosympatheia's previous 103B model, Rogue-Rose-103b-v0.2-5.0bpw-h6-exl2-2, and it has been performing really well. Now I've been testing Aurora-Nights-103B-v1.0 using the Q5 GGUF from TheBloke, and I'm really impressed with its output, but I've always had huge issues with tk/s performance on GGUF, whereas your exl2 models are always so good, fast, and easy to use.

Do you have any plans for an Aurora-Nights-103B-v1.0-5.0bpw-h6-exl2? I'm not sure if there's enough demand or how many people used Rogue-Rose-103b-v0.2-5.0bpw-h6-exl2-2, but I did, and I love it.

Thanks!

I can add it to the list when doing 103B models. Very few people have 48 GB of VRAM, let alone the more than 48 GB needed to run something like this at 5.0bpw. (I'll be able to test these locally as well shortly, after I shuffle my GPUs around.)
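
For a rough sense of why 48 GB isn't enough, here's a back-of-envelope sketch of the weight footprint at 5.0bpw, assuming roughly 103 billion parameters and ignoring KV cache and loader overhead, so actual usage would be higher:

```python
# Back-of-envelope VRAM estimate for weights only (no KV cache or overhead).
params = 103e9          # approximate parameter count for a 103B model
bits_per_weight = 5.0   # exl2 target bitrate
weight_bytes = params * bits_per_weight / 8
print(f"~{weight_bytes / 1024**3:.0f} GiB for weights alone")  # roughly 60 GiB
```

That already lands around 60 GiB before any context cache, which is why a single 48 GB setup won't cover it.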

Thank you so much, I appreciate it! Yes, the only way I run those big models at higher quants is using something like Runpod. But it does make a noticeable difference in output quality, at least it did for me with Rogue Rose.

Thank you! You're awesome. I'm testing it right now and it's working very well. Thanks again!

coffeedean changed discussion status to closed
