Thank you very much

#1
by Nexesenex - opened

After testing, it works wonderfully — thank you.
I was wondering: do you now use the new exl2-2 quantization by default? I ask because you stopped mentioning it in the titles of your quants about 15 days ago.

Also, regarding the wish list of models to quantize that you kindly offered, here are my favorites for which you have already made the measurements:

1 - https://huggingface.co/LoneStriker/Yarn-Llama-2-70b-32k-4.0bpw-h6-exl2 -> in 3.0bpw... to still enjoy 30k context at good perplexity with the new exl2-2 quant, after generating the first 10k tokens with a better model.

2 - https://huggingface.co/LoneStriker/Xwin-LM-70B-V0.1-4.0bpw-h6-exl2 -> in 3.50bpw... because this base rocks.

3 - https://huggingface.co/LoneStriker/dolphin-2.2-70b-4.0bpw-h6-exl2 -> in 3.50bpw... because it's probably the best all-rounder I've tested.

4 - https://huggingface.co/LoneStriker/Aetheria-L2-70B-2.65bpw-h6-exl2-2 -> in 3.50bpw... because it's my favorite merge.

5 - https://huggingface.co/LoneStriker/airoboros-l2-70b-3.1-4.0bpw-h6-exl2 -> in 3.50bpw... because it has an amazing instruct system.

Also, if you could add one more model to this list of exl2 quants, I'd suggest https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-m2.0 — to my taste, this Airoboros is an improved 1.4.1 that keeps most of its charm. Otherwise, a 3.50bpw quant of https://huggingface.co/LoneStriker/airoboros-l2-70b-gpt4-1.4.1-4.65bpw-h6-exl2-2 would still be great as a sixth pick!

Once again, thank you for the LZLV. I can finally get the real feel of a 70b model at a decent speed and with a large context.
