Can you make a 4.65 bpw?

#1 · opened by Samvanity

A 4.65 bpw quant can fit perfectly on a 4090 @ 10240 ctx without the 8-bit cache. Just a thought. Thanks!
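For reference, here's a rough back-of-the-envelope sketch of how that fits (assuming Yi-34B-200K-like dimensions: 60 layers, 8 KV heads via GQA, head dim 128, and an FP16 cache; ExLlamaV2's activation and scratch buffers aren't counted, so treat it as approximate):

```python
# Rough VRAM estimate for a 4.65 bpw EXL2 quant of a ~34B model at 10240 ctx.
# Layer/head counts are assumptions based on Yi-34B-200K's config; runtime
# buffers are not modeled, so the total is a ballpark figure only.

PARAMS = 34.4e9                      # approximate parameter count
BPW = 4.65                           # bits per weight of the quant
CTX = 10240                          # context length
LAYERS, KV_HEADS, HEAD_DIM = 60, 8, 128
CACHE_BYTES = 2                      # FP16 cache; 1 would approximate the 8-bit cache

weights_gib = PARAMS * BPW / 8 / 1024**3
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * CACHE_BYTES  # K and V, all layers
cache_gib = CTX * kv_per_token / 1024**3

print(f"weights ~{weights_gib:.1f} GiB + cache ~{cache_gib:.1f} GiB "
      f"= ~{weights_gib + cache_gib:.1f} GiB")  # ≈ 21 GiB, leaving headroom on a 24 GB card
```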

Sure, I'll take a look when I have cycles available.

Thanks! I'd say you can replace 4.25 with 4.65 from now on, as the target audience is the same: people with 24 GB of VRAM.

Why is that? 4.25 allows for 16k context; I'd think people would want that one as well.

Finally started it, so it should be up soon; will ping when it's there.

Because a lot of the time, fine-tunes of the base model can't handle more than 8k. For example, the jondurbin/airoboros-34b-3.2 model card says:

"This is using yi-34b-200k as the base model. While the base model supports 200k context size, this model was fine-tuned with a ctx size of 8k tokens, so anything beyond that will likely have questionable results."

And thank you for the new quant! Downloading now.
