Can you make a 4.65 bpw?

#1 · opened by Samvanity

A 4.65 bpw quant can fit perfectly on a 4090 @ 10240 ctx without the 8-bit cache. Just a thought. Thanks!
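For reference, here's a rough back-of-the-envelope sketch of how that fits (assuming Yi-34B-200K-like dimensions: 60 layers, 8 KV heads via GQA, head dim 128, and an FP16 cache; ExLlamaV2's activation and scratch buffers aren't counted, so treat it as approximate):

```python
# Rough VRAM estimate for a 4.65 bpw EXL2 quant of a ~34B model at 10240 ctx.
# Layer/head counts are assumptions based on Yi-34B-200K's config; runtime
# buffers are not modeled, so the total is a ballpark figure only.

PARAMS = 34.4e9                      # approximate parameter count
BPW = 4.65                           # bits per weight of the quant
CTX = 10240                          # context length
LAYERS, KV_HEADS, HEAD_DIM = 60, 8, 128
CACHE_BYTES = 2                      # FP16 cache; 1 would approximate the 8-bit cache

weights_gib = PARAMS * BPW / 8 / 1024**3
kv_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * CACHE_BYTES  # K and V, all layers
cache_gib = CTX * kv_per_token / 1024**3

print(f"weights ~{weights_gib:.1f} GiB + cache ~{cache_gib:.1f} GiB "
      f"= ~{weights_gib + cache_gib:.1f} GiB")  # ≈ 21 GiB, leaving headroom on a 24 GB card
```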

Sure, I'll take a look when I have cycles available.

Thanks! I'd say you can replace 4.25 with 4.65 from now on, as the target audience is the same: people with 24 GB of VRAM.

Why is that? 4.25 allows for 16k context; I'd think people would want that one as well.

Finally started it, so it should be up soon; will ping when it's there.

Because a lot of the time, fine-tunes of the base model can't handle more than 8k. For example, the jondurbin/airoboros-34b-3.2 model card says:

"This is using yi-34b-200k as the base model. While the base model supports 200k context size, this model was fine-tuned with a ctx size of 8k tokens, so anything beyond that will likely have questionable results."

And thank you for the new quant! Downloading now.
