Other sizes

#1
by eramax - opened

Could you please add 2.4 and 3.0 versions?

3.0 is on its way. I wasn't planning a 2.4, but I'll add it as well :)

Currently my script is making 3.75; then it'll do 3.5, then 3.0, and I'll manually add 2.4 at the end.

Thanks, appreciated. I have a 3090 with 24 GB of VRAM, so I guess it can run 3.5. I can let you know if it doesn't work, so you can postpone the 2.4.

Thanks

I have a 3090 as well, and I found that 3.5 works well with ~8k context and 3.0 works well with ~16k context.

If you're trying to push the full 32k you might need the 2.4, but at that point you may also be better off running the cache in 8-bit.

Either way, it's best to experiment to find the right fit :)
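
For a rough feel of why a smaller size leaves more room for context, here's a back-of-the-envelope sketch. It assumes the sizes in this thread are bits per weight, and the parameter count is a made-up placeholder (the thread doesn't state one). It only counts weight storage and ignores the cache and other overhead, so treat the numbers as a lower bound rather than a real measurement.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes, converted to GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, for illustration only; swap in the real one.
N_PARAMS = 40e9
VRAM_GIB = 24  # e.g. a 3090

for bpw in (3.75, 3.5, 3.0, 2.4):
    used = weight_gib(N_PARAMS, bpw)
    # Whatever is left has to hold the cache (which grows with context) plus overhead.
    print(f"{bpw} -> ~{used:.1f} GiB weights, ~{VRAM_GIB - used:.1f} GiB left")
```

Whatever is left after the weights is what the context has to fit into, which is why dropping a size (or shrinking the cache to 8-bit) buys longer context.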

Thanks, I tested 3.75 and it worked fine with a small context.

@eramax I've posted 2.4, but you should know Eric pulled his model because the training went wrong and he saw a degradation in quality. You can keep playing around with it, but just a heads up.

He's starting 2.7, which will use the same data but with a fixed training method. ETA is 4 days.
