sub 16GB and sub 8GB versions

#8
by froilo - opened

Sub-16GB and sub-8GB versions would be really nice, in the democratic and friendly OSS spirit.

A code example in the model card using accelerate with CPU offloading would also be nice, as would examples with different text encoders (when they're available).
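In real diffusers usage the offloading request above is typically covered by `pipe.enable_model_cpu_offload()`, which keeps each sub-model off the GPU until it is actually needed. As a minimal, library-free sketch of that idea (names like `OffloadedLinear` are hypothetical, and disk stands in for CPU memory here), weights can be parked outside working memory and fetched just-in-time per forward pass:

```python
import os
import tempfile
import numpy as np

class OffloadedLinear:
    """Toy linear layer whose weights are "offloaded" (here: saved to disk)
    and loaded into memory only for the duration of a forward pass."""

    def __init__(self, weight: np.ndarray):
        f = tempfile.NamedTemporaryFile(suffix=".npy", delete=False)
        f.close()
        self._path = f.name
        np.save(self._path, weight)  # park the weights off-"device"

    def __call__(self, x: np.ndarray) -> np.ndarray:
        w = np.load(self._path)      # fetch weights just-in-time
        out = x @ w
        del w                        # release the memory right after use
        return out

    def cleanup(self):
        os.unlink(self._path)

layer = OffloadedLinear(np.eye(4, dtype=np.float32))
y = layer(np.ones((2, 4), dtype=np.float32))
print(y.shape)  # (2, 4)
layer.cleanup()
```

The trade-off is the same as with accelerate's offloading hooks: peak memory drops because only one layer's weights are resident at a time, at the cost of extra transfer latency per step.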

thx

I'm hoping they make a <=2B version of the model so we can compare it fairly with other existing models, run it on low-end hardware, and maybe even run it on mobile one day with some quantization and optimizations.

And since this is supposed to be truly open source, releasing various epochs along the way would be nice for those planning to finetune on top of it. Just my guess, but making an anime-finetuned version of the model from the last epoch alone would probably not give the best results, though who knows.

I'm going to try to use my custom code to convert it to 8-bit, which would cut the required VRAM in half. I'll report back to this thread if I'm successful. 🀞
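The halving above assumes fp16 weights going to int8 (16 bits down to 8 per parameter). The commenter's actual conversion code isn't shown, but the core idea of simple symmetric per-tensor int8 quantization can be sketched like this (function names are illustrative, not from their repo):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: store int8 codes plus a
    single float scale, halving memory relative to fp16 weights."""
    max_abs = float(np.abs(w).max())
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from int8 codes and the scale."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, s = quantize_int8(w)
err = float(np.abs(dequantize_int8(q, s) - w).max())
print(q.dtype, err)  # int8 codes; worst-case error is about half a scale step
```

Production schemes (e.g. per-channel scales, or outlier handling as in LLM.int8) reduce the reconstruction error further, but the storage math is the same.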

fal org

One of the potential improvements we may make down the road is a smaller model. Stay tuned for updates. In the meantime, good luck with all the quantization work, and let us know if you need any help!

burkaygur changed discussion status to closed

Here is my 8-bit version of the model, for those interested: ddh0/AuraFlow-8bit
