---
license: apache-2.0
---

This is fal/AuraFlow, converted from FP16 to 8-bit. Small and one-dimensional tensors are left in FP16 to avoid severe degradation.

This is an experimental conversion, and I don't currently have enough memory to run it locally, so it is not guaranteed to work. Please let me know whether it works for you.

To actually save memory, you will need to prevent your inference engine from upcasting the weights to FP16 during computation.

The code used to convert the model is in the repo as well.

Enjoy!