why so size?

#1
by eurotaku - opened

Hey, thanks for providing the FP16 fixes for those models. What intrigues me, though, is why they are as large as the full FP32 precision models and not more in the area of the official BF16 versions? I took the liberty of uploading them to civitai, btw, if that's ok? :)

The FP16 fix only means the model can run with FP16 AMP, i.e. the computation is done in FP16 and the hidden states are FP16.
It does not mean the weights have to be saved in FP16.

With the original model, you can even store the weights in FP8; you just need to do the computation in BF16.

So I provide both an FP32 and an FP16 version of the weights. Both can be used with FP16 computation.
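
To make the storage vs. computation distinction concrete, here is a rough sketch using only the Python standard library. It is not how any real loader works; the weight values, the `fp16_dot` helper, and the way rounding is applied after each step are all assumptions made for illustration. It only shows that the same weights can be stored at 4 or 2 bytes per value while the FP16 computation gives the same result either way.

```python
import struct

def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Hypothetical "master" weights, held in FP32 precision.
weights = [to_fp32(w) for w in (0.1234567, -1.5, 0.0009765625, 3.25)]
inputs = [1.0, 0.5, 2.0, -0.25]

# Storage precision: how many bytes each weight takes in the file.
stored_fp32 = struct.pack("<4f", *weights)  # 4 bytes per weight
stored_fp16 = struct.pack("<4e", *weights)  # 2 bytes per weight

def fp16_dot(blob: bytes, fmt: str, xs: list) -> float:
    """Load stored weights, then do every step of a dot product in FP16."""
    ws = struct.unpack(fmt, blob)
    acc = 0.0
    for w, x in zip(ws, xs):
        # Compute precision: cast operands to FP16 and round each result.
        prod = to_fp16(to_fp16(w) * to_fp16(x))
        acc = to_fp16(acc + prod)
    return acc

out_from_fp32 = fp16_dot(stored_fp32, "<4f", inputs)
out_from_fp16 = fp16_dot(stored_fp16, "<4e", inputs)

print(len(stored_fp32), len(stored_fp16))
print(out_from_fp32 == out_from_fp16)
```

The FP32 blob is twice the size of the FP16 one, which is the file size difference asked about above, but once the weights are cast to FP16 for the computation both versions behave the same.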

KBlueLeaf changed discussion status to closed
