fp16 (and maybe gguf)

#1 opened by trollkotze

yo, why not upload fp16 so people can quant it in all formats?
Also, a description of the method would be nice, so others can repeat it. Thanks, kind sir.
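
For reference, a minimal sketch of what uploading fp16 would look like, assuming the weights load with transformers (the repo ids below are placeholders, not the actual repos):

```python
# Minimal sketch: save the model as fp16 safetensors and push to the Hub.
# "author/model" and "your-username/model-fp16" are hypothetical repo ids.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

src = "author/model"  # hypothetical source repo

model = AutoModelForCausalLM.from_pretrained(src, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(src)

# Save fp16 shards locally, then upload them to a new Hub repo.
model.save_pretrained("model-fp16", safe_serialization=True)
tokenizer.save_pretrained("model-fp16")
model.push_to_hub("your-username/model-fp16")
tokenizer.push_to_hub("your-username/model-fp16")
```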


More information regarding the methodology would be very welcome if you have the time.

70b next please

Yes, many of us can't run exl2.

fp32, fp16 or gguf please

I believe this is the 6.0 BPW version, judging by its size. Could we get a 6.5 BPW version as well, or FP16 so we can quant it ourselves? Thanks for the model, by the way!
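
For anyone who ends up requanting exl2 themselves once FP16 is available, a rough sketch using exllamav2's convert.py (flags can differ between versions; all paths are placeholders):

```python
# Hedged sketch: produce a 6.5 BPW exl2 quant from a local fp16 checkpoint
# using an exllamav2 checkout. Directory names are made up for illustration.
import subprocess

subprocess.run([
    "python", "convert.py",
    "-i", "model-fp16",     # input: fp16 HF checkpoint
    "-o", "work",           # scratch/working directory
    "-cf", "model-6.5bpw",  # output directory for the compiled quant
    "-b", "6.5",            # target bits per weight
], check=True)
```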

FP16 is all we truly need; I understand requanting takes a lot of compute, but if the community has FP16, the usual quantizers can take it from there.
It lets the model run on hardware other than the latest GPUs, and it allows proper quanting to future formats so the model isn't lost to time.
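
To illustrate "the usual quantizers can take it from there", a hedged sketch of the typical GGUF pipeline from an fp16 checkpoint with llama.cpp (script and binary names have changed across llama.cpp versions, e.g. convert.py vs. convert_hf_to_gguf.py and quantize vs. llama-quantize; paths are placeholders):

```python
# Hedged sketch of the usual fp16 -> GGUF -> quantized GGUF pipeline.
# Adjust script/binary names to whatever your llama.cpp checkout ships.
import subprocess

# 1. Convert the fp16 HF checkpoint to a GGUF file.
subprocess.run([
    "python", "convert_hf_to_gguf.py", "model-fp16",
    "--outtype", "f16",
    "--outfile", "model-f16.gguf",
], check=True)

# 2. Quantize to a smaller format, e.g. Q4_K_M.
subprocess.run([
    "./llama-quantize", "model-f16.gguf", "model-Q4_K_M.gguf", "Q4_K_M",
], check=True)
```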
