4 bit version

#9
by KnutJaegersberg - opened

This is a huge download; I would like to download a 4-bit version.
I've already done this with bitsandbytes (bnb) for the SFT version.
Could you save the weights using double quantization in bnb and upload them?
Transformers now supports saving weights as bitsandbytes 4-bit weights: simply load the model in 4-bit and then call model.save_pretrained("folder").
That way one can use the model with 48 GB of VRAM.
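A minimal sketch of that workflow, assuming a recent transformers/accelerate/bitsandbytes stack that supports serializing 4-bit weights; `model_id` and the output folder name are placeholders, not the actual repo IDs:

```python
# Sketch: load the full-precision checkpoint in 4-bit with bitsandbytes
# (NF4 + double quantization) and save the quantized weights for upload.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "path/or/hub-id-of-the-full-precision-model"  # placeholder

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,   # double quantization, as requested above
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# After a 4-bit load, save_pretrained serializes the bitsandbytes weights directly.
model.save_pretrained("MoMo-72B-4bit")  # placeholder output folder
tokenizer.save_pretrained("MoMo-72B-4bit")
```

The resulting folder can then be pushed to the Hub as-is, e.g. with `model.push_to_hub(...)`.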

https://huggingface.co/KnutJaegersberg/MoMo-72B-4bit
