4 bit version

#9
by KnutJaegersberg - opened

This is a huge download; I would like to download a 4-bit version.
I've already done this with bitsandbytes (bnb) for the SFT version.
Could you save the weights using double quantization in bnb and upload them?
Transformers now supports saving weights as bitsandbytes 4-bit weights: simply load the model in 4-bit and then call model.save_pretrained("folder").
That way one can use the model with 48 GB of VRAM.
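A minimal sketch of that workflow, assuming a recent transformers/accelerate/bitsandbytes stack that supports serializing 4-bit weights; `model_id` and the output folder name are placeholders, not the actual repo IDs:

```python
# Sketch: load the full-precision checkpoint in 4-bit with bitsandbytes
# (NF4 + double quantization) and save the quantized weights for upload.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "path/or/hub-id-of-the-full-precision-model"  # placeholder

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,   # double quantization, as requested above
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# After a 4-bit load, save_pretrained serializes the bitsandbytes weights directly.
model.save_pretrained("MoMo-72B-4bit")  # placeholder output folder
tokenizer.save_pretrained("MoMo-72B-4bit")
```

The resulting folder can then be pushed to the Hub as-is, e.g. with `model.push_to_hub(...)`.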

https://huggingface.co/KnutJaegersberg/MoMo-72B-4bit
