30b 4-bit LoRA please

#1 opened by Gumibit

Hi, it would be great if we could have a 4-bit flavor of this model.

Thank you.

Hey, I'll work on it.
Currently I am benchmarking the models. The ones trained without LoRA perform much better, so maybe load those in 8-bit instead? medalpaca-7b quantized for inference should outperform the 30b model.
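For reference, here is a minimal sketch of the 8-bit loading suggestion, assuming the hub id `medalpaca/medalpaca-7b` and that `transformers`, `accelerate`, and `bitsandbytes` are installed; treat it as a starting point, not the official loading recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "medalpaca/medalpaca-7b"  # assumed hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # 8-bit weights via bitsandbytes
    device_map="auto",  # spread layers across available GPUs automatically
)

# Quick smoke test with a hypothetical prompt
prompt = "What are the common symptoms of anemia?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```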

In the meantime, please raise an issue on GitHub so I won't forget to do it.

Thank you, I will try your suggestions.

Gumibit changed discussion status to closed
