Request for quantized version

by sudhir2016 - opened

A quantized version of the model that can run inference in a free-tier Google Colab notebook would be nice.

MaLA-LM org

Will you be able to use one of HF's quantization integrations, such as bitsandbytes (

Yes, please. Will it work with load_in_4bit=True?
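For readers following the thread, here is a minimal sketch of what 4-bit loading through the bitsandbytes integration in transformers usually looks like. It is not confirmed in this thread that the model loads this way. The repo id is a hypothetical placeholder, the 7-billion parameter count in the comments is only an illustrative figure, and the NF4 and float16 settings are common choices rather than anything recommended by the maintainers. The helper `estimate_weight_gib` is also hypothetical and counts weights only, ignoring activations, the KV cache, and quantization overhead.

```python
def estimate_weight_gib(num_params: float, bits_per_param: int) -> float:
    """Rough size of the weights alone, in GiB (no activations, no overhead)."""
    return num_params * bits_per_param / 8 / 2**30


def load_4bit(model_id: str):
    """Load a causal LM with 4-bit bitsandbytes quantization.

    Needs a CUDA GPU plus the torch, transformers, accelerate and
    bitsandbytes packages, so the imports are kept inside the function.
    """
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Assumed settings: NF4 quantization with float16 compute.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place layers on the GPU
    )
    return tokenizer, model


# Illustrative arithmetic for a hypothetical 7-billion-parameter model:
# compare 16-bit and 4-bit weight sizes against the memory of the GPU you get.
fp16_gib = estimate_weight_gib(7e9, 16)
int4_gib = estimate_weight_gib(7e9, 4)

# Usage (placeholder repo id, replace with the actual model repository):
# tokenizer, model = load_4bit("MaLA-LM/<model-name>")
```

The sketch passes a `BitsAndBytesConfig` through `quantization_config` rather than passing `load_in_4bit=True` directly to `from_pretrained`, since the config object is the route documented in recent transformers releases. Whether the 4-bit weights fit in a free-tier Colab session depends on the actual model size and the GPU assigned.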
