8bit version of the model #8
by varun500 - opened
No description provided.
varun500 changed pull request title from "*bit version of the model" to "8bit version of the model"
An 8bit version of the model would be helpful, as it could be loaded in 16GB of GPU VRAM.
- This is a 4bit GPTQ model. I could make an 8bit GPTQ, but there's no point because we can already load HF models in 8bit using `bitsandbytes`.
- If you want 8bit, please use https://huggingface.co/TheBloke/stable-vicuna-13B-HF and specify `load_in_8bit=True`, as I told you on GitHub (a minimal loading sketch follows).
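
For reference, a minimal sketch of that 8bit load, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed and a GPU with roughly 16GB of VRAM is available (the prompt text is just a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/stable-vicuna-13B-HF"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # quantize weights to int8 at load time via bitsandbytes
    device_map="auto",   # let accelerate place layers on the available GPU(s)
)

# Quick smoke test with a placeholder prompt.
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```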
TheBloke changed pull request status to closed
Sure, will do that.