How could we load the model with low GPU memory?

by erjiaxiao

My GPU has 24 GB of memory, which is not enough for this model. How could we load the model with low GPU memory?

Hi,

You can pass a quantization_config to the from_pretrained method so that the weights are loaded in lower precision (4-bit or 8-bit). At 4-bit, the 13B model's weights take roughly 7 GB, which fits comfortably in 24 GB:

from transformers import BitsAndBytesConfig, InstructBlipForConditionalGeneration

# Quantize the weights to 4-bit as they are loaded; device_map="auto"
# spreads the layers across the available GPU(s) and, if needed, the CPU.
quantization_config = BitsAndBytesConfig(load_in_4bit=True)

model = InstructBlipForConditionalGeneration.from_pretrained(
    "Salesforce/instructblip-vicuna-13b",
    device_map="auto",
    quantization_config=quantization_config,
)
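
If 4-bit quantization degrades quality too much for your use case, the same pattern works with the 8-bit flag instead (a sketch, trading some memory savings for precision):

quantization_config = BitsAndBytesConfig(load_in_8bit=True)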

Refer to the blog post for details: https://huggingface.co/blog/4bit-transformers-bitsandbytes
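
For completeness, here's a minimal generation sketch with the quantized model loaded above (the processor class follows the standard transformers API; the image URL and prompt are just examples):

import requests
from PIL import Image
from transformers import InstructBlipProcessor

processor = InstructBlipProcessor.from_pretrained("Salesforce/instructblip-vicuna-13b")

# Any RGB image works; this URL is only an example.
url = "https://raw.githubusercontent.com/salesforce/LAVIS/main/docs/_static/Confusing-Pictures.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# The processor prepares both the image and the text prompt; the inputs
# must live on the same device as the model's first layers.
inputs = processor(images=image, text="What is unusual about this image?", return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0].strip())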

Thank you so much for your help!
