Cannot run it on V100

#2 opened by CrazyAIGC

You also need to put your model on the GPU, i.e. model.to("cuda").
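For reference, a minimal sketch of loading the model and moving it to the GPU (the checkpoint name is an assumption based on this repo; not an official recipe):

```python
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# checkpoint name is an assumption based on this repo's discussion
checkpoint = "Salesforce/blip2-flan-t5-xxl"

processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint)

# inputs and weights must live on the same device, hence model.to("cuda")
model.to("cuda")
```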

It may not be possible on a 16 GB V100.

This model requires about 30 GB of GPU RAM if you use 8-bit inference (i.e. pass load_in_8bit=True to from_pretrained; see the sketch after this comment).

So I'd recommend checking out the smaller BLIP-2 variants.
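For reference, 8-bit loading looks roughly like this (a sketch, assuming the flan-t5-xxl checkpoint and a transformers/bitsandbytes install that supports load_in_8bit):

```python
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-flan-t5-xxl"  # assumed checkpoint for this repo

processor = Blip2Processor.from_pretrained(checkpoint)

# load_in_8bit quantizes the weights with bitsandbytes;
# device_map="auto" places the layers on the available GPU(s)
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint,
    load_in_8bit=True,
    device_map="auto",
)
```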

I used 8-bit inference on my 32 GB V100, but it failed.
I had already converted the input to fp16, but this error still occurred:
AssertionError: The input data type needs to be fp16 but torch.float32 was found!
Has anyone successfully run the xxl model on a V100?
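For context, the usual int8 BLIP-2 example casts the processor output to fp16 when moving it to the GPU; a sketch (image and prompt are placeholders, and processor/model are assumed to be loaded as above):

```python
import torch

# image: a PIL.Image, prompt: a string; both are placeholders for this sketch
inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda", torch.float16)

generated_ids = model.generate(**inputs)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```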

@zhouqh
Can you share with us the full traceback?

@zhouqh I've run the opt_6.7b variant on a 24 GB A10 with fp16.
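In case it helps, fp16 loading just needs torch_dtype (a sketch, assuming the Salesforce/blip2-opt-6.7b checkpoint):

```python
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-opt-6.7b"  # assumed checkpoint name

processor = Blip2Processor.from_pretrained(checkpoint)

# half-precision weights roughly halve memory use compared to fp32
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,
).to("cuda")
```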

@ybelkada
The traceback is very long, so I only took screenshots of the head and the tail (attached: 1.png, 5.png).

@zhouqh
Thanks! Can you try updating your bitsandbytes version? pip install --upgrade bitsandbytes
