Cannot run it on V100
#2
by CrazyAIGC - opened
You also need to put your model on the GPU, i.e. model.to("cuda")
You may not be able to on a 16 GB V100.
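For reference, a minimal sketch of what that looks like (I'm assuming the Salesforce/blip2-flan-t5-xxl checkpoint and a GPU with enough memory to hold the full weights):

```python
import requests
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

# Checkpoint name is an assumption; adjust to the model this thread is about.
checkpoint = "Salesforce/blip2-flan-t5-xxl"
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(checkpoint)
model.to("cuda")  # move the weights onto the GPU, not just the inputs

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt").to("cuda")
generated_ids = model.generate(**inputs)
print(processor.decode(generated_ids[0], skip_special_tokens=True))
```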
This model requires about 30 GB of GPU RAM if you use 8-bit inference (i.e. pass load_in_8bit=True to from_pretrained).
So I'd recommend checking out smaller BLIP-2 variants.
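For reference, a minimal 8-bit loading sketch (assuming the Salesforce/blip2-flan-t5-xxl checkpoint and that bitsandbytes and accelerate are installed):

```python
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-flan-t5-xxl"  # assumed checkpoint name
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint,
    load_in_8bit=True,   # quantize the linear layers to int8 via bitsandbytes
    device_map="auto",   # let accelerate place the weights on the available GPU(s)
)
```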
I tried 8-bit inference on my 32 GB V100 but it failed.
I've already converted the input to fp16, but this error still occurred: AssertionError: The input data type needs to be fp16 but torch.float32 was found!
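This is roughly how I prepare the inputs (a sketch of my setup; the checkpoint name is assumed, and the placeholder image is just for illustration):

```python
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-flan-t5-xxl"  # assumed checkpoint name
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint, load_in_8bit=True, device_map="auto"
)

image = Image.new("RGB", (224, 224))  # placeholder image just for illustration
# Cast the floating-point inputs (pixel_values) to fp16 to match the 8-bit model:
inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
generated_ids = model.generate(**inputs)
```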
Has anyone successfully run the xxl model on the V100?
@zhouqh
I've run the opt_6.7b variant on a 24 GB A10 with fp16.
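Roughly like this (a sketch; I'm assuming the Salesforce/blip2-opt-6.7b checkpoint and using a placeholder image for illustration):

```python
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

checkpoint = "Salesforce/blip2-opt-6.7b"  # assumed checkpoint for the opt_6.7b variant
processor = Blip2Processor.from_pretrained(checkpoint)
model = Blip2ForConditionalGeneration.from_pretrained(
    checkpoint, torch_dtype=torch.float16  # load the weights directly in fp16
).to("cuda")

image = Image.new("RGB", (224, 224))  # placeholder image just for illustration
inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
generated_ids = model.generate(**inputs)
print(processor.decode(generated_ids[0], skip_special_tokens=True))
```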