Can blip2 be run at half, or lower, precision on CPU?

#29
by jamprimoz - opened

Hi, when I set from_pretrained's torch_dtype=torch.float16, I get the following error:

RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'

I've seen some other instances where this error comes up when running on the CPU, which is what I'm doing in this case. Is there a way to run this model in lower precision on the CPU?

Salesforce org

Hi @jamprimoz
Indeed, some float16 operations may not be supported out of the box on CPU. Can you try bfloat16 instead?
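For what it's worth, you can check the bfloat16 path without loading the full model. This is just a minimal sketch: it runs a small conv2d (the operation named in the error, "slow_conv2d_cpu") on CPU in bfloat16 to show that the dtype is supported there.

```python
import torch

# A conv2d on CPU in bfloat16 -- the same kind of op that raised
# "slow_conv2d_cpu not implemented for 'Half'" under float16.
conv = torch.nn.Conv2d(3, 8, kernel_size=3).to(torch.bfloat16)
x = torch.randn(1, 3, 32, 32, dtype=torch.bfloat16)
with torch.no_grad():
    y = conv(x)
print(y.dtype)  # torch.bfloat16
```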

Not only did it run, but it went from taking 14 seconds per picture to 2.5.

Thank you so very much for your help!

jamprimoz changed discussion status to closed
jamprimoz changed discussion status to open

And of course I think of the follow-up right after closing the thread :-D

Will bfloat16 work when there is a GPU involved or should I use something like:

use_dtype= "torch.float16" if torch.cuda.is_available() else "torch.bfloat16"

to set my torch_dtype if I want this to run on GPU when it's available?

EDIT: just grabbed the torch dtypes directly instead of as strings

device_dtype = torch.float16 if torch.cuda.is_available() else torch.bfloat16
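As a sketch, that selection together with the matching device string might look like the following; the resulting device_dtype is what would be passed as torch_dtype= to from_pretrained:

```python
import torch

# Pick float16 where a GPU is present, bfloat16 as the CPU fallback.
device = "cuda" if torch.cuda.is_available() else "cpu"
device_dtype = torch.float16 if torch.cuda.is_available() else torch.bfloat16

# e.g. model = Blip2ForConditionalGeneration.from_pretrained(
#           checkpoint, torch_dtype=device_dtype).to(device)
print(device, device_dtype)
```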

Salesforce org

Hi @jamprimoz !
I am not sure about that, but I think float16 is indeed faster than bfloat16 on GPU, so you might consider that option as well.

Thanks again!

jamprimoz changed discussion status to closed
