Memory Spikes while Getting Model Logits

#49 by Nyandwi

Hello, thanks for this amazing visual language model.

I am having memory issues when forwarding inputs to the model. The generate functionality works fine and I can run it multiple times, but when I try to get the logits with model(**inputs), I run out of memory. I have 48 GB of GPU RAM, which should be enough according to other discussions about hardware requirements. Is there something I am missing?

import torch
from transformers import FuyuProcessor, FuyuForCausalLM

model_id = "adept/fuyu-8b"
processor = FuyuProcessor.from_pretrained(model_id)
model = FuyuForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

# prompt (str) and sample_im_1 (PIL image) are defined elsewhere
inputs = processor(text=prompt, images=sample_im_1, return_tensors="pt").to("cuda:0")
outputs = model(**inputs)
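
One difference worth noting: generate runs its forward passes under torch.no_grad() internally, whereas a bare model(**inputs) call builds an autograd graph and keeps every intermediate activation alive. A minimal sketch of a gradient-free forward, assuming the same setup as above:

with torch.no_grad():
    # No autograd graph is built, so intermediate activations are freed immediately
    outputs = model(**inputs)
logits = outputs.logits  # shape: (batch, sequence_length, vocab_size)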

Thanks!

Hello @Nyandwi! What image sizes are you working with? Perhaps you could downscale them to a max height of 1080 pixels before processing? 48 GB of GPU RAM should be plenty for the model.
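
For reference, a minimal sketch of that downscaling, assuming the images load as PIL images (downscale is an illustrative helper name, not part of the processor API):

from PIL import Image

def downscale(image: Image.Image, max_height: int = 1080) -> Image.Image:
    # Shrink only if the image is taller than max_height, preserving aspect ratio.
    if image.height <= max_height:
        return image
    new_width = round(image.width * max_height / image.height)
    return image.resize((new_width, max_height), Image.LANCZOS)

sample_im_1 = downscale(sample_im_1)  # then pass to the processor as before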

Hi @pcuenq, thanks for your reply, I really appreciate the support. The images vary in size, and some are probably taller than 1080 pixels. I will make sure they are resized to that resolution and see if that solves the problem.
