Multiple Question Input

#7
by rhachriy - opened

Hi, nielsr previously mentioned that it was possible to do multiple question inputs by "sending a batch of images + questions through the model... provide a batch of pixel_values + decoder_input_ids to the generate method, and use the batch_decode method of the tokenizer to turn the generated ID's into text."

Does anyone have an example of this or a similar notebook that details more about how to do this? Thank you.

Thank you for the response! I'm running into a few peculiar issues, though. When I run this code, I get the following output.

image.png
image.png

It works just fine, though, if I adapt it to handle only one question.

image.png

Is this because of the difference between these types of images?

image.png

Thanks.

Currently you're sending the same prompt (decoder_input_ids) twice through the model. For VQA, the prompt needs to be different per example. I'll check this tomorrow.
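
For anyone following along, here is a rough sketch of what "a different prompt per example" could look like. It is not the notebook's code. It assumes a Donut-style DocVQA checkpoint whose prompt template is `<s_docvqa><s_question>...</s_question><s_answer>` (the model isn't named in this thread, so treat the template as an assumption), and `build_prompts` and `answer_batch` are hypothetical helper names. `model` and `processor` are expected to be loaded already.

```python
from typing import List

# Assumed prompt template for a Donut DocVQA checkpoint; change it if your
# checkpoint was fine-tuned with a different one.
PROMPT_TEMPLATE = "<s_docvqa><s_question>{question}</s_question><s_answer>"


def build_prompts(questions: List[str]) -> List[str]:
    """Return one decoder prompt per question, in the same order."""
    return [PROMPT_TEMPLATE.format(question=q) for q in questions]


def answer_batch(model, processor, images, questions, max_length=512):
    """Run a single generate() call over paired images and questions.

    `model` and `processor` are assumed to be an already-loaded
    vision encoder-decoder model and its matching processor.
    """
    if len(images) != len(questions):
        raise ValueError("need exactly one question per image")

    # Batch of images -> pixel_values
    pixel_values = processor(images, return_tensors="pt").pixel_values

    # Batch of *different* prompts -> decoder_input_ids (padded to equal length)
    decoder_input_ids = processor.tokenizer(
        build_prompts(questions),
        add_special_tokens=False,
        padding=True,
        return_tensors="pt",
    ).input_ids

    outputs = model.generate(
        pixel_values,
        decoder_input_ids=decoder_input_ids,
        max_length=max_length,
    )
    # Turn the generated IDs back into text, one string per example
    return processor.batch_decode(outputs)
```

Since the prompts can have different lengths, they get padded here. How well padded decoder prompts behave may depend on the tokenizer's padding settings, so it is worth comparing the batched answers against single-example runs.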

I see, sounds good! If there is any way to input multiple questions for a single image, that would be awesome as well.

I've updated the notebook to reflect this. To ask multiple questions about a single image, you can just duplicate the image in the batch, once per question, rather than using different images.
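
As a small illustration of the duplication idea (a sketch only, with `pair_image_with_questions` being a hypothetical helper and not something from the notebook), the image can be repeated so that the image list and the question list line up index by index:

```python
from typing import List, Sequence, Tuple, TypeVar

ImageT = TypeVar("ImageT")


def pair_image_with_questions(
    image: ImageT, questions: Sequence[str]
) -> Tuple[List[ImageT], List[str]]:
    """Repeat `image` once per question so both lists have the same length."""
    question_list = list(questions)
    # The same image object is reused; nothing is copied in memory here.
    return [image] * len(question_list), question_list


# Usage with a placeholder in place of a real PIL image:
images, questions = pair_image_with_questions(
    "invoice.png", ["What is the total?", "What is the invoice date?"]
)
```

The two lists can then be passed to the processor and tokenizer as a regular batch, with one prompt per repeated image.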

Gotcha, thank you so much for your help!

rhachriy changed discussion status to closed
