Add zero-shot classification task for BLIP-2

#3
by youssefadarrab - opened

Is it possible to add support for a zero-shot classification task using BLIP-2, computing text-image similarities from the normalized embeddings returned by the BLIP-2 feature extractor?

Hi,

For that, one could add get_image_features and get_text_features methods to Blip2ForConditionalGeneration. These could be based on the original implementation: https://github.com/salesforce/LAVIS/blob/f982acc73288408bceda2d35471a8fcf55aa04ca/lavis/models/blip2_models/blip2_qformer.py#L387.
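
To make the similarity step concrete, here is a minimal sketch of how zero-shot classification could work once such embeddings are available. It does not call BLIP-2 or any library API: the embeddings are hand-written toy vectors standing in for model outputs, and zero_shot_classify, l2_normalize, and the temperature value of 0.07 are hypothetical names and choices made for illustration. It also assumes one embedding vector per image; if the model produces several vectors per image, an extra reduction step would be needed.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_classify(image_embedding, text_embeddings, temperature=0.07):
    """Return softmax probabilities over candidate labels.

    Hypothetical helper: `temperature` is an assumed scaling factor,
    not a value taken from BLIP-2.
    """
    image = l2_normalize(image_embedding)
    # Cosine similarity between the image and each candidate text.
    sims = [
        sum(a * b for a, b in zip(image, l2_normalize(text)))
        for text in text_embeddings
    ]
    logits = [s / temperature for s in sims]
    # Numerically stable softmax.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy stand-ins for embeddings a model would produce (assumption).
labels = ["a photo of a cat", "a photo of a dog"]
image_embedding = [0.9, 0.1, 0.0]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

probs = zero_shot_classify(image_embedding, text_embeddings)
predicted = labels[probs.index(max(probs))]
print(predicted, probs)
```

With real embeddings, the candidate labels would usually be written as short prompts, one per class, and the label with the highest probability taken as the prediction.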

Feel free to open an issue on GitHub so this can be contributed.

Hi,

I will open an issue on GitHub. I would also love to contribute with a PR!
