Hi,

The model appears to give a score of 1 for every text-to-image zero-shot classification. I uploaded an image of a white t-shirt and tried it with these text prompts:

gloves
trousers

For both texts, I got a score of 1. Am I missing something? Screenshots are attached for your reference.
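
One possible explanation, offered only as an assumption since nothing here confirms how this model computes its scores: if each prompt was submitted on its own and the score is a softmax over the candidate labels in that request, then a single label will always receive 1, whatever the image shows. The sketch below illustrates the effect with plain Python; the `softmax` helper and the logit values are hypothetical and are not taken from the model.

```python
import math

def softmax(logits):
    """Normalize raw scores so they are positive and sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical image-text similarity logits; the values are made up.
single = softmax([-3.2])        # one candidate label on its own
pair = softmax([-3.2, -1.1])    # two candidate labels compared together

print(single)  # a single label always normalizes to [1.0]
print(pair)    # the two scores now share the probability mass
```

If that assumption holds, passing both prompts together as candidate labels in one request should give scores that differ and sum to 1.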