"Use in Transformers" feature is wrong
#1
by
smartinezbragado
- opened
The right way to load this model is:

```python
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration

processor = Blip2Processor.from_pretrained("Salesforce/blip2-flan-t5-xl-coco")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl-coco", torch_dtype=torch.float16
)
```
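For completeness, here is a minimal inference sketch that continues from the snippet above. It assumes a CUDA GPU (since the model is loaded in float16); the COCO image URL and the prompt are illustrative, not from this thread:

```python
import requests
from PIL import Image

# Continues from the loading snippet above; assumes a CUDA GPU
# because the model was loaded in float16.
model.to("cuda")

# Illustrative example image and prompt.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

prompt = "Question: how many cats are there? Answer:"
inputs = processor(images=image, text=prompt, return_tensors="pt").to("cuda", torch.float16)

generated_ids = model.generate(**inputs)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```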
Thanks for flagging. cc @osanseviero do you know why AutoModelForVisualQuestionAnswering is displayed in the "use with Transformers" UI? That class doesn't even exist in the Transformers library.
It's specified in the pipeline_tags file generated in transformers:
https://huggingface.co/datasets/huggingface/transformers-metadata/blob/main/pipeline_tags.json#L57
Actually, it does exist, right here: https://github.com/huggingface/transformers/blob/main/src/transformers/models/auto/modeling_auto.py#L1482
Ok thanks. Actually, I think we should deprecate BLIP and BLIP-2 for that class and move them to the new "image-text-to-text" pipeline, which was added in https://github.com/huggingface/transformers/pull/29572.
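Once BLIP-2 is registered under that tag, usage would look something like this minimal sketch (assuming a Transformers version that ships the image-text-to-text pipeline; the image URL and prompt are illustrative):

```python
from transformers import pipeline

# Minimal sketch: assumes a Transformers version that includes the
# "image-text-to-text" pipeline and that BLIP-2 is registered for it.
pipe = pipeline("image-text-to-text", model="Salesforce/blip2-flan-t5-xl-coco")

# Illustrative inputs, not from this thread.
out = pipe(
    images="http://images.cocodataset.org/val2017/000000039769.jpg",
    text="Question: how many cats are there? Answer:",
)
print(out)
```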