google/pix2struct-textcaps-base · The model_type 'pix2struct' is not recognized. It could be a bleeding edge model, or incorrect

Mar 28, 2023

Thanks for sharing ^^
I am getting this error "The model_type 'pix2struct' is not recognized. It could be a bleeding edge model, or incorrect" when I try any of the checkpoints for inference.

vinay-k12

Mar 30, 2023

Same issue for me too

gaussfer

May 9, 2023

+1 same issue.

Kamaljp

May 9, 2023

Did you try running it locally by loading the model through a pipeline, say in a Google Colab notebook?
The model is only about 1.5GB, so it should load there easily.

It takes four lines of code:

cell1: !pip install -q transformers datasets torch > /dev/null

cell2: from transformers import pipeline

cell3: img2text = pipeline(task='image-to-text',model='google/pix2struct-textcaps-base')

cell4: img2text("/content/your_image.png")

It works. I checked it with a thumbnail from this video:
https://youtu.be/tjrdb8tdXT4
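
If the error still shows up locally, one likely cause is an older transformers install that predates Pix2Struct. Here is a rough sketch that wraps the four cells above in a version check. The 4.28 minimum is my assumption about when Pix2Struct support landed, so please verify it against the release notes. The helper names (`parse_major_minor`, `supports_pix2struct`, `caption_image`) are made up for illustration and are not part of transformers.

```python
import re

# Assumption: Pix2Struct support first shipped in transformers 4.28.
MIN_VERSION = (4, 28)


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '4.28.1' or '4.30.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_pix2struct(version: str) -> bool:
    """Return True if the given transformers version meets the assumed minimum."""
    return parse_major_minor(version) >= MIN_VERSION


def caption_image(image_path: str, model_id: str = "google/pix2struct-textcaps-base") -> str:
    """Caption one image with the same pipeline call as in the cells above."""
    # Imported lazily so the version helpers work without transformers installed.
    import transformers

    if not supports_pix2struct(transformers.__version__):
        raise RuntimeError(
            f"transformers {transformers.__version__} may not know the 'pix2struct' "
            "model type; try: pip install -U transformers"
        )
    img2text = transformers.pipeline(task="image-to-text", model=model_id)
    # The image-to-text pipeline returns a list of dicts keyed by 'generated_text'.
    return img2text(image_path)[0]["generated_text"]
```

Calling `caption_image("/content/your_image.png")` should then either return the caption or tell you to upgrade, instead of failing with the unrecognized model_type message.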