The model_type 'pix2struct' is not recognized. It could be a bleeding edge model, or incorrect
#2, opened by Khaledelsaka
Same issue for me too
+1 same issue.
Did you try running it locally by loading the model through a pipeline, say in a Google Colab notebook?
The model is only about 1.5 GB, so it should load there easily.
It takes four lines of code:
cell1: !pip install -q transformers datasets torch > /dev/null
cell2: from transformers import pipeline
cell3: img2text = pipeline(task='image-to-text', model='google/pix2struct-textcaps-base')
cell4: img2text("/content/your_image.png")
It works; I checked it with a thumbnail of the video below:
https://youtu.be/tjrdb8tdXT4
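If the same "model_type not recognized" error shows up when running locally, one possible cause is an older transformers install. As far as I know, Pix2Struct support first shipped in transformers 4.28, but treat that threshold as an assumption and check the release notes. Below is a minimal sketch that checks the installed version before building the same pipeline as above. The helper names (`parse_major_minor`, `supports_pix2struct`, `caption_image`) are made up for illustration and are not part of the library.

```python
import re
from importlib import metadata

# Assumption: Pix2Struct first shipped in transformers 4.28 (verify in the release notes).
MIN_VERSION = (4, 28)


def parse_major_minor(version):
    """Return (major, minor) from a version string such as '4.28.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_pix2struct(version):
    """True if the given transformers version meets the assumed minimum."""
    return parse_major_minor(version) >= MIN_VERSION


def caption_image(image_path):
    """Hypothetical helper: check the installed version, then caption one image."""
    # Raises importlib.metadata.PackageNotFoundError if transformers is not installed.
    installed = metadata.version("transformers")
    if not supports_pix2struct(installed):
        raise RuntimeError(
            f"transformers {installed} may be too old for pix2struct; "
            "try: pip install -U transformers"
        )
    # Imported lazily so the version check runs before the heavy import.
    from transformers import pipeline

    img2text = pipeline(task="image-to-text", model="google/pix2struct-textcaps-base")
    # The image-to-text pipeline returns a list of dicts with a 'generated_text' key.
    return img2text(image_path)[0]["generated_text"]
```

In Colab this would be called as `caption_image("/content/your_image.png")`, with the same image path as in cell4.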