Pix2Struct SageMaker deployment failing because of task incompatibility

#5
by lalaser1899 - opened

I am using "from sagemaker.huggingface import HuggingFaceModel" and deploying with the following task configuration:

hub = {
'HF_MODEL_ID':'google/pix2struct-docvqa-base',
'HF_TASK': 'visual-question-answering'
}

The deployment succeeds, but the resulting endpoint is unusable for inference. Every request fails with the error <"message": "A header text must be provided for VQA models.">, caused by an incompatibility between "/pix2struct/image_processing_pix2struct.py" and "/pipelines/visual_question_answering.py":

File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line 1109, in __call__
return self.run_single(inputs, preprocess_params, forward_params, postprocess_params)

File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/visual_question_answering.py", line 117, in preprocess
image_features = self.image_processor(images=image, return_tensors=self.framework)

File "/opt/conda/lib/python3.10/site-packages/transformers/image_processing_utils.py", line 458, in __call__
return self.preprocess(images, **kwargs)

File "/opt/conda/lib/python3.10/site-packages/transformers/models/pix2struct/image_processing_pix2struct.py", line 390, in
ValueError: A header text must be provided for VQA models.

The Pix2Struct-specific image processor requires both the image and header_text as inputs, but the standard VQA pipeline passes only the image to the image processor.
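One possible workaround is to bypass the VQA pipeline with a custom inference script that calls Pix2StructProcessor directly, since the processor accepts the question as text and forwards it as the header text. The following is only a sketch, not a confirmed fix: it assumes the model weights are packaged in a model.tar.gz together with this file as code/inference.py, that the container ships requests and Pillow, and that the request payload keeps the "image" (URL) and "question" keys used in this post. The helper names load_image and answer_question are hypothetical; model_fn and predict_fn are the override hooks of the SageMaker Hugging Face inference toolkit.

```python
# inference.py: hypothetical custom handler for a Pix2Struct DocVQA endpoint.
from io import BytesIO


def model_fn(model_dir):
    """Load the processor and model once when the endpoint starts."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

    # Assumption: model_dir contains the unpacked Pix2Struct checkpoint.
    processor = Pix2StructProcessor.from_pretrained(model_dir)
    model = Pix2StructForConditionalGeneration.from_pretrained(model_dir)
    return processor, model


def load_image(url):
    """Download an image from a URL and return it as an RGB PIL image."""
    import requests
    from PIL import Image

    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return Image.open(BytesIO(response.content)).convert("RGB")


def answer_question(processor, model, image, question):
    """Run one question against one image and return the decoded answer."""
    # Passing the question as `text` lets the processor supply the header text
    # that the image processor requires for VQA checkpoints.
    inputs = processor(images=image, text=question, return_tensors="pt")
    output_ids = model.generate(**inputs)
    return processor.decode(output_ids[0], skip_special_tokens=True)


def predict_fn(data, processor_and_model):
    """Handle one request of the form {"image": <url>, "question": <str>}."""
    processor, model = processor_and_model
    image = load_image(data["image"])
    return {"answer": answer_question(processor, model, image, data["question"])}
```

With a handler like this the 'HF_TASK' setting is no longer what drives inference, so the predictor call shown later in this post could keep the same payload shape; the response would be a dictionary with a single "answer" key.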

I am using:
transformers_version='4.28.1',
pytorch_version='2.0.0',
py_version='py310',

and, after deployment, calling the predictor with:

predictor.predict({
"image": "https://9to5mac.com/wp-content/uploads/sites/6/2019/04/Screen-Shot-2019-04-18-at-11.29.01-AM.png?resize=1024,746",
"question": text
})
