What is the input payload for the LLaVA predictor on SageMaker? (KeyError: 'input_ids', AWS SageMaker and LLaVA 1.6)

#3
by mujammil

I am trying to deploy this model on SageMaker with something like this:

import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

sagemaker_session_bucket = None
if sagemaker_session_bucket is None and sagemaker_session is not None:
    sagemaker_session_bucket = sagemaker_session.default_bucket()

hub = {
  'HF_MODEL_ID': 'llava-hf/llava-v1.6-vicuna-13b-hf',  # model_id from hf.co/models
  'HF_TASK': 'image-to-text',                          # pipeline task to use for predictions
  'HF_MODEL_QUANTIZE': 'true'
}
image_uri = "custom_image_uri_with_upgraded_transformers"  # custom DLC with newer transformers
instance_type = "ml.p2.xlarge"

huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
)
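followed by a deploy along these lines to get the `predictor` used below (a sketch of the standard `HuggingFaceModel.deploy` call; the original post does not show the exact arguments):

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
)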

and I am assuming this is how we are going to predict:


image = "https://llava-vl.github.io/static/images/view.jpg"

predictor.predict(data={
    "inputs": image,
    "prompt": "what is this image?"
})

which throws the KeyError: 'input_ids' error visible in CloudWatch. Below is the complete error log:

"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 258, in handle response = self.transform_fn(*([self.model, input_data,
content_type, accept] + self.transform_extra_arg)) File
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 214, in transform_fn predictions = self.predict(*([processed_data, model] +
self.predict_extra_arg)) File
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 178, in predict prediction = model(inputs) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/image_to_text.py",
line 125, in __call__ return super().__call__(images, **kwargs) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line
1243, in __call__ return self.run_single(inputs, preprocess_params,
forward_params, postprocess_params) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line
1250, in run_single model_outputs = self.forward(model_inputs, **forward_params)
File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py",
line 1150, in forward model_outputs = self._forward(model_inputs,
**forward_params) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/image_to_text.py",
line 180, in _forward inputs = model_inputs.pop(self.model.main_input_name) File
"/opt/conda/lib/python3.10/_collections_abc.py", line 962, in pop value =
self[key] File "/opt/conda/lib/python3.10/collections/__init__.py", line 1106,
in __getitem__ raise KeyError(key) KeyError: 'input_ids'

I am assuming I am sending the wrong payload; judging from the traceback, the pipeline's preprocessing never produced an input_ids entry before _forward tried to pop it. I have researched but couldn't find the exact payload, so I might be wrong. Any help would be appreciated.

I also tried this payload:


image = "https://llava-vl.github.io/static/images/view.jpg"

predictor.predict(data={
    "inputs": image,
    "prompt": "[INST] <image>\nWhat is shown in this image? [/INST]"
})

Same error as above.
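For completeness: the inference toolkit normally forwards extra pipeline arguments nested under a "parameters" key rather than at the top level, so a variant like the one below also seemed plausible, though I have not been able to confirm the right schema for this model:

predictor.predict(data={
    "inputs": image,
    "parameters": {
        "prompt": "[INST] <image>\nWhat is shown in this image? [/INST]"
    }
})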

Llava Hugging Face org

Hi, the "image-to-text" task is not yet officially supported on Sagemaker. We plan to move models like LLaVa (and LLaVa-NeXT, Idefics2, PaliGemma,...) to the "image-text-to-text" task, for which APIs are currently being standardized. For now you'll need a custom deployment on SageMaker (a deployement out-of-the-box is not supported yet).

Ah, I see. But I think image-text-to-text is not yet supported, right? I tried with that task and it threw an error saying it is not supported. If it is supported, can you share the documentation for its implementation?

Llava Hugging Face org

@mujammil "image-text-to-text" pipeline is not yet added to transformers. I cannot say about the exact timeline for adding it, probably a couple months
