What is the input payload for the LLaVA predictor on SageMaker? (KeyError: 'input_ids', AWS SageMaker and LLaVA 1.6)

#3
by mujammil

I am trying to deploy this model on SageMaker with something like this:

import sagemaker
from sagemaker.huggingface.model import HuggingFaceModel

sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

sagemaker_session_bucket = None
if sagemaker_session_bucket is None and sagemaker_session is not None:
    sagemaker_session_bucket = sagemaker_session.default_bucket()

hub = {
  'HF_MODEL_ID': 'llava-hf/llava-v1.6-vicuna-13b-hf',  # model_id from hf.co/models
  'HF_TASK': 'image-to-text',                          # pipeline task to use for predictions
  'HF_MODEL_QUANTIZE': 'true'
}
image_uri = "custom_image_uri_with_upgraded_transformers"  # custom DLC with newer transformers
instance_type = "ml.p2.xlarge"

huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
)
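followed by a deploy along these lines to get the `predictor` used below (a sketch of the standard `HuggingFaceModel.deploy` call; the original post does not show the exact arguments):

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
)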

and I am assuming this is how we are going to predict:


image = "https://llava-vl.github.io/static/images/view.jpg"

predictor.predict(data={
    "inputs": image,
    "prompt": "what is this image?"
})

which throws the KeyError: 'input_ids' error visible in CloudWatch. Below is the complete error log:

"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 258, in handle response = self.transform_fn(*([self.model, input_data,
content_type, accept] + self.transform_extra_arg)) File
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 214, in transform_fn predictions = self.predict(*([processed_data, model] +
self.predict_extra_arg)) File
"/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py",
line 178, in predict prediction = model(inputs) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/image_to_text.py",
line 125, in __call__ return super().__call__(images, **kwargs) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line
1243, in __call__ return self.run_single(inputs, preprocess_params,
forward_params, postprocess_params) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py", line
1250, in run_single model_outputs = self.forward(model_inputs, **forward_params)
File "/opt/conda/lib/python3.10/site-packages/transformers/pipelines/base.py",
line 1150, in forward model_outputs = self._forward(model_inputs,
**forward_params) File
"/opt/conda/lib/python3.10/site-packages/transformers/pipelines/image_to_text.py",
line 180, in _forward inputs = model_inputs.pop(self.model.main_input_name) File
"/opt/conda/lib/python3.10/_collections_abc.py", line 962, in pop value =
self[key] File "/opt/conda/lib/python3.10/collections/__init__.py", line 1106,
in __getitem__ raise KeyError(key) KeyError: 'input_ids'

I am assuming I am sending the wrong payload; judging from the traceback, the pipeline's preprocessing never produced an input_ids entry before _forward tried to pop it. I have researched but couldn't find the exact payload, so I might be wrong. Any help would be appreciated.

I also tried this payload:


image = "https://llava-vl.github.io/static/images/view.jpg"

predictor.predict(data={
    "inputs": image,
    "prompt": "[INST] <image>\nWhat is shown in this image? [/INST]"
})

Same error as above.
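For completeness: the inference toolkit normally forwards extra pipeline arguments nested under a "parameters" key rather than at the top level, so a variant like the one below also seemed plausible, though I have not been able to confirm the right schema for this model:

predictor.predict(data={
    "inputs": image,
    "parameters": {
        "prompt": "[INST] <image>\nWhat is shown in this image? [/INST]"
    }
})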

Llava Hugging Face org

Hi, the "image-to-text" task is not yet officially supported on Sagemaker. We plan to move models like LLaVa (and LLaVa-NeXT, Idefics2, PaliGemma,...) to the "image-text-to-text" task, for which APIs are currently being standardized. For now you'll need a custom deployment on SageMaker (a deployement out-of-the-box is not supported yet).

Ah, I see. But I think image-text-to-text is not yet supported, right? I tried with that task and it threw an error saying it is not supported. If it is supported, can you share the documentation for its implementation?

Llava Hugging Face org

@mujammil "image-text-to-text" pipeline is not yet added to transformers. I cannot say about the exact timeline for adding it, probably a couple months
