SageMaker deployment example
#4 by dbonicdev - opened
I've been trying to deploy the model on SageMaker to run some tests on private data.
I'm able to successfully instantiate an inference endpoint, but I'm unable to invoke the predict method.
I've tried using a document-question-answering HF task, but that does not appear to work.
How should the image be preprocessed in order to be used as the context of the inference request?
I've tried converting the image to Base64 and also generating a list of tensors, but neither works.
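For reference, here's roughly how I did the Base64 conversion (the helper name is just mine, and the image format is assumed to be PNG):

```python
import base64
from io import BytesIO

from PIL import Image


def image_to_base64(image: Image.Image, fmt: str = "PNG") -> str:
    """Serialize a PIL image into a Base64 string suitable for a JSON payload."""
    buffer = BytesIO()
    image.save(buffer, format=fmt)       # write the encoded image into memory
    return base64.b64encode(buffer.getvalue()).decode("utf-8")
```

This round-trips cleanly (decoding the string and reopening it with PIL gives the original image back), so the encoding itself doesn't seem to be the problem.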
import sagemaker
import boto3
from sagemaker.huggingface.model import HuggingFaceModel
import requests
from PIL import Image
from io import BytesIO

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub model configuration <https://huggingface.co/models>
hub = {
    'HF_MODEL_ID': 'mPLUG/DocOwl1.5-Chat',    # model_id from hf.co/models
    'HF_TASK': 'document-question-answering'  # NLP task you want to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,                      # configuration for loading model from Hub
    role=role,                    # IAM role with permissions to create an endpoint
    transformers_version="4.26",  # Transformers version used
    pytorch_version="1.13",       # PyTorch version used
    py_version='py39',            # Python version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.2xlarge"
)

image_response = requests.get("<SOME IMAGE URL>")
image = Image.open(BytesIO(image_response.content))

data = {
    "question": "What type of document is this?",
    "context": ...  # WHAT PREPROCESSING IS REQUIRED FOR THE IMAGE???
}

# request
result = predictor.predict(data)
print(result)
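In case it helps to see what I've attempted: the payload shape below follows the document-question-answering examples I've found, where the image goes under an "image" key inside "inputs" (as a URL or Base64 string) instead of a "context" key. The helper name is mine, and I haven't confirmed this shape works for DocOwl specifically:

```python
import base64


def build_dqa_payload(question: str, image_b64: str) -> dict:
    """Build a document-question-answering style payload: the question plus
    the Base64-encoded image bytes, both nested under 'inputs'."""
    return {
        "inputs": {
            "question": question,
            "image": image_b64,  # Base64 string of the raw image bytes
        }
    }


payload = build_dqa_payload(
    "What type of document is this?",
    base64.b64encode(b"<raw image bytes>").decode("utf-8"),
)
# result = predictor.predict(payload)  # requires a live endpoint
```

If someone knows the exact payload format (or whether a custom inference.py is required for this model), that would be much appreciated.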