Error Deploying on SageMaker

#12
by wamozart - opened

Hi,

I followed the "Deploy on SageMaker" SDK code given:
```python
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel

try:
	role = sagemaker.get_execution_role()
except ValueError:
	iam = boto3.client('iam')
	role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
	'HF_MODEL_ID':'llava-hf/llava-v1.6-34b-hf',
	'HF_TASK':'image-text-to-text'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
	transformers_version='4.37.0',
	pytorch_version='2.1.0',
	py_version='py310',
	env=hub,
	role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
	initial_instance_count=1, # number of instances
	instance_type='ml.g4dn.12xlarge' # ec2 instance type
)
```

The endpoint deployed successfully (note that I used a much larger EC2 instance). But when I try to call `predictor.predict` I keep getting:

```
"message": "The checkpoint you are trying to load has model type llava_next but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date."
```
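For what it's worth, the error suggests a version mismatch: the `llava_next` model type was (as far as I can tell from the release notes) only added in transformers v4.39.0, while the container configured above ships v4.37.0. A minimal sketch of the check, with the 4.39.0 threshold being my assumption:

```python
from packaging import version

# Assumption: llava_next support landed in transformers v4.39.0 (per release notes)
LLAVA_NEXT_MIN = version.parse("4.39.0")

def supports_llava_next(transformers_version: str) -> bool:
    """Return True if the given transformers version recognizes llava_next."""
    return version.parse(transformers_version) >= LLAVA_NEXT_MIN

print(supports_llava_next("4.37.0"))  # version pinned in the snippet above
print(supports_llava_next("4.39.0"))
```

If a newer Hugging Face Deep Learning Container is available in your region, passing a higher `transformers_version` to `HuggingFaceModel` (or pointing `image_uri` at a newer container) should let the checkpoint load.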
