Deploy with SageMaker

#15 by Larissa-Stallion

When following the instructions under Deploy --> Amazon SageMaker --> SageMaker SDK --> deploy.py:

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'Snowflake/snowflake-arctic-embed-m-long'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface-tei", version="1.2.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)

# send request
predictor.predict({
    "inputs": "My name is Clara and I am",
})
I receive the error:

UnexpectedStatusException: Error hosting endpoint tei-2024-07-10-22-05-53-662: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.. Try changing the instance type or reference the troubleshooting page https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference-troubleshooting.html
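Those endpoint logs can also be pulled programmatically (a minimal boto3 sketch; the log group name follows the standard /aws/sagemaker/Endpoints/<endpoint-name> convention, with the endpoint name taken from the error above):

import boto3

# Fetch recent log events for the failing endpoint. SageMaker writes
# endpoint logs to /aws/sagemaker/Endpoints/<endpoint-name>.
logs = boto3.client("logs")
response = logs.filter_log_events(
    logGroupName="/aws/sagemaker/Endpoints/tei-2024-07-10-22-05-53-662",
    limit=50,
)
for event in response["events"]:
    print(event["message"])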

Within the CloudWatch logs I found the error:
Error: Model backend is not healthy
Caused by:
unexpected rank, expected: 2, got: 1 ([768])

I was able to successfully create a SageMaker Endpoint for Snowflake/snowflake-arctic-embed-l, but require this long-context variant. Please let me know how to overcome this error.
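As a sanity check that the weights themselves are fine, the model can be loaded locally the way the model card shows (a sketch assuming sentence-transformers is installed; if this works, the failure is specific to the TEI container, which appears to expect a 2-D [batch, hidden] output rather than the 1-D [768] tensor the log mentions):

from sentence_transformers import SentenceTransformer

# Mirror the model card usage; the repo's custom model code must be trusted.
model = SentenceTransformer(
    "Snowflake/snowflake-arctic-embed-m-long", trust_remote_code=True
)
embeddings = model.encode(["My name is Clara and I am"])
print(embeddings.shape)  # one 768-dimensional vector -> (1, 768)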


I installed the model locally and modified the config.json by adding "trust_remote_code": true:

    "torch_dtype": "float32",
    "transformers_version": "4.36.1",
    "trust_remote_code": true,
    "type_vocab_size": 2,
    "use_cache": true,
    "use_flash_attn": true,
    "use_rms_norm": false,
    "use_xentropy": true,
    "vocab_size": 30528
}
I then compressed it into a tar.gz following the instructions here: https://huggingface.co/docs/sagemaker/inference#create-a-model-artifact-for-deployment
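That preparation step can be scripted (a sketch assuming huggingface_hub is installed; the local directory name is arbitrary, and the archive layout with all files at the top level follows the linked instructions):

import json
import tarfile
from pathlib import Path
from huggingface_hub import snapshot_download

# Download the model, flip trust_remote_code in config.json, and pack
# everything into a flat model.tar.gz for SageMaker.
local_dir = Path(snapshot_download(
    "Snowflake/snowflake-arctic-embed-m-long", local_dir="model"
))

config_path = local_dir / "config.json"
config = json.loads(config_path.read_text())
config["trust_remote_code"] = True
config_path.write_text(json.dumps(config, indent=2))

with tarfile.open("snowflake-arctic-embed-m-long-config-mod.tar.gz", "w:gz") as tar:
    for file in local_dir.iterdir():
        if not file.name.startswith("."):  # skip hub cache metadata
            tar.add(file, arcname=file.name)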

I was able to create the SageMaker endpoint:

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

trust_remote_code = True

hub = {
    'HF_MODEL_ID': 'Snowflake/snowflake-arctic-embed-m-long',
    'HF_TASK': 'feature-extraction',
    'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(trust_remote_code)
}

huggingface_model = HuggingFaceModel(
    model_data="s3://sagemaker-us-gov-west-1-077510649301/huggingface-models/snowflake-arctic-embed-m-long-config-mod.tar.gz",  # path to your trained SageMaker model
    role=role,                    # IAM role with permissions to create an endpoint
    transformers_version="4.26",  # Transformers version used
    pytorch_version="1.13",       # PyTorch version used
    py_version='py39',            # Python version used
    env=hub,
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    endpoint_name="snowflake-arctic-embed-m-long",
)

However, I get a trust_remote_code error when invoking the endpoint:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "Loading /.sagemaker/mms/models/Snowflake__snowflake-arctic-embed-m-long requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error."
}
". See https://us-gov-west-1.console.aws.amazon.com/cloudwatch/home?region=us-gov-west-1#logEventViewer:group=/aws/sagemaker/Endpoints/iproposal-sandbox-embedding-snowflake-arctic-embed-m-long in account 077510649301 for more information.
