
Deploying bigcode/santacoder on sagemaker gives an error

#28
by kanandk - opened

It gives the following error:

An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "Loading /.sagemaker/mms/models/bigcode__santacoder requires you to execute the configuration file in that repo on your local machine. Make sure you have read the code there to avoid malicious use, then set the option trust_remote_code=True to remove this error."
}
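For context, this error comes from Transformers refusing to execute the custom modeling code shipped in the santacoder repo unless the caller explicitly opts in. The same opt-in is needed when loading the model locally; a minimal sketch (the wrapper function is hypothetical, and it is only defined here, not called, since the actual load downloads the model):

```python
def load_santacoder():
    """Load bigcode/santacoder locally, opting in to its custom modeling code.

    Requires transformers and network access; defined for illustration only.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bigcode/santacoder")
    model = AutoModelForCausalLM.from_pretrained(
        "bigcode/santacoder",
        trust_remote_code=True,  # the opt-in the error message asks for
    )
    return tokenizer, model
```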

Source:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'bigcode/santacoder',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

print("Deploying...")

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,  # number of instances
    instance_type='ml.p3.2xlarge',  # ec2 instance type
    trust_remote_code=True
)

And then:

predictor.predict({
    "inputs": "function helloWorld "
})

At this point it fails with the error above.
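One workaround commonly used for this class of error (an assumption here, not verified against santacoder) is to bundle a custom inference.py with the model instead of relying only on the env-based Hub configuration, so that the pipeline itself is created with trust_remote_code=True. The handler names model_fn/predict_fn are the SageMaker Hugging Face inference toolkit's standard hooks; everything else below is a sketch:

```python
# code/inference.py -- sketch of a custom handler that enables remote code.
# model_fn / predict_fn are the SageMaker HF toolkit's standard entry points;
# the pipeline arguments are illustrative assumptions.

def model_fn(model_dir):
    """Build a text-generation pipeline that trusts the repo's custom code."""
    from transformers import pipeline

    return pipeline(
        "text-generation",
        model="bigcode/santacoder",
        trust_remote_code=True,  # the opt-in the error message asks for
    )

def predict_fn(data, pipe):
    """Run generation on the incoming JSON payload."""
    return pipe(data["inputs"])
```

The functions are only defined here (loading the model requires transformers and network access); packaging them under code/ in the model archive is the toolkit's documented way to override the default handler.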
