Not working in TGI
#3
by angeligareta - opened
Has anyone managed to get this model working by deploying it to SageMaker with a HuggingFace Model? I am not able to deploy it; any help on which configuration to use would be welcome. I have tried

config = { "HF_MODEL_ID": "01-ai/Yi-34B-Chat-4bits" }

and

config = { "HF_MODEL_ID": "01-ai/Yi-34B-Chat-4bits", "QUANTIZE": "awq" }
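For reference, a minimal sketch of how such a config is typically passed to the SageMaker SDK, using the managed TGI container resolved via get_huggingface_llm_image_uri (the image version and execution role here are assumptions, not values confirmed in this thread):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes this runs with a SageMaker execution role

config = {"HF_MODEL_ID": "01-ai/Yi-34B-Chat-4bits", "QUANTIZE": "awq"}

# resolve the managed TGI (text-generation-inference) container image; version is an assumption
llm_image = get_huggingface_llm_image_uri("huggingface", version="1.1.0")

huggingface_model = HuggingFaceModel(image_uri=llm_image, env=config, role=role)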
This was an error from SageMaker. A workaround is to build your own Docker image with TGI, using a Dockerfile like this:
FROM ghcr.io/huggingface/text-generation-inference:1.1.0
COPY sagemaker-entrypoint.sh entrypoint.sh
RUN chmod +x entrypoint.sh
ENTRYPOINT ["./entrypoint.sh"]
Then build the image, push it to ECR, and pass that image_uri to the HuggingFaceModel:
from sagemaker.huggingface import HuggingFaceModel

# hub is the env config from above, e.g. {"HF_MODEL_ID": "01-ai/Yi-34B-Chat-4bits", "QUANTIZE": "awq"}
huggingface_model = HuggingFaceModel(
    image_uri=custom_image_uri,  # the custom TGI image pushed to ECR
    env=hub,
    role=role,
)
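From there, a minimal deployment sketch (the instance type and timeout are illustrative assumptions, not values from this thread):

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.48xlarge",  # placeholder; pick an instance with enough GPU memory for Yi-34B
    container_startup_health_check_timeout=600,  # give TGI time to download and load the weights
)

print(predictor.predict({
    "inputs": "Hello, who are you?",
    "parameters": {"max_new_tokens": 128},
}))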
angeligareta changed discussion status to closed