Deployment on Sagemaker endpoint with text generation inference container does not work

#30
by AnOtterDeveloper - opened

I get the following error when I try to deploy the model on Sagemaker:

Could not convert model weights to safetensors: Error while trying to find names to remove to save state dict, but found no suitable name to keep for saving amongst: {'model.norm.weight'}. None is covering the entire storage.Refusing to save/load the model since you could be storing much more memory than needed. Please refer to https://huggingface.co/docs/safetensors/torch_shared_tensors for more information. Or open an issue.

Apparently this is because shared tensors are not supported in safetensors.
Is there any way around it?
Thanks

I am also having this issue. Anyone able to solve this?

Edit 1: I found this link saying that it does not work with TGI 1.0.3 - https://discuss.huggingface.co/t/mistral-ai-sagemaker-deployment-failing/57379/2

Edit 2: Updating to 1.1.0 worked. If you are using the AWS public ECR, they have 1.1.0 on there. (e.g. 763104351884.dkr.ecr.eu-west-1.amazonaws.com/huggingface-pytorch-tgi-inference:2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04)
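For anyone pointing the SageMaker Python SDK at a specific TGI version, here is a minimal sketch that builds the Deep Learning Container URI for any region. The account ID and tag layout are taken from the URI above; the helper name `tgi_image_uri` is just for illustration, and you would pass the result as `image_uri` to `sagemaker.huggingface.HuggingFaceModel`.

```python
# Build the AWS Deep Learning Container URI for the HuggingFace TGI image.
# Account ID 763104351884 and the tag scheme are copied from the working
# 1.1.0 URI in this thread; adjust versions if AWS publishes new images.

def tgi_image_uri(region: str,
                  pytorch_version: str = "2.0.1",
                  tgi_version: str = "1.1.0") -> str:
    """Return the ECR URI for the huggingface-pytorch-tgi-inference image."""
    return (
        f"763104351884.dkr.ecr.{region}.amazonaws.com/"
        f"huggingface-pytorch-tgi-inference:"
        f"{pytorch_version}-tgi{tgi_version}-gpu-py39-cu118-ubuntu20.04"
    )

# e.g. pass tgi_image_uri("eu-west-1") as image_uri= to HuggingFaceModel
print(tgi_image_uri("eu-west-1"))
```

This avoids hardcoding the region into the URI when you deploy the same model in multiple regions.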
