Does it run on a CPU instance in SageMaker (ml.m5.2xlarge)?

#2 opened by arviii

Hey, I am trying to deploy this model on a CPU instance (ml.m5.2xlarge) on SageMaker, but it runs out of storage. The best way to resolve this seems to be mounting a larger storage volume (EBS, I suppose).

To do so, I should ideally pass volume_size=80 in the huggingface_model.deploy() parameters. But that doesn't seem to work in my case; it still throws the same error about storage running out.

Model: https://huggingface.co/NumbersStation/nsql-llama-2-7B
Instance: ml.m5.2xlarge (it works perfectly fine on ml.g5.2xlarge)

error: "Error: Download 
Error safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 28, kind: StorageFull, message: ""No space left on device"" })"
code:
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    # instance_type="ml.g5.2xlarge",
    instance_type="ml.m5.2xlarge",
    container_startup_health_check_timeout=300,
    volume_size=80,
)
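
For context, huggingface_model above is created more or less like this (roughly the standard Hugging Face LLM container pattern on SageMaker; the env values below are placeholders rather than my exact setup):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# Hugging Face LLM (TGI) inference container image
llm_image = get_huggingface_llm_image_uri("huggingface")

huggingface_model = HuggingFaceModel(
    role=role,
    image_uri=llm_image,
    env={
        "HF_MODEL_ID": "NumbersStation/nsql-llama-2-7B",  # weights are pulled from the Hub at startup
        "MAX_INPUT_LENGTH": "2048",   # placeholder values
        "MAX_TOTAL_TOKENS": "4096",
    },
)

This mirrors the setup that works on ml.g5.2xlarge; only instance_type and volume_size change for the CPU attempt.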

The rest of the code is fine; the model deploys successfully on ml.g5.2xlarge.

NumbersStation org

Thank you for sharing this information! It will be helpful for others who are interested in deploying on SageMaker.
