Error deploying Aguila on AWS SageMaker

#1
by cnicu - opened

Hello!

I'm trying to deploy this model on AWS SageMaker by following the steps provided in the documentation. However, I'm encountering some errors during the endpoint creation process. I've double-checked my configurations, but the issues persist.

If anyone has experience deploying this model on AWS SageMaker or any insights into resolving similar errors, I'd greatly appreciate your help. Thanks in advance for any assistance you can offer!
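For reference, this is roughly the setup I am following from the docs (a sketch only; the role, instance type, and environment values below are placeholders rather than my exact configuration):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # placeholder: SageMaker execution role

# Hugging Face LLM (text-generation-inference) container image
image_uri = get_huggingface_llm_image_uri("huggingface")

model = HuggingFaceModel(
    role=role,
    image_uri=image_uri,
    env={
        "HF_MODEL_ID": "projecte-aina/aguila-7b",  # assumed model id
        "SM_NUM_GPUS": "1",
        "MAX_INPUT_LENGTH": "1024",
        "MAX_TOTAL_TOKENS": "2048",
    },
)

# Endpoint creation is the step where the errors below appear
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # placeholder instance type
    container_startup_health_check_timeout=600,
)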

Errors:

Error: DownloadError
    utils.convert_files(local_pt_files, local_st_files)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 84, in convert_files
    convert_file(pt_file, sf_file)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 62, in convert_file
    save_file(pt_state, str(sf_file), metadata={
    "format": "pt"
})
  File "/opt/conda/lib/python3.9/site-packages/safetensors/torch.py", line 232, in save_file
    serialize_file(_flatten(tensors), filename, metadata=metadata)
  File "/opt/conda/lib/python3.9/site-packages/safetensors/torch.py", line 394, in _flatten
    raise RuntimeError(

And the following error:

RuntimeError: 
            Some tensors share memory, this will lead to duplicate memory on disk and potential differences when loading them again: [{'transformer.h.6.mlp.dense_4h_to_h.weight', 'transformer.h.26.mlp.dense_4h_to_h.weight', 'transformer.h.4.self_attention.query_key_value.weight', 'transformer.h.22.mlp.dense_h_to_4h.weight', 'transformer.h.23.mlp.dense_h_to_4h.weight', 'transformer.h.5.mlp.dense_h_to_4h.weight', 'transformer.h.25.mlp.dense_h_to_4h.weight', 'transformer.h.25.mlp.dense_4h_to_h.weight', 'transformer.h.0.mlp.dense_4h_to_h.weight', 'transformer.h.11.mlp.dense_h_to_4h.weight', 'transformer.h.29.self_attention.dense.weight', 'transformer.h.24.self_attention.query_key_value.weight', 'transformer.h.24.mlp.dense_h_to_4h.weight', 'transformer.h.14.mlp.dense_4h_to_h.weight', 'transformer.h.1.self_attention.dense.weight', 'transformer.h.13.mlp.dense_4h_to_h.weight', 'transformer.h.8.self_attention.query_key_value.weight', 'transformer.h.20.self_attention.query_key_value.weight', 'transformer.h.27.mlp.dense_h_to_4h.weight', 'transformer.h.22.self_attention.query_key_value.weight', 'transformer.h.11.self_attention.query_key_value.weight', 'transformer.h.23.self_attention.query_key_value.weight', 'transformer.h.13.self_attention.dense.weight', 'transformer.h.15.mlp.dense_h_to_4h.weight', 'transformer.h.9.mlp.dense_h_to_4h.weight', 'transformer.h.15.self_attention.query_key_value.weight', 'transformer.h.24.mlp.dense_4h_to_h.weight', 'transformer.h.31.self_attention.query_key_value.weight', 'transformer.h.7.self_attention.dense.weight', 'transformer.h.27.self_attention.query_key_value.weight', 'transformer.h.1.mlp.dense_h_to_4h.weight', 'transformer.h.21.mlp.dense_4h_to_h.weight', 'transformer.h.24.self_attention.dense.weight', 'transformer.h.16.mlp.dense_4h_to_h.weight', 'transformer.h.20.mlp.dense_4h_to_h.weight', 'transformer.h.27.self_attention.dense.weight', 'transformer.h.4.mlp.dense_4h_to_h.weight', 'transformer.h.3.mlp.dense_h_to_4h.weight', 'transformer.h.25.self_attention.dense.weight', 'transformer.h.7.mlp.dense_4h_to_h.weight', 'transformer.h.17.self_attention.query_key_value.weight', 'transformer.h.19.self_attention.dense.weight', 'transformer.h.12.self_attention.query_key_value.weight', 'transformer.h.3.self_attention.dense.weight', 'transformer.h.28.mlp.dense_h_to_4h.weight', 'transformer.h.19.mlp.dense_h_to_4h.weight', 'transformer.h.20.self_attention.dense.weight', 'transformer.h.14.self_attention.query_key_value.weight', 'transformer.h.21.mlp.dense_h_to_4h.weight', 'transformer.h.12.mlp.dense_h_to_4h.weight', 'transformer.h.29.mlp.dense_h_to_4h.weight', 'transformer.h.6.mlp.dense_h_to_4h.weight', 'transformer.h.14.mlp.dense_h_to_4h.weight', 'transformer.h.30.self_attention.dense.weight', 'transformer.h.10.self_attention.query_key_value.weight', 'transformer.h.6.self_attention.query_key_value.weight', 'transformer.h.10.mlp.dense_4h_to_h.weight', 'transformer.h.23.mlp.dense_4h_to_h.weight', 'transformer.h.21.self_attention.query_key_value.weight', 'transformer.h.30.self_attention.query_key_value.weight', 'transformer.h.8.mlp.dense_h_to_4h.weight', 'transformer.h.30.mlp.dense_h_to_4h.weight', 'transformer.h.18.self_attention.query_key_value.weight', 'transformer.h.5.mlp.dense_4h_to_h.weight', 'transformer.h.15.mlp.dense_4h_to_h.weight', 'transformer.h.26.self_attention.dense.weight', 'transformer.h.9.self_attention.query_key_value.weight', 'transformer.h.17.mlp.dense_4h_to_h.weight', 'transformer.h.10.mlp.dense_h_to_4h.weight', 'transformer.h.6.self_attention.dense.weight', 
'transformer.h.2.mlp.dense_4h_to_h.weight', 'transformer.h.5.self_attention.dense.weight', 'transformer.h.9.mlp.dense_4h_to_h.weight', 'transformer.h.3.mlp.dense_4h_to_h.weight', 'transformer.h.17.mlp.dense_h_to_4h.weight', 'transformer.h.27.mlp.dense_4h_to_h.weight', 'transformer.h.29.self_attention.query_key_value.weight', 'transformer.h.5.self_attention.query_key_value.weight', 'transformer.h.11.self_attention.dense.weight', 'transformer.h.19.mlp.dense_4h_to_h.weight', 'transformer.h.16.mlp.dense_h_to_4h.weight', 'transformer.h.8.mlp.dense_4h_to_h.weight', 'transformer.h.30.mlp.dense_4h_to_h.weight', 'transformer.h.31.mlp.dense_h_to_4h.weight', 'transformer.h.1.mlp.dense_4h_to_h.weight', 'transformer.h.28.self_attention.dense.weight', 'transformer.h.22.mlp.dense_4h_to_h.weight', 'transformer.h.31.self_attention.dense.weight', 'transformer.h.4.mlp.dense_h_to_4h.weight', 'transformer.h.19.self_attention.query_key_value.weight', 'transformer.h.0.self_attention.dense.weight', 'transformer.h.1.self_attention.query_key_value.weight', 'transformer.h.17.self_attention.dense.weight', 'transformer.h.18.self_attention.dense.weight', 'transformer.h.23.self_attention.dense.weight', 'transformer.h.28.self_attention.query_key_value.weight', 'transformer.h.12.mlp.dense_4h_to_h.weight', 'transformer.h.16.self_attention.query_key_value.weight', 'transformer.h.22.self_attention.dense.weight', 'transformer.h.18.mlp.dense_4h_to_h.weight', 'transformer.h.2.self_attention.query_key_value.weight', 'transformer.h.18.mlp.dense_h_to_4h.weight', 'transformer.h.8.self_attention.dense.weight', 'transformer.h.12.self_attention.dense.weight', 'transformer.h.29.mlp.dense_4h_to_h.weight', 'transformer.h.10.self_attention.dense.weight', 'transformer.h.26.mlp.dense_h_to_4h.weight', 'transformer.h.31.mlp.dense_4h_to_h.weight', 'transformer.h.3.self_attention.query_key_value.weight', 'transformer.h.16.self_attention.dense.weight', 'transformer.h.9.self_attention.dense.weight', 'transformer.h.21.self_attention.dense.weight', 'transformer.h.0.self_attention.query_key_value.weight', 'transformer.h.28.mlp.dense_4h_to_h.weight', 'transformer.word_embeddings.weight', 'transformer.h.0.mlp.dense_h_to_4h.weight', 'transformer.h.4.self_attention.dense.weight', 'transformer.h.13.self_attention.query_key_value.weight', 'transformer.h.7.mlp.dense_h_to_4h.weight', 'transformer.h.2.mlp.dense_h_to_4h.weight', 'transformer.h.7.self_attention.query_key_value.weight', 'transformer.h.11.mlp.dense_4h_to_h.weight', 'transformer.h.2.self_attention.dense.weight', 'transformer.h.13.mlp.dense_h_to_4h.weight', 'transformer.h.14.self_attention.dense.weight', 'transformer.h.15.self_attention.dense.weight', 'transformer.h.25.self_attention.query_key_value.weight', 'transformer.h.26.self_attention.query_key_value.weight', 'transformer.h.20.mlp.dense_h_to_4h.weight'}, {'transformer.h.22.input_layernorm.bias', 'transformer.h.17.input_layernorm.bias', 'transformer.h.20.input_layernorm.weight', 'transformer.h.20.input_layernorm.bias', 'transformer.h.1.input_layernorm.bias', 'transformer.h.18.input_layernorm.bias', 'transformer.h.28.input_layernorm.bias', 'transformer.h.7.input_layernorm.bias', 'transformer.h.5.input_layernorm.weight', 'transformer.h.8.input_layernorm.weight', 'transformer.h.0.input_layernorm.weight', 'transformer.h.9.input_layernorm.bias', 'transformer.h.12.input_layernorm.weight', 'transformer.h.19.input_layernorm.weight', 'transformer.h.30.input_layernorm.bias', 'transformer.h.31.input_layernorm.weight', 
'transformer.h.6.input_layernorm.bias', 'transformer.h.7.input_layernorm.weight', 'transformer.h.6.input_layernorm.weight', 'transformer.ln_f.weight', 'transformer.h.5.input_layernorm.bias', 'transformer.h.13.input_layernorm.weight', 'transformer.h.13.input_layernorm.bias', 'transformer.h.30.input_layernorm.weight', 'transformer.h.19.input_layernorm.bias', 'transformer.h.18.input_layernorm.weight', 'transformer.h.16.input_layernorm.bias', 'transformer.h.27.input_layernorm.bias', 'transformer.h.21.input_layernorm.weight', 'transformer.h.14.input_layernorm.weight', 'transformer.h.16.input_layernorm.weight', 'transformer.h.10.input_layernorm.bias', 'transformer.h.25.input_layernorm.bias', 'transformer.h.23.input_layernorm.bias', 'transformer.h.29.input_layernorm.weight', 'transformer.h.11.input_layernorm.weight', 'transformer.h.3.input_layernorm.bias', 'transformer.ln_f.bias', 'transformer.h.22.input_layernorm.weight', 'transformer.h.28.input_layernorm.weight', 'transformer.h.0.input_layernorm.bias', 'transformer.h.1.input_layernorm.weight', 'transformer.h.14.input_layernorm.bias', 'transformer.h.24.input_layernorm.bias', 'transformer.h.8.input_layernorm.bias', 'transformer.h.21.input_layernorm.bias', 'transformer.h.10.input_layernorm.weight', 'transformer.h.12.input_layernorm.bias', 'transformer.h.27.input_layernorm.weight', 'transformer.h.31.input_layernorm.bias', 'transformer.h.11.input_layernorm.bias', 'transformer.h.23.input_layernorm.weight', 'transformer.h.26.input_layernorm.bias', 'transformer.h.29.input_layernorm.bias', 'transformer.h.15.input_layernorm.bias', 'transformer.h.2.input_layernorm.weight', 'transformer.h.2.input_layernorm.bias', 'transformer.h.24.input_layernorm.weight', 'transformer.h.4.input_layernorm.weight', 'transformer.h.26.input_layernorm.weight', 'transformer.h.15.input_layernorm.weight', 'transformer.h.4.input_layernorm.bias', 'transformer.h.9.input_layernorm.weight', 'transformer.h.25.input_layernorm.weight', 'transformer.h.3.input_layernorm.weight', 'transformer.h.17.input_layernorm.weight'}].
            A potential way to correctly save your model is to use `save_model`.
            More information at https://huggingface.co/docs/safetensors/torch_shared_tensors
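From the message, it looks like the on-the-fly conversion from the PyTorch checkpoint to safetensors inside the container is what fails, because some tensors share memory. My understanding of the `save_model` workaround the error points to is something like this (a rough sketch; the model id, paths, and the need for trust_remote_code are assumptions, and I have not verified it fixes the container error):

from transformers import AutoModelForCausalLM
from safetensors.torch import save_model

# Load the checkpoint locally (assumes enough RAM/disk for the full model)
model = AutoModelForCausalLM.from_pretrained(
    "projecte-aina/aguila-7b",  # assumed model id
    trust_remote_code=True,     # may or may not be required for this model
)

# Option 1: let transformers write safetensors weights, deduplicating shared tensors
model.save_pretrained("./aguila-7b-safetensors", safe_serialization=True)

# Option 2: use safetensors' save_model, which handles shared/aliased tensors
# instead of raising the RuntimeError above
save_model(model, "./aguila-7b-safetensors/model.safetensors")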
Projecte Aina org
edited Jul 19, 2023

Hi!

The error is related to safetensors. We have uploaded the safetensors weights, but we are still having some trouble with the text-generation-inference container, so it might still fail.

Sorry for the inconvenience; we hope to fix it soon.

Projecte Aina org

Hi!

Everything should work now :-)
We have tested it with v0.9.3 of text-generation-inference.
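If you deploy through the SageMaker Hugging Face LLM container, pinning the image to that version should look roughly like this (a sketch; the version strings available in your SDK and region may differ):

from sagemaker.huggingface import get_huggingface_llm_image_uri

# Request the Hugging Face LLM (text-generation-inference) container
# that ships TGI v0.9.3, then pass it as image_uri to HuggingFaceModel
image_uri = get_huggingface_llm_image_uri("huggingface", version="0.9.3")
print(image_uri)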

Sorry for the delay,
Best regards,
Joan

joanllop changed discussion status to closed
