SageMaker deployment error

#4
by Hullabaloo - opened

How do I import the tokenizer when deploying the model on SageMaker?

ValueError: Tokenizer class SEABPETokenizer does not exist or is not currently imported.

AI Singapore org

Hi,

Thank you for your interest in SEA-LION.
The SEA-LION tokenizer and model requires code execution and therefore requires the trust_remote_code to be set to True.

I do not have much experience with SageMaker; however, there is a passage in the documentation which might be relevant:
https://sagemaker.readthedocs.io/en/stable/overview.html#deploy-foundation-models-to-sagemaker-endpoints

For gated models on Hugging Face Hub, request access and pass the associated key as the environment variable HUGGING_FACE_HUB_TOKEN. Some Hugging Face models may require trusting of remote code, so set HF_TRUST_REMOTE_CODE as an environment variable.

Could you kindly set the HF_TRUST_REMOTE_CODE environment variable to True and see if this fixes your issue?
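For reference, a minimal sketch of what I mean, following the usual SageMaker Hugging Face hub-config pattern (the GPU count here is a placeholder to adapt):

```python
import json

# Hypothetical hub configuration: these key/value pairs are passed to the
# inference container as environment variables via HuggingFaceModel(env=hub).
hub = {
    "HF_MODEL_ID": "aisingapore/sealion7b-instruct-nc",
    "SM_NUM_GPUS": json.dumps(1),              # placeholder GPU count
    "HF_TRUST_REMOTE_CODE": json.dumps(True),  # serializes to the string "true"
}
```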
Thank you
Raymond

Hi Raymond,

Thank you for your quick support on my query!

I have tried your suggestion and followed the documentation you provided, but the SageMaker endpoint for SEA-LION still doesn't seem to come up. Perhaps these logs might be useful? It looks like trust_remote_code is still false...

This is my model hub configuration:

hub = {
  'HF_MODEL_ID': 'aisingapore/sealion7b-instruct-nc',
  'SM_NUM_GPUS': json.dumps(1),
  'HF_TRUST_REMOTE_CODE': "True",
  'trust_remote_code': "True",
  'HUGGING_FACE_HUB_TOKEN': "read token"
}

These are the logs from SageMaker:

2024-02-22T03:06:04.686906Z INFO text_generation_launcher: Args { model_id: "aisingapore/sealion7b-instruct-nc", revision: None, validation_workers: 2, sharded: None, num_shard: Some(1), quantize: None, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_top_n_tokens: 5, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, hostname: "container-0.local", port: 8080, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/tmp"), weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, rope_scaling: None, rope_factor: None, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false }

2024-02-22T03:07:10.282868Z INFO text_generation_launcher: Downloaded /tmp/models--aisingapore--sealion7b-instruct-nc/snapshots/eaf0b7163f8a4ce80cb2a2c8e6118f3e571f77e1/model-00002-of-00002.safetensors in 0:00:33.

2024-02-22T03:07:16.527671Z ERROR text_generation_launcher: Error when initializing model

ValueError: Tokenizer class SEABPETokenizer does not exist or is not currently imported.

Error: ShardCannotStart

Hope we'll be able to get this working...

Thanks!

Hullabaloo changed discussion status to closed
Hullabaloo changed discussion status to open
AI Singapore org

Hi,

My apologies for the late follow up.
Please find attached a sample notebook on how to deploy SEA-LION on SageMaker:
https://drive.google.com/file/d/1FLTiUGbK519N0EHQArMd0wILzgFzj6Mp/view?usp=sharing

The model hub config should be:

hub = {
  'HF_MODEL_ID':'aisingapore/sealion7b-instruct-nc',
  'SM_NUM_GPUS': json.dumps(4),
  'HF_MODEL_TRUST_REMOTE_CODE': json.dumps(True),
}
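To put that config in context, here is a sketch of the full deployment call, assuming the standard Hugging Face LLM container workflow (the instance type, timeout, and prompt below are placeholders to adapt for your account):

```python
import json
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# Execution role: resolved automatically inside SageMaker notebooks,
# otherwise substitute your IAM role ARN.
role = sagemaker.get_execution_role()

# The key setting: HF_MODEL_TRUST_REMOTE_CODE enables the model's custom
# SEABPETokenizer code inside the TGI container.
hub = {
    "HF_MODEL_ID": "aisingapore/sealion7b-instruct-nc",
    "SM_NUM_GPUS": json.dumps(4),
    "HF_MODEL_TRUST_REMOTE_CODE": json.dumps(True),
}

model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface"),
    env=hub,
    role=role,
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # placeholder: any instance with 4 GPUs
    container_startup_health_check_timeout=600,  # large models load slowly
)

# Simple smoke test against the endpoint.
print(predictor.predict({"inputs": "Hello"}))
```

If the variable is picked up correctly, the launcher log should show trust_remote_code: true in the Args line instead of false.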

Hope this helps,
Raymond
