
Error code 400 when deploying huggingface bigscience/bloom to SageMaker

#93
by kanikarphan - opened

Using the sample code below:
[screenshot of the sample deployment code, not reproduced]
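The screenshot itself is not recoverable, but the standard SageMaker Hugging Face deployment pattern it most likely shows is sketched below. Everything here is an assumption, not the poster's actual code: the role lookup, instance type, and container versions are illustrative placeholders.

```python
# Sketch of the usual SageMaker Hugging Face deployment flow (assumed, not
# the original screenshot). Requires AWS credentials and the sagemaker SDK.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Environment variables telling the inference container which Hub model to load
hub = {
    "HF_MODEL_ID": "bigscience/bloom",
    "HF_TASK": "text-generation",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.17.0",  # the latest DLC version at the time (assumed)
    pytorch_version="1.10.2",
    py_version="py38",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",  # illustrative; BLOOM-176B needs far more memory
)

predictor.predict({"inputs": "Hello, my name is"})
```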

I'm getting the following error:

{
  "code": 400,
  "type": "InternalServerException",
  "message": "\u0027bloom\u0027"
}

Any ideas on what could be causing this issue?

kanikarphan changed discussion status to closed

@kanikarphan , I'm seeing exactly the same output, how did you overcome the problem?

@yyb53 it's because BLOOM requires transformers version 4.21.0, but the inference containers on offer only support up to version 4.17.0. I ended up not using SageMaker. I went with a serverless approach and leveraged our custom container, which has transformers 4.21.0 installed. Even though I got the BLOOM model working, it is so large it's practically unusable: running any inference is unbearably slow.
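The bare `"'bloom'"` message in the 400 response is consistent with this version mismatch: transformers releases before 4.21.0 have no `"bloom"` entry in the model-type registry that `AutoConfig` consults, so loading the model raises a `KeyError`, and `str()` of a `KeyError` is just the quoted key. A minimal sketch of that mechanism (the registry contents here are an illustrative subset, not the real mapping):

```python
# Illustrative subset of a model-type registry like the one transformers'
# AutoConfig uses; versions before 4.21.0 had no "bloom" entry.
CONFIG_MAPPING = {
    "bert": "BertConfig",
    "gpt2": "GPT2Config",
}

def lookup(model_type):
    # An unknown model type raises KeyError; str(KeyError) is the quoted key,
    # which is all that surfaces in the error response body.
    try:
        return CONFIG_MAPPING[model_type]
    except KeyError as err:
        return str(err)

print(lookup("bloom"))  # prints 'bloom' (with the quotes), matching the 400 message
```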
