falcon-40b-instruct error on Inference endpoint while deploying

#90
by digitalsanjeev - opened

I am getting the error below while trying to deploy falcon-40b-instruct on an Inference Endpoint (Nvidia L40S · 4x GPUs · 192 GB):

Exit code: 1. Reason: │ │\n[rank2]: │ │ prefix = 'transformer.h.0' │ │\n[rank2]: │ │ self = FlashRWLayerNorm() │ │\n[rank2]: │ │ weights = <text_generation_server.utils.weights.Weights object at │ │\n[rank2]: │ │ 0x7f114d74a050> │ │\n[rank2]: │ ╰──────────────────────────────────────────────────────────────────────────╯ │\n[rank2]: ╰──────────────────────────────────────────────────────────────────────────────╯\n[rank2]: ValueError: Number of layer norms can either be 1 or 2."},"target":"text_generation_launcher","span":{"rank":2,"name":"shard-manager"},"spans":[{"rank":2,"name":"shard-manager"}]}
{"timestamp":"2024-12-11T19:01:39.518896Z","level":"INFO","fields":{"message":"Terminating shard"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
{"timestamp":"2024-12-11T19:01:39.518921Z","level":"INFO","fields":{"message":"Waiting for shard to gracefully shutdown"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
{"timestamp":"2024-12-11T19:01:39.712681Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2024-12-11T19:01:39.719146Z","level":"INFO","fields":{"message":"shard terminated"},"target":"text_generation_launcher","span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}
Error: ShardCannotStart
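
In case it helps anyone reading logs like these, here is a minimal sketch (not part of text-generation-inference) that pulls the timestamp, level, shard rank, and message out of each JSON log line. The function name `summarize_launcher_logs` is hypothetical, and the field names (`timestamp`, `level`, `fields.message`, `span.rank`) are assumed from the lines pasted above. Lines that are not complete JSON, such as the truncated traceback record or `Error: ShardCannotStart`, are skipped.

```python
import json


def summarize_launcher_logs(lines):
    """Return (timestamp, level, rank, message) tuples for each parseable JSON log line.

    Hypothetical helper: the field layout is assumed from the log lines in this post.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip plain-text lines such as "Error: ShardCannotStart"
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or partial records
        span = record.get("span") or {}
        fields = record.get("fields") or {}
        events.append(
            (
                record.get("timestamp"),
                record.get("level"),
                span.get("rank"),
                fields.get("message"),
            )
        )
    return events


# One log line copied from this post, plus the plain-text error line.
sample = [
    '{"timestamp":"2024-12-11T19:01:39.518896Z","level":"INFO",'
    '"fields":{"message":"Terminating shard"},"target":"text_generation_launcher",'
    '"span":{"rank":3,"name":"shard-manager"},"spans":[{"rank":3,"name":"shard-manager"}]}',
    "Error: ShardCannotStart",
]

for event in summarize_launcher_logs(sample):
    print(event)
```

This only tidies up the shutdown messages; the actual failure is the `ValueError: Number of layer norms can either be 1 or 2.` raised on rank 2 while loading `transformer.h.0`.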

