Text generation inference not working for StarCoder model

#91
by avirajsingh - opened

I deployed the StarCoder model using the Hugging Face text-generation-inference container (token value replaced):

docker run -p 8080:80 \
    -v $PWD/data:/data \
    -e HUGGING_FACE_HUB_TOKEN= \
    -d ghcr.io/huggingface/text-generation-inference:latest \
    --model-id bigcode/starcoder \
    --max-total-tokens 8192

After the Docker container starts, the API endpoint does not respond:

curl 127.0.0.1:8080/generate \
    -X POST \
    -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
    -H 'Content-Type: application/json'

The response is 'curl: (52) Empty reply from server'

I think the API server is not starting inside the container. Is there anything else we have to do to get the API server working in the container? How do I debug this issue?
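For context, these are the basic checks I was planning to run (assuming standard Docker tooling; the `/health` route is my assumption from the text-generation-inference docs):

```shell
# Check whether the container is still running, or exited during startup
docker ps -a --filter "ancestor=ghcr.io/huggingface/text-generation-inference:latest"

# Grab the newest matching container ID and dump its logs
# (look for weight-download progress, CUDA/out-of-memory errors, or a crash)
CID=$(docker ps -aq --filter "ancestor=ghcr.io/huggingface/text-generation-inference:latest" | head -n1)
docker logs "$CID"

# Once the logs say the server is listening, probe the health route
curl -v 127.0.0.1:8080/health
```

Is this the right direction, or is there a better way to see why the server never comes up?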

Reference: https://github.com/bigcode-project/starcoder#installation
