Running the Falcon-40B-Instruct model on Azure Kubernetes Service

#53 opened by zioproto

I am successfully running this model on AKS. I documented the installation in this Medium article:
https://medium.com/microsoftazure/running-the-falcon-40b-instruct-model-on-azure-kubernetes-service-2c3c2fb82b5

I am using the text-generation-inference container to run the Falcon-40B-Instruct model:
https://github.com/huggingface/text-generation-inference

This is the Pod definition:
https://github.com/zioproto/kube-cheshire-cat/blob/main/kubernetes/tgi.yaml
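
Once the Service is reachable, you can test the server directly against TGI's `/generate` endpoint. Here is a minimal sketch in Python, assuming the Service is exposed at `http://localhost:8080` (for example via `kubectl port-forward`); adjust the URL for your setup:

```python
import requests

# Assumed endpoint -- replace with however you expose the TGI Service
# (port-forward, LoadBalancer, Ingress, ...).
TGI_URL = "http://localhost:8080"

payload = {
    "inputs": "Write a haiku about Kubernetes.",
    "parameters": {"max_new_tokens": 100, "temperature": 0.7},
}

# TGI returns the completion as JSON under "generated_text".
resp = requests.post(f"{TGI_URL}/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["generated_text"])
```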

I was not able to get the model to work with LangChain Tools. Is there anything I can do? Maybe improve the prompt? I have included a sketch of my setup below.
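
To make the question concrete, this is roughly what I am trying. It is only a sketch: the endpoint URL is a placeholder for my TGI Service, and `llm-math` is just an example tool. The agent depends on the model following the ReAct prompt format, which Falcon-40B-Instruct does not seem to do reliably out of the box:

```python
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import HuggingFaceTextGenInference

# Point LangChain at the TGI server running in the cluster
# (assumed endpoint, e.g. via kubectl port-forward).
llm = HuggingFaceTextGenInference(
    inference_server_url="http://localhost:8080/",
    max_new_tokens=512,
    temperature=0.01,
    repetition_penalty=1.1,
)

# Example tool; the ZERO_SHOT_REACT_DESCRIPTION agent builds a ReAct-style
# prompt that the model is expected to follow step by step.
tools = load_tools(["llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is 7 to the power of 0.5?")
```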
