---
sidebar_position: 5
slug: /deploy_local_llm
---
# Deploy a local LLM
RAGFlow supports deploying LLMs locally using Ollama or Xinference.
## Ollama

Ollama enables one-click deployment of local LLMs.
### Install
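How you install Ollama depends on your platform. Because the Docker variant shown below assumes an Ollama container named `ollama`, here is a sketch of that setup using the image published on Docker Hub (an assumption about your environment, not the only way to install Ollama):

```bash
# Run Ollama in Docker: persist models in a named volume and expose the
# default API port 11434. The container name "ollama" matches the
# `docker exec` command used later in this section.
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```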
### Launch Ollama
Decide which LLM you want to deploy (see Ollama's library for the list of supported LLMs), say, mistral:
$ ollama run mistral
Or, if Ollama is running in a Docker container:
$ docker exec -it ollama ollama run mistral
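Before wiring Ollama into RAGFlow, you can optionally confirm that the Ollama HTTP service is reachable and that the model has been pulled (this assumes Ollama listens on its default port, 11434):

```bash
# List the models known to the local Ollama service; "mistral" should appear
# once the pull triggered by `ollama run mistral` has completed.
curl http://localhost:11434/api/tags
```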
### Use Ollama in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Ollama'.
- Base URL: Enter the base URL where the Ollama service is accessible, for example, `http://<your-ollama-endpoint-domain>:11434` (see the reachability check after this list).
- Use the Ollama models.
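If RAGFlow itself runs in Docker, note that `localhost` inside the RAGFlow container does not refer to the Docker host. As a sketch of a quick reachability check, assuming a RAGFlow container named `ragflow-server`, `curl` available inside it, and `host.docker.internal` resolving to the host (true on Docker Desktop; on Linux you may need the host's LAN IP or an extra host-gateway mapping):

```bash
# Check that the RAGFlow container can reach the Base URL you entered above.
# Substitute your own container name and Ollama address as needed.
docker exec -it ragflow-server curl http://host.docker.internal:11434/api/tags
```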
## Xinference

Xorbits Inference (Xinference) empowers you to unleash the full potential of cutting-edge AI models.
### Install
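As a minimal sketch, assuming a Python environment with pip, Xinference can be installed from PyPI with all of its built-in inference backends:

```bash
# Install Xinference with all optional backends (a large download; the Xinference
# documentation lists slimmer extras if you only need a specific backend).
pip install "xinference[all]"
```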
To start a local instance of Xinference, run the following command:
$ xinference-local --host 0.0.0.0 --port 9997
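Once the command above is running, the service listens on port 9997 and exposes an OpenAI-compatible API under `/v1`. An optional reachability check:

```bash
# Should return a JSON list of models; it stays empty until a model is launched.
curl http://127.0.0.1:9997/v1/models
```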
### Launch Xinference
Decide which LLM you want to deploy (see Xinference's list of supported LLMs), say, mistral.

Execute the following command to launch the model, replacing `${quantization}` with your chosen quantization method:
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
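To confirm the model is up before pointing RAGFlow at it, the Xinference CLI can list running models. The exact flags may differ between Xinference releases (`xinference list --help` shows the options available in yours):

```bash
# List models currently running on the local Xinference instance started above.
xinference list --endpoint http://127.0.0.1:9997
```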
### Use Xinference in RAGFlow
- Go to 'Settings > Model Providers > Models to be added > Xinference'.
- Base URL: Enter the base URL where the Xinference service is accessible, for example, `http://<your-xinference-endpoint-domain>:9997/v1` (see the example request after this list).
- Use the Xinference models.
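To confirm that the `/v1` base URL serves the OpenAI-compatible API RAGFlow expects, you can send a test chat-completion request to the model launched above (the model name here is the UID `mistral` set by `xinference launch -u mistral`; replace the host placeholder with the address RAGFlow will use):

```bash
# Example request against Xinference's OpenAI-compatible endpoint.
curl http://<your-xinference-endpoint-domain>:9997/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```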