---
license: wtfpl
---


1. Go to the [llama.cpp releases page](https://github.com/ggerganov/llama.cpp/releases/) and download one of the archives shown below
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f5e51289c121cb864ba464/2frhHG2gJmzgNRWiPhdIO.png)

2. If you plan to use CUDA, check which CUDA version your card supports (12.2 for any RTX card) and also download one of the archives shown below
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f5e51289c121cb864ba464/tampPIs0mt6J86VHTogEf.png)

3. Unpack everything into a single folder, rename it to "LlamaCPP", and put it in the same folder as the main.py/main.exe file
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f5e51289c121cb864ba464/II3FHj0WxzT_3Zi60Us5u.png)

4. Launch main.py or main.exe
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f5e51289c121cb864ba464/sTM3GUVMucM_AnIk8iG7H.png)
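
If you prefer to script step 3 rather than unpack by hand, here is a minimal sketch. It assumes the downloads are `.zip` archives; the function name `unpack_into_llamacpp` and the paths in the usage comment are hypothetical and not part of this project.

```python
import zipfile
from pathlib import Path


def unpack_into_llamacpp(archives, app_dir):
    """Extract every zip in `archives` into <app_dir>/LlamaCPP.

    `app_dir` is the folder that contains main.py/main.exe.
    Returns the path of the LlamaCPP folder.
    """
    target = Path(app_dir) / "LlamaCPP"
    # Create the folder if it does not exist yet; reuse it if it does.
    target.mkdir(parents=True, exist_ok=True)
    for archive in archives:
        # Each archive is unpacked into the same folder, so the llama.cpp
        # binaries and the CUDA runtime files end up side by side.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    return target


# Example call (hypothetical paths, replace with your own downloads):
# unpack_into_llamacpp(
#     ["Downloads/llama-build.zip", "Downloads/cuda-runtime.zip"],
#     "path/to/folder/with/main.py",
# )
```

Files with the same name in different archives overwrite each other, so unpack only the archives picked in steps 1 and 2.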