Unlocking the Power of Locally Running Llama-3 8B Agents with Chat-UI!
I'm thrilled to share my hackathon-style side project:
1. Fine-tuned Llama-3 8B for function calling using PEFT QLoRA, since the instruct Llama-3 model doesn't support it out of the box. The Colab notebook is here: https://lnkd.in/ggJMzqh2
2. The fine-tuned model, along with the 4-bit quants, is here: https://lnkd.in/gNpFKY6V
3. Cloned Hugging Face Chat-UI (https://lnkd.in/gKBKuUBQ) and made it compatible with function calling by building on the PR https://lnkd.in/gnqFuAd4, adapting it for my model and local inference with Ollama. This was a steep learning curve; I stayed up the whole night to get it working.
4. On top of that, I used SerpAPI for web browsing and the MongoDB Atlas free tier to persist conversations and assistant configs.
5. More work is needed on switching between using tools and responding directly, which is where I see the model break.
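To make the function-calling flow in steps 1, 3, and 5 concrete, here is a minimal sketch of the pattern: tool schemas go into the system prompt, and the model's reply is routed either to a tool call or to a direct answer. Everything here is illustrative, not the actual project code: the `web_search` schema, `build_system_prompt`, and `route_reply` are hypothetical names, and the JSON-or-text fallback is one simple way to handle the mode-switching failure noted in point 5.

```python
import json

# Hypothetical tool schema, in the style commonly used for function-calling fine-tunes.
TOOLS = [
    {
        "name": "web_search",
        "description": "Search the web (e.g. backed by SerpAPI)",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }
]


def build_system_prompt(tools):
    """Embed the tool schemas in the system prompt so the model can emit JSON tool calls."""
    return (
        "You have access to these tools:\n"
        + json.dumps(tools, indent=2)
        + '\nTo call a tool, reply with JSON: {"name": ..., "arguments": {...}}.'
        + " Otherwise, answer the user directly."
    )


def route_reply(reply):
    """Decide whether the model's reply is a tool call or a direct answer.

    Returns ("tool", payload) for a well-formed JSON tool call, and
    ("direct", reply) otherwise. The fallback branch is what handles the
    case where the model should have answered directly instead of calling a tool.
    """
    try:
        payload = json.loads(reply)
        if isinstance(payload, dict) and "name" in payload and "arguments" in payload:
            return ("tool", payload)
    except json.JSONDecodeError:
        pass
    return ("direct", reply)
```

For example, `route_reply('{"name": "web_search", "arguments": {"query": "llama 3"}}')` dispatches to the tool path, while `route_reply("Paris is the capital of France.")` falls through to a direct answer.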
How cool is it that we're approaching a ChatGPT-like experience with a locally hosted agent model running on your laptop!