---
base_model: gordicaleksa/YugoGPT
inference: false
language:
- sr
- hr
license: apache-2.0
model_creator: gordicaleksa
model_name: YugoGPT
model_type: mistral
quantized_by: Luka Secerovic
---
[![sr](https://img.shields.io/badge/lang-sr-green.svg)](https://huggingface.co/alkibijad/YugoGPT-GGUF/blob/main/README.md)
[![en](https://img.shields.io/badge/lang-en-red.svg)](https://huggingface.co/alkibijad/YugoGPT-GGUF/blob/main/README.en.md)
# About the model
[YugoGPT](https://huggingface.co/gordicaleksa/YugoGPT) is currently the best open-source base 7B LLM for BCS (Bosnian, Croatian, Serbian).
This repository contains the model in [GGUF](https://github.com/ggerganov/llama.cpp/tree/master) format, which makes local inference practical without expensive hardware.
# Versions
The model is quantized into a few smaller versions. Quantization slightly reduces quality but significantly speeds up inference.
We suggest the `Q4_1` version, as it's the fastest. If you'd rather fetch a file from code, see the sketch after the table.
| Name | Size (GB) | Note                                              |
|------|-----------|---------------------------------------------------|
| Q4_1 | 4.55      | Weights quantized to 4 bits. The fastest version. |
| Q8_0 | 7.7       | Weights quantized to 8 bits.                      |
| fp16 | 14.5      | Half-precision (16-bit) weights.                  |
| fp32 | 29        | Original 32-bit weights. Not recommended.         |
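If you'd rather download a version programmatically than through a UI, here is a minimal sketch using the `huggingface_hub` library. The GGUF file name below is an assumption; check the repository's file list for the exact name of the version you want.

```python
# Minimal download sketch (pip install huggingface_hub).
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="alkibijad/YugoGPT-GGUF",
    # Hypothetical file name -- verify it against the repo's file list.
    filename="yugogpt.Q4_1.gguf",
)
print(model_path)  # local path to the cached GGUF file
```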
# How to run this model locally?
## LM Studio - the easiest way ⚡️
Install [LM Studio](https://lmstudio.ai/).
- After installation, search for "alkibijad/YugoGPT":
![Search](./media/lm_studio_screen_1.png "Model search")
- Choose a model version (recommended `Q4_1`):
![Choose a model](./media/lm_studio_screen_2.1.png "Choose a model version")
- After the model finishes downloading, click "chat" on the left side and start chatting.
- [Optional] You can set up a system prompt, e.g. "You're a helpful assistant", or anything else you like.
![Chat](./media/lm_studio_screen_3.png "Chat")
That's it!
## llama.cpp - advanced 🤓
If you're an advanced user who wants to use the CLI and learn more about the `GGUF` format, head over to [llama.cpp](https://github.com/ggerganov/llama.cpp/tree/master) and follow the instructions there 🙂
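If you'd rather call the model from code than from the command line, the `llama-cpp-python` bindings can load any of the GGUF versions above. A minimal sketch, assuming `llama-cpp-python` is installed and the model path points at the `Q4_1` file you downloaded (the path is a placeholder):

```python
# Minimal inference sketch (pip install llama-cpp-python).
from llama_cpp import Llama

# Placeholder path -- point it at the GGUF file you downloaded.
llm = Llama(model_path="./yugogpt.Q4_1.gguf", n_ctx=2048)

# YugoGPT is a base model, so give it text to complete rather than a chat turn.
output = llm(
    "Glavni grad Srbije je",
    max_tokens=64,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```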