---
language:
- sr
license: mit
tags:
- text-generation-inference
- transformers
- mistral
base_model: datatab/Yugo55-GPT-v4
datasets:
- datatab/alpaca-cleaned-serbian-full
- datatab/open-orca-slim-serbian
---
# Yugo55-GPT-v4-4bit
- **Developed by:** datatab
- **License:** MIT
- **Quantized from model:** datatab/Yugo55-GPT-v4
## 🧩 Configuration
```yaml
models:
- model: datatab/Serbian-Mistral-Orca-Slim-v1
parameters:
weight: 1.0
- model: mlabonne/AlphaMonarch-7B
parameters:
weight: 1.0
- model: datatab/YugoGPT-Alpaca-v1-epoch1-good
parameters:
weight: 1.0
merge_method: linear
dtype: float16
```
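The YAML above is a mergekit-style linear merge: with equal weights of 1.0, each of the three source models contributes an equal share of every merged parameter. A minimal sketch of that normalization, assuming the linear method divides each weight by the sum of all weights:

```python
# Sketch: a linear merge averages parameters, with each model's share
# equal to its weight divided by the sum of all weights (all 1.0 here).
weights = {
    "datatab/Serbian-Mistral-Orca-Slim-v1": 1.0,
    "mlabonne/AlphaMonarch-7B": 1.0,
    "datatab/YugoGPT-Alpaca-v1-epoch1-good": 1.0,
}

total = sum(weights.values())
contribution = {name: w / total for name, w in weights.items()}

for name, frac in contribution.items():
    print(f"{name}: {frac:.3f}")  # each model contributes one third
```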
## 🏆 Results
> Results obtained through the Serbian LLM evaluation, released by Aleksa Gordić: [serbian-llm-eval](https://github.com/gordicaleksa/serbian-llm-eval)
> * Evaluation was conducted on a 4-bit version of the model due to hardware resource constraints.
| MODEL | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA |
|---|---|---|---|---|---|---|---|
| **Yugo55-GPT-v4-4bit** | 51.41 | 36.00 | 57.51 | 80.92 | 65.75 | 34.70 | 70.54 |
| Yugo55A-GPT | 51.52 | 37.78 | 57.52 | 84.40 | 65.43 | 35.60 | 69.43 |
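For a quick single-number comparison across the two rows, averaging the seven benchmark columns shows the 4-bit model trailing Yugo55A-GPT by well under a point:

```python
# Mean of the seven benchmark scores from the table above.
scores = {
    "Yugo55-GPT-v4-4bit": [51.41, 36.00, 57.51, 80.92, 65.75, 34.70, 70.54],
    "Yugo55A-GPT": [51.52, 37.78, 57.52, 84.40, 65.43, 35.60, 69.43],
}

means = {model: round(sum(vals) / len(vals), 2) for model, vals in scores.items()}
print(means)  # → {'Yugo55-GPT-v4-4bit': 56.69, 'Yugo55A-GPT': 57.38}
```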
## 💻 Usage
```shell
!pip -q install git+https://github.com/huggingface/transformers  # install from source
!pip -q install datasets loralib sentencepiece
!pip -q install bitsandbytes accelerate
```
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# torch_dtype="auto" uses the dtype stored in the checkpoint;
# device_map="auto" (requires accelerate) places the model on the GPU.
model = AutoModelForCausalLM.from_pretrained(
    "datatab/Yugo55-GPT-v4-4bit", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("datatab/Yugo55-GPT-v4-4bit")
```
```python
from transformers import TextStreamer

def generate(question="", input="Odgovaraj uvek na Srpskom jeziku!!!"):
    alpaca_prompt = """Ispod je uputstvo koje opisuje zadatak, upareno sa unosom koji pruža dodatni kontekst. Napišite odgovor koji na odgovarajući način kompletira zahtev.
### Uputstvo:
{}
### Unos:
{}
### Odgovor:
{}"""
    inputs = tokenizer(
        [
            alpaca_prompt.format(
                question,  # instruction
                input,     # additional context
                "",        # output - leave this blank for generation!
            )
        ],
        return_tensors="pt",
    ).to("cuda")

    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    _ = model.generate(
        **inputs,
        streamer=text_streamer,
        max_new_tokens=1024,
        temperature=0.1,
        repetition_penalty=1.11,
        top_p=0.92,
        top_k=1,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        do_sample=True,
        use_cache=True,
    )
```
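The Alpaca-style template used by `generate` can be checked without a GPU: formatting fills the instruction and input slots and leaves the response section empty for the model to complete. A minimal sketch (the template string is copied from the function above; the example question is hypothetical):

```python
# The same Alpaca-style template as in generate(); the third slot is
# left empty so the model continues after "### Odgovor:".
alpaca_prompt = """Ispod je uputstvo koje opisuje zadatak, upareno sa unosom koji pruža dodatni kontekst. Napišite odgovor koji na odgovarajući način kompletira zahtev.
### Uputstvo:
{}
### Unos:
{}
### Odgovor:
{}"""

prompt = alpaca_prompt.format(
    "Koja je najveca planeta?",             # instruction
    "Odgovaraj uvek na Srpskom jeziku!!!",  # input
    "",                                     # response slot left empty
)
print(prompt.endswith("### Odgovor:\n"))  # → True
```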
```python
generate("Nabroj mi sve planete suncevog sistema i reci mi koja je najveca planeta")
```
```python
generate("Koja je razlika između lame, vikune i alpake?")
```
```python
generate("Napišite kratku e-poruku Semu Altmanu dajući razloge za GPT-4 otvorenog koda")
```