
Yugo55-GPT-v4-4bit

  • Developed by: datatab
  • License: MIT
  • Quantized from model: datatab/Yugo55-GPT-v4

🧩 Configuration

models:
  - model: datatab/Serbian-Mistral-Orca-Slim-v1
    parameters:
      weight: 1.0
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      weight: 1.0
  - model: datatab/YugoGPT-Alpaca-v1-epoch1-good
    parameters:
      weight: 1.0
merge_method: linear
dtype: float16
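
The block above is a mergekit-style configuration: the three source models are combined with a simple linear (equal-weight) merge and saved in float16. To reproduce the merge, something along these lines should work, assuming mergekit is installed and the YAML is saved to a file (config.yml and the output directory are placeholder names):

!pip -q install mergekit
!mergekit-yaml config.yml ./Yugo55-GPT-v4-merged  # config.yml holds the YAML above; output path is illustrative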

🏆 Results

Results were obtained with the Serbian LLM evaluation suite released by Aleksa Gordić: serbian-llm-eval

  • Evaluation was conducted on a 4-bit version of the model due to hardware resource constraints.
| MODEL                | ARC-E | ARC-C | Hellaswag | BoolQ | Winogrande | OpenbookQA | PiQA  |
|----------------------|-------|-------|-----------|-------|------------|------------|-------|
| *Yugo55-GPT-v4-4bit  | 51.41 | 36.00 | 57.51     | 80.92 | 65.75      | 34.70      | 70.54 |
| Yugo55A-GPT          | 51.52 | 37.78 | 57.52     | 84.40 | 65.43      | 35.60      | 69.43 |

💻 Usage

!pip -q install git+https://github.com/huggingface/transformers # need to install from github
!pip install -q datasets loralib sentencepiece
!pip -q install bitsandbytes accelerate
# Notebook-only helper: wrap long output lines in Colab/Jupyter instead of letting them overflow.
from IPython.display import HTML, display

def set_css():
  display(HTML('''
  <style>
    pre {
        white-space: pre-wrap;
    }
  </style>
  '''))

get_ipython().events.register('pre_run_cell', set_css)  # re-apply the CSS before every cell run

import torch
import transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

# The repository ships pre-quantized (bitsandbytes) weights; device_map="auto"
# places the model on the GPU so it matches the inputs moved to "cuda" below.
model = AutoModelForCausalLM.from_pretrained(
    "datatab/Yugo55-GPT-v4-4bit", torch_dtype="auto", device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained("datatab/Yugo55-GPT-v4-4bit")

from transformers import TextStreamer


def generate(question="", input="Odgovaraj uvek na Srpskom jeziku!!!"):
    alpaca_prompt = """Ispod je uputstvo koje opisuje zadatak, upareno sa unosom koji pruža dodatni kontekst. Napišite odgovor koji na odgovarajući način kompletira zahtev.
  
  ### Uputstvo:
   {}
  ### Unos:
   {}
  ### Odgovor:
   {}"""

    inputs = tokenizer(
        [
            alpaca_prompt.format(
                question,  # instruction
                input,  # input
                "",  # output - leave this blank for generation!
            )
        ],
        return_tensors="pt",
    ).to("cuda")

    text_streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    _ = model.generate(
        **inputs,
        streamer=text_streamer,
        max_new_tokens=1024,
        temperature=0.1,
        repetition_penalty=1.11,
        top_p=0.92,
        top_k=1,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
        do_sample=True,
        use_cache=True
    )
generate("Nabroj mi sve planete suncevog sistemai reci mi koja je najveca planeta")
generate("Koja je razlika između lame, vikune i alpake?")
generate("Napišite kratku e-poruku Semu Altmanu dajući razloge za GPT-4 otvorenog koda")