---
base_model:
- mlabonne/AlphaMonarch-7B
- datatab/Yugo55-GPT-v4
- datatab/Yugo55-GPT-DPO-v1-chkp-300
- NousResearch/Nous-Hermes-2-Mistral-7B-DPO
library_name: transformers
tags:
- mergekit
- merge

---
# Yugo55A-GPT

- **Developed by:** datatab
- **License:** MIT


## 🏆 Results 
> Results were obtained with the Serbian LLM evaluation suite released by Aleksa Gordić: [serbian-llm-eval](https://github.com/gordicaleksa/serbian-llm-eval)
> * The evaluation was conducted on a 4-bit quantized version of the model due to hardware resource constraints; a comparable loading setup is sketched after the table below.

<table>
  <tr>
    <th>MODEL</th>
    <th>ARC-E</th>
    <th>ARC-C</th>
    <th>Hellaswag</th>
    <th>BoolQ</th>
    <th>Winogrande</th>
    <th>OpenbookQA</th>
    <th>PiQA</th>
  </tr>
  <tr>
    <td><a href="https://huggingface.co/datatab/Yugo55-GPT-v4-4bit/">Yugo55-GPT-v4-4bit</a></td>
    <td>51.41</td>
    <td>36.00</td>
    <td>57.51</td>
    <td>80.92</td>
    <td><strong>65.75</strong></td>
    <td>34.70</td>
    <td><strong>70.54</strong></td>
  </tr>
  <tr>
    <td><a href="https://huggingface.co/datatab/Yugo55A-GPT/">Yugo55A-GPT</a></td>
    <td><strong>51.52</strong></td>
    <td><strong>37.78</strong></td>
    <td><strong>57.52</strong></td>
    <td><strong>84.40</strong></td>
    <td>65.43</td>
    <td><strong>35.60</strong></td>
    <td>69.43</td>
  </tr>
</table>
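
Since the scores above were measured on a 4-bit model, a comparable setup can be loaded with `bitsandbytes` quantization in 🤗 Transformers. The sketch below uses illustrative settings (NF4 with float16 compute); the exact quantization parameters used for the evaluation are not published in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative 4-bit settings; the evaluation's exact configuration is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained("datatab/Yugo55A-GPT")
model = AutoModelForCausalLM.from_pretrained(
    "datatab/Yugo55A-GPT",
    quantization_config=bnb_config,
    device_map="auto",
)
```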


## 🔗 Merge Details

### Merge Method
> This model is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit), using the [linear](https://arxiv.org/abs/2203.05482) merge method.
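
The linear method computes every tensor of the merged model as a weighted average of the corresponding tensors in the source models. With the weights from the configuration below (1.0, 1.0, 0.5, 0.5), and assuming mergekit's default weight normalization, each merged parameter works out to:

$$
\theta_{\text{merged}} = \frac{\sum_i w_i \,\theta_i}{\sum_i w_i}
= \frac{1.0\,\theta_{\text{Yugo55-v4}} + 1.0\,\theta_{\text{DPO}} + 0.5\,\theta_{\text{AlphaMonarch}} + 0.5\,\theta_{\text{Hermes}}}{3.0}
$$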

### Models Merged

The following models were included in the merge:
* [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
* [datatab/Yugo55-GPT-v4](https://huggingface.co/datatab/Yugo55-GPT-v4)
* [datatab/Yugo55-GPT-DPO-v1-chkp-300](https://huggingface.co/datatab/Yugo55-GPT-DPO-v1-chkp-300)
* [NousResearch/Nous-Hermes-2-Mistral-7B-DPO](https://huggingface.co/NousResearch/Nous-Hermes-2-Mistral-7B-DPO)

## 🧩 Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: datatab/Yugo55-GPT-v4
    parameters:
      weight: 1.0
  - model: datatab/Yugo55-GPT-DPO-v1-chkp-300
    parameters:
      weight: 1.0
  - model: mlabonne/AlphaMonarch-7B
    parameters:
      weight: 0.5
  - model: NousResearch/Nous-Hermes-2-Mistral-7B-DPO
    parameters:
      weight: 0.5
merge_method: linear
dtype: float16
```
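
To reproduce the merge, the configuration above can be saved to a file and passed to mergekit's `mergekit-yaml` entry point (e.g. `mergekit-yaml config.yaml ./Yugo55A-GPT`). To use the published model directly, a minimal generation sketch follows; the prompt and sampling settings are illustrative, and this card does not document a specific chat template.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("datatab/Yugo55A-GPT")
model = AutoModelForCausalLM.from_pretrained(
    "datatab/Yugo55A-GPT",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Illustrative Serbian prompt ("What is the capital of Serbia?").
prompt = "Koji je glavni grad Srbije?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```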