---
license: apache-2.0
language:
- en
- de
library_name: transformers
pipeline_tag: text-generation
tags:
- finetune
- sft
- dpo
- laser
- augmentation
- german
- english
---

![SauerkrautLM](https://vago-solutions.de/wp-content/uploads/2024/02/Sauerkraut_Laserchat.png "SauerkrautLM-7b-LaserChat")
## VAGO solutions SauerkrautLM-7b-LaserChat
Introducing **SauerkrautLM-7b-LaserChat** – our Sauerkraut version of the powerful [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)!

The model "SauerkrautLM-7b-LaserChat" is a **joint effort** between **VAGO solutions** and **Hyperspace.ai.** 
Much appreciation goes to the tremendous research effort of **Fernando Fernandes Neto, David Golchinfar and Eric Hartford on their laserRMT approach.** 
Without their independent research collaboration this model release would not have been possible. 

- Finetuned with **SFT**
- Aligned with **DPO**
- **Using a novel training technique** - we partially freeze the model according to a laser-like analysis (yet to be officially announced). This allows us to evaluate the trade-offs implied by the no free lunch theorem and to make better optimization choices - created by the [LaserRMT research group](https://github.com/cognitivecomputations/laserRMT)
- Optimized with **LaserRMT**

# Table of Contents
1. [Overview of all SauerkrautLM-7b-LaserChat models](#all-sauerkrautlm-7b-laserchat-models)
2. [Model Details](#model-details)
   - [Prompt template](#prompt-template)
   - [Training procedure](#training-procedure)
3. [Evaluation](#evaluation)
4. [Disclaimer](#disclaimer)
5. [Contact](#contact)
6. [Collaborations](#collaborations)
7. [Acknowledgement](#acknowledgement)


## All SauerkrautLM-7b-LaserChat Models

| Model | HF    | GPTQ  | GGUF  | AWQ  |
|-------|-------|-------|-------|-------|
| SauerkrautLM-7b-LaserChat  | [Link](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-LaserChat) | coming soon | coming soon | coming soon |

## Model Details
**SauerkrautLM-7b-LaserChat**
- **Model Type:** SauerkrautLM-7b-LaserChat is a finetuned model based on [openchat/openchat-3.5-0106](https://huggingface.co/openchat/openchat-3.5-0106)
- **Language(s):** German, English
- **License:** Apache 2.0
- **Contact:** [VAGO solutions](https://vago-solutions.de/#Kontakt), [Hyperspace.computer](https://hyperspace.computer/)

### Training procedure:

Anyone who has attempted to fine-tune a model knows how difficult it is to nudge it towards a specific skill, such as mastering a new language, and how challenging it is to achieve significant improvements in performance.
Experimenting with a novel training strategy and Spherical Linear Interpolation alongside a lasered version of the model itself has proven to be both fascinating and revealing.

Furthermore, we developed one iteration of the model using our entire SFT Sauerkraut dataset and two additional iterations using subsets of the full dataset—one focused on enhancing MMLU and TQA capabilities, and the other on boosting GSM8K and Winogrande skills.

After optimizing our primary SFT model, we applied a similar strategy to our new DPO Dataset, dividing it into further subsets. We trained one model on the entire dataset again and two more on these specialized subsets.

Actively monitoring and intervening based on decreases in perplexity on the GSM8K benchmark led to an overall improvement in performance, especially in math abilities, without detracting from performance on other benchmarks—a task that is typically quite difficult.
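As an illustration of this kind of monitoring (a minimal sketch, not our actual tooling; the sample count, prompt formatting and generation-free loss computation below are illustrative assumptions), perplexity on a held-out GSM8K slice can be tracked with a few lines of `transformers` code:

```python
# Minimal sketch: track perplexity on a held-out GSM8K slice during training.
# Model name, sample count and text formatting are placeholders, not the actual pipeline.
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-7b-LaserChat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model.eval()

samples = load_dataset("gsm8k", "main", split="test").select(range(32))  # small held-out slice

losses = []
with torch.no_grad():
    for row in samples:
        text = row["question"] + "\n" + row["answer"]
        ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024).input_ids.to(model.device)
        out = model(ids, labels=ids)   # causal LM loss = mean negative log-likelihood per token
        losses.append(out.loss.item())

perplexity = math.exp(sum(losses) / len(losses))
print(f"GSM8K held-out perplexity: {perplexity:.2f}")
```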

This process not only helps in understanding the effectiveness of Spherical Linear Interpolation but also introduces a new method for refining models with enhanced skills through a cycle of targeted data selection (Laser data(x)) + SLERP, followed by a subsequent focus on different data (Laser again on data(y)).
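The SLERP step itself is simple to express. The following is a minimal, self-contained sketch of spherically interpolating two weight tensors; it only illustrates the math and is not the checkpoint-merging pipeline used for this model (the tensor shapes and the interpolation factor are arbitrary examples):

```python
# Minimal sketch of spherical linear interpolation (SLERP) between two weight tensors.
# Illustrates the math only; not the checkpoint-merging tooling used for SauerkrautLM.
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Interpolate between tensors a and b along the great circle connecting them."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    omega = torch.arccos(torch.clamp(a_unit @ b_unit, -1.0, 1.0))  # angle between the two tensors
    if omega.abs() < eps:                                          # nearly parallel: fall back to lerp
        return (1.0 - t) * a + t * b
    so = torch.sin(omega)
    mixed = (torch.sin((1.0 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return mixed.reshape(a.shape).to(a.dtype)

# Example: blend the same layer from a base and a lasered checkpoint at t = 0.5.
base_weight = torch.randn(4096, 4096)
lasered_weight = torch.randn(4096, 4096)
merged_weight = slerp(0.5, base_weight, lasered_weight)
```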

Additionally, we integrated a novel training strategy on the SFT and DPO training process, where we partially freeze the model according to a laser-like analysis aiming to navigate and optimize the trade-offs highlighted by the no free lunch theorem. This innovative training method effectively prevents the significant problem of forgetting previously acquired knowledge. This aspect is particularly crucial when attempting to teach the model specific skills, such as a new language, where traditionally, the model might lose a considerable amount of its prior knowledge and exhibit a decline in overall intelligence. Concrete information on how the new training strategy works and the advantages it offers over conventional training methods will soon be published in a detailed paper by the LaserRMT research group.
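The selection criterion itself has not been published yet, but the mechanical side of partial freezing is easy to illustrate. In the sketch below, the frozen layer indices are placeholders chosen purely for demonstration, not the result of the laser-like analysis:

```python
# Minimal sketch of partially freezing a causal LM before fine-tuning.
# The frozen layer indices are placeholders; the actual laser-like selection
# criterion has not been published yet.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "openchat/openchat-3.5-0106", torch_dtype=torch.bfloat16
)

layers_to_freeze = {0, 1, 2, 3, 28, 29, 30, 31}   # placeholder selection

for idx, layer in enumerate(model.model.layers):
    if idx in layers_to_freeze:
        for param in layer.parameters():
            param.requires_grad = False           # exclude this layer from the optimizer

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable / total:.1%}")
```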


We improved the German language skills of this model. Nevertheless, certain formulations may occur that are not entirely correct.


### Prompt Template:
```
GPT4 Correct User: Hallo, wie geht es dir?<|end_of_turn|>GPT4 Correct Assistant: Hallo! Ich bin ein künstliches Intelligenzsystem und habe keine persönlichen Gefühle oder körperliche Zustände. Wie kann ich Ihnen helfen?<|end_of_turn|>GPT4 Correct User: Ich benötige nur einen kurzen Satz, den ich in das Prompt Template veröffentlichen kann.<|end_of_turn|>GPT4 Correct Assistant:


```
*German prompt example at temperature 0.3 and top_p 0.9

```
GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant: Hello! How can I help you today? If you have any questions or need assistance, feel free to ask.<|end_of_turn|>GPT4 Correct User: I just need a short sentence to post in the prompt template.<|end_of_turn|>GPT4 Correct Assistant:

```
*English prompt example at temperature 0.3 and top_p 0.9
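A minimal inference sketch using this template with `transformers` is shown below; the sampling parameters mirror the examples above, while the remaining generation settings are illustrative assumptions:

```python
# Minimal inference sketch using the prompt template shown above.
# Sampling parameters follow the examples (temperature 0.3, top_p 0.9);
# max_new_tokens is an illustrative choice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-7b-LaserChat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "GPT4 Correct User: Hello<|end_of_turn|>GPT4 Correct Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
    top_p=0.9,
)
# Print only the newly generated assistant turn.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```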

## Evaluation



| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 70.29 |
| ARC (25-shot)         | 67.41         |
| HellaSwag (10-shot)   | 83.57   |
| MMLU (5-shot)         | 63.91|
| TruthfulQA (0-shot)   | 56.88 |
| Winogrande (5-shot)   | 80.43  |
| GSM8K (5-shot)        | 69.52        |



## Disclaimer
We must inform users that despite our best efforts in data cleansing, the possibility of uncensored content slipping through cannot be entirely ruled out, and we cannot guarantee consistently appropriate behavior. Therefore, if you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided.
Additionally, it is essential to understand that the licensing of these models does not constitute legal advice. We are not responsible for the actions of third parties who utilize our models.
 
## Contact
If you are interested in customized LLMs for business applications, please get in contact with us via our websites. We are also grateful for your feedback and suggestions.
 
## Collaborations
We are also keenly seeking support and investment for our startups, VAGO solutions and Hyperspace, where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us at [VAGO solutions](https://vago-solutions.de/#Kontakt), [Hyperspace.computer](https://hyperspace.computer/).

## Acknowledgement
Many thanks to [openchat](https://huggingface.co/openchat) for providing such a valuable model to the open-source community.