DavidGF committed on
Commit 99e3272
1 Parent(s): 7a66f6b

Update README.md

Files changed (1): README.md (+23 -19)
README.md CHANGED
@@ -5,19 +5,26 @@ language:
 - de
 library_name: transformers
 pipeline_tag: text-generation
+tags:
+- mistral
+- finetune
+- chatml
+- augmentation
+- german
 ---
 
 ![SauerkrautLM](images/hero.png "SauerkrautLM-7b-HerO")
 ## VAGO solutions SauerkrautLM-7b-HerO
-Introducing SauerkrautLM-v1 - Your German Language Powerhouse!
-
-We are thrilled to unveil our **very first release**, **SauerkrautLM-v1**. This remarkable creation marks a significant milestone as it is specifically **tailored for the German-speaking community**. In a landscape where German language models are scarce, we are proud to offer a solution that fills this void.
-What sets SauerkrautLM-v1 apart is its versatility. Whether you are an individual looking to harness its capabilities for personal use or a business seeking to integrate it into your projects, our model is designed to accommodate all. It operates under the Apache 2.0 License, providing you with the freedom to explore its potential in both private and commercial applications.
-Performance is at the heart of SauerkrautLM-v1. We put it to the **test using a customized version of MT-Bench for the German language**, and the results speak volumes. It currently stands as the most robust German language model on Hugging Face (based on German MT-Bench results), showcasing its exceptional capabilities. Rest assured, this model is here to shine and set new standards. And best of all, it comes in four different sizes (3B, 7B, 13B, 70B) to address your individual needs.
-Our model's journey began with meticulous training using an **augmented dataset within the QLoRA approach**. This is just the beginning of our model series, promising even more innovative and powerful solutions in the future.
-
-Join us on this exciting adventure as we redefine the possibilities of language modeling for the German-speaking world.
-SauerkrautLM-v1 is here to empower your language-related endeavors like never before.
+Introducing **SauerkrautLM-7b-HerO**, the pinnacle of German language model technology!
+Crafted through the **merging** of **[Teknium's OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)** and **[Open-Orca's Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)**, this model is **uniquely fine-tuned with the Sauerkraut dataset.**
+SauerkrautLM-7b-HerO represents a breakthrough in language modeling, achieving an optimal balance between extensive German data and essential international sources.
+This ensures the model not only excels in understanding the nuances of the German language but also retains its global capabilities.
+Harnessing the innovative power of the **gradient SLERP method from MergeKit**, we've achieved a groundbreaking fusion of two of the best-performing 7B models based on the Mistral architecture.
+This merge has allowed us to combine the best features of both models, creating an unparalleled synergy.
+Coupled with the German Sauerkraut dataset, which consists of a mix of augmented and translated data, we have successfully taught the English-speaking merged model the intricacies of the German language.
+This was achieved *without the typical loss of core competencies that often occurs when models trained mainly in English are fine-tuned in another language.*
+Our approach ensures that the model retains its original strengths while acquiring a profound understanding of German, **setting a new benchmark in bilingual language model proficiency.**
 
 ## All HerO Models
 
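The new introduction leans on the "gradient SLERP" merge, so a quick illustration may help: SLERP interpolates along the arc between two weight vectors rather than the straight line between them, and the gradient variant varies the interpolation factor t from layer to layer. A minimal sketch in PyTorch, for illustration only; the t schedule below is a made-up example, not the recipe used for HerO:

```python
import torch

def slerp(t: float, w0: torch.Tensor, w1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    v0, v1 = w0.flatten().float(), w1.flatten().float()
    # Angle between the two parameter vectors
    cos_omega = torch.dot(v0, v1) / (v0.norm() * v1.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0 + eps, 1.0 - eps))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly collinear weights: fall back to plain linear interpolation
        merged = (1.0 - t) * v0 + t * v1
    else:
        merged = (torch.sin((1.0 - t) * omega) / so) * v0 + (torch.sin(t * omega) / so) * v1
    return merged.reshape(w0.shape).to(w0.dtype)

# "Gradient" SLERP: t varies across the layer stack, e.g. early layers stay
# closer to model A and late layers closer to model B (hypothetical schedule).
t_per_layer = torch.linspace(0.0, 1.0, 32)  # Mistral-7B has 32 decoder layers
```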
@@ -38,7 +45,7 @@ Data augmentation techniques were used to grant grammatical, syntactical correct
 
 SauerkrautLM-7b-HerO was merged on 1 A100 with [mergekit](https://github.com/cg123/mergekit).
 The merged model contains [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) and [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca).
-We used the gradient SLURP method.
+We applied the gradient SLERP method.
 
 
 - **Model Type:** SauerkrautLM-7b-HerO is an auto-regressive language model based on the transformer architecture
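For orientation, a mergekit SLERP configuration for this pair of models might look roughly like the following. This is a sketch based on mergekit's documented YAML schema; the per-layer t values are hypothetical, and the actual HerO recipe is not published in this commit:

```python
# Writes out a mergekit-style SLERP config. The t lists define the layer-wise
# "gradient": 0 keeps the OpenHermes weights, 1 keeps the OpenOrca weights,
# values in between interpolate spherically.
config = """\
slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Open-Orca/Mistral-7B-OpenOrca
        layer_range: [0, 32]
merge_method: slerp
base_model: teknium/OpenHermes-2.5-Mistral-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]  # hypothetical gradient
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                    # default for all other tensors
dtype: bfloat16
"""

with open("hero-merge.yml", "w") as f:
    f.write(config)
# Then merge with mergekit's CLI:  mergekit-yaml hero-merge.yml ./merged-model
```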
@@ -66,7 +73,7 @@ Bitte erkläre mir, wie die Zusammenführung von Modellen durch bestehende Spitz
 score
 model turn
 SauerkrautLM-70b-v1 1 7.25000
-SauerkrautLM-7b-HerO 1 6.96875
+SauerkrautLM-7b-HerO <--- 1 6.96875
 SauerkrautLM-7b-v1-mistral 1 6.30625
 leo-hessianai-13b-chat 1 6.18750
 SauerkrautLM-13b-v1 1 6.16250
@@ -85,7 +92,7 @@ open_llama_3b_v2 1 1.68750
 score
 model turn
 SauerkrautLM-70b-v1 2 6.83125
-SauerkrautLM-7b-HerO 2 6.30625
+SauerkrautLM-7b-HerO <--- 2 6.30625
 vicuna-13b-v1.5 2 5.63125
 SauerkrautLM-13b-v1 2 5.34375
 SauerkrautLM-7b-v1-mistral 2 5.26250
@@ -104,7 +111,7 @@ Llama-2-7b 2 1.07500
 score
 model
 SauerkrautLM-70b-v1 7.040625
-SauerkrautLM-7b-HerO 6.637500
+SauerkrautLM-7b-HerO <--- 6.637500
 SauerkrautLM-7b-v1-mistral 5.784375
 SauerkrautLM-13b-v1 5.753125
 vicuna-13b-v1.5 5.715625
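The relation between the per-turn tables and this overall table appears to be a plain mean over the two turns; that is inferred from the numbers, not stated in the card:

```python
# Overall score = mean of turn-1 and turn-2 scores (German MT-Bench, HerO)
turn1, turn2 = 6.96875, 6.30625
print((turn1 + turn2) / 2)  # 6.6375 -> matches the reported 6.637500
```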
@@ -125,7 +132,7 @@ Llama-2-7b 1.181250
 score
 model turn
 OpenHermes-2.5-Mistral-7B 1 8.21875
-SauerkrautLM-7b-HerO 1 8.03125
+SauerkrautLM-7b-HerO <--- 1 8.03125
 Mistral-7B-OpenOrca 1 7.65625
 neural-chat-7b-v3-1 1 7.22500
 
@@ -133,7 +140,7 @@ neural-chat-7b-v3-1 1 7.22500
 score
 model turn
 OpenHermes-2.5-Mistral-7B 2 7.1000
-SauerkrautLM-7b-HerO 2 6.7875
+SauerkrautLM-7b-HerO <--- 2 6.7875
 neural-chat-7b-v3-1 2 6.4000
 Mistral-7B-OpenOrca 2 6.1750
 
@@ -141,7 +148,7 @@ Mistral-7B-OpenOrca 2 6.1750
 score
 model
 OpenHermes-2.5-Mistral-7B 7.659375
-SauerkrautLM-7b-HerO 7.409375
+SauerkrautLM-7b-HerO <--- 7.409375
 Mistral-7B-OpenOrca 6.915625
 neural-chat-7b-v3-1 6.812500
 ```
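Since the commit tags the model with chatml, a minimal inference sketch might look like the following; the repo id VAGOsolutions/SauerkrautLM-7b-HerO and the presence of a built-in chat template are assumptions, not stated in this diff:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-7b-HerO"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# ChatML-style conversation ("You are a helpful assistant." /
# "Explain briefly what model merging is.")
messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Erkläre kurz, was Modell-Merging ist."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```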
@@ -175,7 +182,4 @@ If you are interested in customized LLMs for business applications, please get i
 We are also keenly seeking support and investment for our startup, VAGO solutions, where we continuously advance the development of robust language models designed to address a diverse range of purposes and requirements. If the prospect of collaboratively navigating future challenges excites you, we warmly invite you to reach out to us.
 
 ## Acknowledgement
-Many thanks to [OpenOrca](https://huggingface.co/Open-Orca) and [teknium](https://huggingface.co/teknium) for providing such valuable models to the Open-Source community.
-
-
-
+Many thanks to [OpenOrca](https://huggingface.co/Open-Orca) and [teknium](https://huggingface.co/teknium) for providing such valuable models to the Open-Source community.
 