DavidGF committed on
Commit e3c685d
1 Parent(s): 99e3272

Update README.md

Files changed (1)
  1. README.md +32 -13
README.md CHANGED
@@ -25,6 +25,25 @@ Coupled with the German Sauerkraut dataset, which consists of a mix of augmented
This was achieved *without the typical loss of core competencies often associated with fine-tuning in another language of models previously trained mainly in English.*
Our approach ensures that the model retains its original strengths while acquiring a profound understanding of German, **setting a new benchmark in bilingual language model proficiency.**

+ # Table of Contents
+ 1. [Overview of all HerO models](#all-hero-models)
+ 2. [Model Details](#model-details)
+    - [Training Dataset](#training-dataset)
+    - [Merge Procedure](#merge-procedure)
+    - [Prompt Template](#prompt-template)
+ 3. [Evaluation](#evaluation)
+    - [MT-Bench (German)](#mt-bench-german)
+    - [MT-Bench (English)](#mt-bench-english)
+    - [Language Model Evaluation Harness](#language-model-evaluation-harness)
+    - [BBH](#bbh)
+    - [GPT4ALL](#gpt4all)
+    - [Additional German Benchmark results](#additional-german-benchmark-results)
+ 4. [Disclaimer](#disclaimer)
+ 5. [Contact](#contact)
+ 6. [Collaborations](#collaborations)
+ 7. [Acknowledgement](#acknowledgement)
+

## All HerO Models

@@ -34,26 +53,26 @@ Our approach ensures that the model retains its original strengths while acquiri

## Model Details
**SauerkrautLM-7b-HerO**
+ - **Model Type:** SauerkrautLM-7b-HerO is an auto-regressive language model based on the transformer architecture
+ - **Language(s):** English, German
+ - **License:** APACHE 2.0
+ - **Contact:** [Website](https://vago-solutions.de/#Kontakt) [David Golchinfar](mailto:golchinfar@vago-solutions.de)

- **Training Dataset:**
+ ### Training Dataset

SauerkrautLM-7b-HerO was trained with a mix of German data augmentation and translated data.
We found that a simple translation of training data can lead to unnatural German phrasing.
Data augmentation techniques were used to ensure grammatical and syntactical correctness and more natural German wording in our training data.

- **Merge Procedure:**
+ ### Merge Procedure

SauerkrautLM-7b-HerO was merged on 1 A100 with [mergekit](https://github.com/cg123/mergekit).
The merged model contains [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) and [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca).
We applied the gradient SLERP method.
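To make the merge step concrete, the sketch below shows what gradient SLERP means for two sets of per-layer weights: each layer is spherically interpolated, with the mixing factor t varying from layer to layer. This is a minimal toy illustration in plain PyTorch, not mergekit's actual implementation; the layer count, tensor shapes and t schedule are invented for the example.

```python
import torch

def slerp(a: torch.Tensor, b: torch.Tensor, t: float, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (flattened)."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    omega = torch.acos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    if omega.abs() < 1e-6:  # near-parallel weights: fall back to plain LERP
        return (1 - t) * a + t * b
    so = torch.sin(omega)
    out = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape).to(a.dtype)

# "Gradient" SLERP: the mixing factor t changes per layer instead of being constant.
num_layers = 32                            # Mistral-7B has 32 decoder layers
ts = torch.linspace(0.0, 1.0, num_layers)  # invented schedule, for illustration only

# Toy stand-ins for one weight matrix per layer of the two source models.
hermes_weights = [torch.randn(64, 64) for _ in range(num_layers)]
openorca_weights = [torch.randn(64, 64) for _ in range(num_layers)]
merged = [slerp(w1, w2, float(t))
          for w1, w2, t in zip(hermes_weights, openorca_weights, ts)]
```

Compared to plain linear averaging, SLERP interpolates along the sphere spanned by the two weight vectors, which tends to preserve their norm geometry.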
 

- - **Model Type:** SauerkrautLM-7b-HerO is an auto-regressive language model based on the transformer architecture
- - **Language(s):** English, German
- - **License:** APACHE 2.0
- - **Contact:** [Website](https://vago-solutions.de/#Kontakt) [David Golchinfar](mailto:golchinfar@vago-solutions.de)

- **Prompt Template:**
+ ### Prompt Template
```
<|im_start|>system
Du bist Sauerkraut-HerO, ein großes Sprachmodell, das höflich und kompetent antwortet. Schreibe deine Gedanken Schritt für Schritt auf, um Probleme sinnvoll zu lösen.
@@ -67,7 +86,7 @@ Bitte erkläre mir, wie die Zusammenführung von Modellen durch bestehende Spitz
<|im_start|>assistant
```
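The template is standard ChatML (as used by OpenHermes-2.5, with `<|im_end|>` turn terminators), so the model can be queried with plain `transformers` by assembling the prompt string verbatim. A minimal sketch, assuming the repository id `VAGOsolutions/SauerkrautLM-7b-HerO` and an illustrative user question:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-7b-HerO"  # assumed repo id for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Assemble the ChatML prompt exactly as in the template above.
system = ("Du bist Sauerkraut-HerO, ein großes Sprachmodell, das höflich und "
          "kompetent antwortet. Schreibe deine Gedanken Schritt für Schritt auf, "
          "um Probleme sinnvoll zu lösen.")
user = "Wie funktioniert das Zusammenführen von Sprachmodellen?"  # illustrative question
prompt = (f"<|im_start|>system\n{system}<|im_end|>\n"
          f"<|im_start|>user\n{user}<|im_end|>\n"
          f"<|im_start|>assistant\n")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Stop at the ChatML turn terminator if the tokenizer defines it.
im_end = tokenizer.convert_tokens_to_ids("<|im_end|>")
outputs = model.generate(**inputs, max_new_tokens=256, eos_token_id=im_end)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```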
## Evaluation
- **MT-Bench (German)**
+ ### MT-Bench (German)
```
########## First turn ##########
score
@@ -126,7 +145,7 @@ SauerkrautLM-3b-v1 2.581250
open_llama_3b_v2 1.456250
Llama-2-7b 1.181250
```
- **MT-Bench (English)**
+ ### MT-Bench (English)
```
########## First turn ##########
score
@@ -154,20 +173,20 @@ neural-chat-7b-v3-1 6.812500
```


- **Language Model evaluation Harness**
+ ### Language Model Evaluation Harness
Compared to Aleph Alpha Luminous Models:
![Harness](images/luminouscompare.PNG "SauerkrautLM-7b-HerO Harness")

*performed with the newest Language Model Evaluation Harness*
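These Harness numbers can in principle be reproduced via the harness's Python entry point. A sketch, assuming the `hf-causal` backend of older harness releases and illustrative task names (the CLI and API have changed between versions, so check your installed release):

```python
# Sketch: scoring the model with EleutherAI's lm-evaluation-harness.
# Repo id, backend name and task list are illustrative assumptions.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",  # HF causal-LM backend in older harness versions
    model_args="pretrained=VAGOsolutions/SauerkrautLM-7b-HerO",
    tasks=["arc_challenge", "hellaswag"],  # pick the benchmarks to reproduce
    batch_size=8,
)
print(results["results"])
```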
- **BBH**
+ ### BBH
![BBH](images/bbh.PNG "SauerkrautLM-7b-HerO BBH")
*performed with the newest Language Model Evaluation Harness*
- **GPT4ALL**
+ ### GPT4ALL
Compared to Aleph Alpha Luminous Models, LeoLM and EM_German:
![GPT4ALL diagram](images/gpt4alldiagram.PNG "SauerkrautLM-7b-HerO GPT4ALL Diagram")

![GPT4ALL table](images/gpt4alltable.PNG "SauerkrautLM-7b-HerO GPT4ALL Table")
- **Additional German Benchmark results**
+ ### Additional German Benchmark results
![GermanBenchmarks](images/germanbench.PNG "SauerkrautLM-7b-HerO German Benchmarks")
*performed with the newest Language Model Evaluation Harness*
## Disclaimer
 