Update README.md
README.md
CHANGED
````diff
@@ -34,7 +34,9 @@ Our approach ensures that the model retains its original strengths while acquiri
 3. [Evaluation](#evaluation)
    - [GPT4ALL](#gpt4all)
    - [Language Model evaluation Harness](#language-model-evaluation-harness)
-   - [BigBench](#
+   - [BigBench](#bbh)
+   - [MMLU](#mmlu)
+   - [TruthfulQA](#truthfulqa)
    - [MT-Bench (German)](#mt-bench-german)
    - [MT-Bench (English)](#mt-bench-english)
    - [Additional German Benchmark results](#additional-german-benchmark-results)
@@ -248,7 +250,10 @@ SauerkrautLM-3b-v1 2.581250
 open_llama_3b_v2 1.456250
 Llama-2-7b 1.181250
 ```
-
+### MMLU:
+![MMLU](https://vago-solutions.de/wp-content/uploads/2023/11/MMLU-Benchmark.png "SauerkrautLM-7b-HerO MMLU")
+### TruthfulQA:
+![TruthfulQA](https://vago-solutions.de/wp-content/uploads/2023/11/Truthfulqa-Benchmark.png "SauerkrautLM-7b-HerO TruthfulQA")
 
 ### MT-Bench (English):
 ![MT-Bench English Diagram](https://vago-solutions.de/wp-content/uploads/2023/11/MT-Bench-Englisch.png "SauerkrautLM-7b-HerO MT-Bench English Diagram")
````