jphme committed on
Commit
aa63a32
1 Parent(s): 52cc765

update readme

Files changed (2)
  1. README.md +139 -0
  2. em_model_logo_web.jpeg +0 -0
README.md CHANGED
@@ -1,3 +1,142 @@
1
  ---
2
  license: apache-2.0
3
  ---
1
  ---
2
+ inference: false
3
+ language:
4
+ - de
5
+ library_name: transformers
6
  license: apache-2.0
7
+ model_creator: jphme
8
+ model_name: EM German
9
+ model_type: mistral
10
+ pipeline_tag: text-generation
11
+ prompt_template: 'Du bist ein hilfreicher Assistent. USER: Was ist 1+1? ASSISTANT:'
12
+ tags:
13
+ - pytorch
14
+ - german
15
+ - deutsch
16
+ - mistral
17
+ - leolm
18
  ---
19
+ ![EM Logo](em_model_logo_web.jpeg)
20
+
21
+ In our opinion, this is the strongest open 7b model for German-language applications.
22
+
23
+ **Many thanks to the [LeoLM](https://huggingface.co/LeoLM) team for the publication of a base model that has received continued pretraining with German texts, greatly improving generation capabilities.**
24
+
25
+ *Please note that the Mistral architecture is very recent and still not supported by all libraries (e.g. AutoGPTQ). In case of any problems, please try a different format/base model.*
26
+
27
+ # Table of Contents
28
+
29
+ 1. [Introduction](#introduction)
30
+ 2. [Links & Demos](#links--demos)
31
+ - [Model Links](#model-links)
32
+ - [Demos](#demos)
33
+ 3. [Prompt Format](#prompt-format)
34
+ 4. [Example Output](#example-output)
35
+ 5. [Acknowledgements](#acknowledgements)
36
+ 6. [Contact](#contact)
37
+ 7. [Disclaimer](#disclaimer)
38
+
39
+ # Introduction
40
+
41
+ **EM German** is a Llama2/Mistral/LeoLM-based model family, fine-tuned on a large dataset of diverse German-language instructions. The models are optimized for German text and are proficient at understanding, generating, and interacting with German-language content.
42
+
43
+ We offer versions based on 7b, 13b and 70b Llama-2, Mistral and LeoLM (Llama-2/Mistral with continued pretraining on German texts) models.
44
+
45
+ All information on the EM German model family, including example outputs, the special RAG prompt format, and evaluation results, is available in [our GitHub repository](https://github.com/jphme/EM_German). ([Deutsche Version](https://github.com/jphme/EM_German/blob/main/README_DE.md))
46
+
47
+
48
+ # Links & Demos
49
+
50
+ ## Model Links
51
+
52
+ If you try only one model version, I strongly recommend the **LeoLM Mistral** model, which offers by far the best combination of performance and computing requirements!
53
+
54
+ | Base Model | HF | GPTQ | GGUF | AWQ |
55
+ |-------|-------|-------|-------|-------|
56
+ | Llama2 7b | [Link](https://huggingface.co/jphme/em_german_7b_v01) | [Link](https://huggingface.co/TheBloke/em_german_7b_v01-GPTQ) | [Link](https://huggingface.co/TheBloke/em_german_7b_v01-GGUF) | [Link](https://huggingface.co/TheBloke/em_german_7b_v01-AWQ) |
57
+ | Llama2 13b | [Link](https://huggingface.co/jphme/em_german_13b_v01) | [Link](https://huggingface.co/TheBloke/em_german_13b_v01-GPTQ) | [Link](https://huggingface.co/TheBloke/em_german_13b_v01-GGUF) | [Link](https://huggingface.co/TheBloke/em_german_13b_v01-AWQ) |
58
+ | Llama2 70b | [Link](https://huggingface.co/jphme/em_german_70b_v01) | [Link](https://huggingface.co/TheBloke/em_german_70b_v01-GPTQ) | [Link](https://huggingface.co/TheBloke/em_german_70b_v01-GGUF) | [Link](https://huggingface.co/TheBloke/em_german_70b_v01-AWQ) |
59
+ | [Mistral 7b](https://huggingface.co/mistralai/Mistral-7B-v0.1) | [Link](https://huggingface.co/jphme/em_german_mistral_v01) | [Link](https://huggingface.co/TheBloke/em_german_mistral_v01-GPTQ) | [Link](https://huggingface.co/TheBloke/em_german_mistral_v01-GGUF) | [Link](https://huggingface.co/TheBloke/em_german_mistral_v01-AWQ) |
60
+ | [LeoLM 7b](https://huggingface.co/LeoLM/leo-hessianai-7b) | [Link](https://huggingface.co/jphme/em_german_7b_leo) | [Link](https://huggingface.co/jphme/em_german_7b_leo_gptq) | [Link](https://huggingface.co/jphme/em_german_7b_leo_gguf) | tbc |
61
+ | [LeoLM 13b](https://huggingface.co/LeoLM/leo-hessianai-13b) | soon | soon | [Link](https://huggingface.co/jphme/em_german_13b_leo_gguf) | tbc |
62
+ | [LeoLM Mistral 7b](tbc) | [Link](https://huggingface.co/jphme/em_german_leo_mistral) | soon | [Link](https://huggingface.co/jphme/em_german_leo_mistral_gguf) | tbc |
63
+
64
+ ### Notes about the different versions:
65
+ See also the [comparison of example outputs](https://github.com/jphme/EM_German/blob/main/example_output_comparison.md) for a comparison of (7b) model capabilities.
66
+
67
+ If one EM German version gives unsatisfactory results, please try a different (and/or larger) model or version for your use case.
68
+
69
+
70
+ ## Demos:
71
+
72
+ You can use some of the models with **free** Google Colab instances (e.g. the 7b model in 8-bit or the 13b model with GPTQ):
73
+
74
+ * [Example Colab Notebook for 13b with GPTQ](https://colab.research.google.com/drive/1IJfJdVwGkfe5MYOqHptystR3FBeEUdGn?usp=sharing)
75
+ * [Example Colab Notebook for 7b with 8bit-Loading](https://colab.research.google.com/drive/1bsv6vkLM4AlCpSyXA6ol9P32zxZmf7Zu?usp=sharing)
76
+ * [Example Colab Notebook for 7b Mistral GGUF with Grammar-based structured output](https://colab.research.google.com/drive/17O-a3D4BnVc_V9Of3dFfed18v1ULu1Sv?usp=sharing)
77
+
78
+
79
+ # Prompt Format
80
+
81
+ This model follows the Vicuna format without linebreaks (but should work with linebreaks as well). The format is as follows:
82
+
83
+ ```
84
+ Du bist ein hilfreicher Assistent. USER: <instruction> ASSISTANT:
85
+ ```
86
+
87
+ You can swap the standard system prompt for a better-suited one (see the [GitHub repository](https://github.com/jphme/EM_German) for the RAG prompt format).
88
+
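The prompt format above can be assembled programmatically. The following is a minimal sketch, not part of the original model card: the helper name `build_prompt` and the constant `DEFAULT_SYSTEM_PROMPT` are hypothetical and chosen only for illustration. It joins the system prompt, the `USER:` turn, and the `ASSISTANT:` marker on a single line, matching the no-linebreak variant described above.

```python
# Hypothetical helper for illustration; the names are not part of the model repository.
DEFAULT_SYSTEM_PROMPT = "Du bist ein hilfreicher Assistent."


def build_prompt(instruction: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Return a single-line Vicuna-style prompt (no linebreaks)."""
    # Strip surrounding whitespace so the segments are separated by exactly one space.
    return f"{system_prompt.strip()} USER: {instruction.strip()} ASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("Was ist 1+1?"))
```

Passing a different `system_prompt` argument is one way to swap in a task-specific system prompt; the resulting string can then be tokenized and passed to whichever inference library you use.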
89
+ # Example Output
90
+
91
+ Please find a selection of various example outputs for all 7b versions of EM German [here](https://github.com/jphme/EM_German/blob/main/example_output_comparison.md).
92
+
93
+ **Prompt:**
94
+ > Erkläre die Bedeutung von drei wesentlichen Elementen der Bilanzierung und des Accountings nach IFRS für Unternehmen, die auf internationaler Ebene tätig sind.
95
+
96
+ **Response from EM German Leo Mistral 7b:**
97
+ ```
98
+ Die International Financial Reporting Standards (IFRS) sind eine Reihe von internationalen Rechnungslegungsstandards, die von der International Accounting Standards Board (IASB) entwickelt wurden. Sie sollen dazu beitragen, dass Finanzberichte vergleichbar, zuverlässig und relevant sind.
99
+
100
+ 1. Fair Value: Dieses Konzept ist ein zentraler Bestandteil der IFRS. Es besagt, dass Vermögenswerte und Schulden zum fairen Wert bilanziert werden sollten, was dem Preis entspricht, zu dem sie zwischen Marktteilnehmern gehandelt werden könnten. Dadurch wird sichergestellt, dass das Unternehmen eine genaue Darstellung seiner finanziellen Lage liefert.
101
+
102
+ 2. Konsolidierung: Die IFRS verlangen von Unternehmen, ihre Finanzergebnisse auf konsolidierter Basis darzustellen. Das bedeutet, dass alle Tochtergesellschaften und verbundenen Unternehmen in die Bilanz des Mutterunternehmens aufgenommen werden. Dadurch wird ein klareres Bild von der finanziellen Situation des Konzerns geschaffen.
103
+
104
+ 3. Vollständigkeitsprinzip: Die IFRS verlangen von Unternehmen, alle relevanten Informationen offenzulegen. Dies umfasst sowohl positive als auch negative Faktoren, wie zum Beispiel Gewinne und Verluste, Einnahmen und Ausgaben sowie Risiken und Unsicherheiten. Durch dieses Prinzip wird sichergestellt, dass Investoren, Kreditgeber und andere Stakeholder eine umfassende Einschätzung der finanziellen Lage des Unternehmens vornehmen können.
105
+ ```
106
+
107
+ **Prompt:**
108
+ > Ein Mensch atmet durchschnittlich 16 Mal pro Minute. Wie oft atmet er an einem Tag?
109
+
110
+ **Response from EM German Leo Mistral 7b:**
111
+ ```
112
+ Zuerst berechnen wir, wie oft eine Person in einer Stunde atmet. Wir wissen, dass eine Stunde 60 Minuten hat, also:
113
+
114
+ Atemzüge pro Stunde = Atemfrequenz * Minuten pro Stunde
115
+ = 16 Atemzüge pro Minute * 60 Minuten
116
+ = 960 Atemzüge
117
+
118
+ Um herauszufinden, wie viele Atemzüge eine Person an einem Tag macht, müssen wir diese stündliche Rate mit der Anzahl der Stunden an einem Tag multiplizieren, also 24:
119
+
120
+ Atemzüge pro Tag = Atemzüge pro Stunde * Stunden pro Tag
121
+ = 960 Atemzüge * 24 Stunden
122
+ = 23.040 Atemzüge
123
+
124
+ Also macht ein durchschnittlicher Mensch etwa 23.040 Atemzüge an einem Tag.
125
+ ```
126
+ (For more examples, please visit [our GitHub repository](https://github.com/jphme/EM_German).)
127
+
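The arithmetic in the breathing example (16 breaths per minute over one day) can be checked independently of the model. This is a small verification sketch added for illustration; the variable names are assumptions and do not come from the model card.

```python
# Independent check of the breathing example's arithmetic.
BREATHS_PER_MINUTE = 16
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24

breaths_per_hour = BREATHS_PER_MINUTE * MINUTES_PER_HOUR  # 960
breaths_per_day = breaths_per_hour * HOURS_PER_DAY        # 23040

print(breaths_per_hour, breaths_per_day)
```

Both intermediate values agree with the model's answer (960 per hour, 23.040 per day in German number formatting).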
128
+ # Acknowledgements:
129
+
130
+ Many thanks to [winglian/caseus](https://huggingface.co/winglian) for his great work on Axolotl, which I used to train the EM models. I am also grateful to [Jon Durbin](https://huggingface.co/jondurbin) and his [Airoboros](https://huggingface.co/jondurbin/airoboros-l2-70b-2.2.1) models and code, from which I borrowed many ideas and code snippets.
131
+ Additionally, many thanks to [Björn Plüster](https://huggingface.co/bjoernp) and the LeoLM team for the outstanding pretraining work on LeoLM, and last but not least, many thanks to [TheBloke](https://huggingface.co/TheBloke) for preparing quantized versions in all formats under the sun.
132
+ The 70b model was trained with the support of the [OVH Cloud Startup Program](https://startup.ovhcloud.com/en/).
133
+
134
+ # Contact
135
+
136
+ If you are interested in customized LLMs for business applications, please get in touch with me via [my website](https://www.jph.me). I am also always happy to receive suggestions and feedback.
137
+
138
+ *PS: We are also always interested in support for our startup [ellamind](https://ellamind.com), which will offer customized models for business applications in the future (we are currently still in stealth mode). If you use our models for business applications and have advanced needs for specialized capabilities, please get in touch.*
139
+
140
+ # Disclaimer:
141
+
142
+ I am not responsible for the actions of third parties who use this model or the outputs of the model. This model should only be used for research purposes. The original base model license applies and is distributed with the model files.
em_model_logo_web.jpeg ADDED