DavidGF committed on
Commit 8497b1e
1 Parent(s): 953b240

Update README.md

Files changed (1)
  1. README.md +29 -4
README.md CHANGED
@@ -7,7 +7,7 @@ library_name: transformers
  pipeline_tag: text-generation
  ---

- ![SauerkrautLM](images/SauerkrautLM.png "SauerkrautLM")
+ ![SauerkrautLM](images/hero-multi.png "SauerkrautLM-7b-HerO-multilingual")
  ## VAGO solutions SauerkrautLM
  Introducing SauerkrautLM-v1 - Your German Language Powerhouse!

@@ -37,9 +37,9 @@ Data augmentation techniques were used to grant grammatical, syntactical correct

  **Merge Procedure:**

- SauerkrautLM-7b-HerO was merged on 1 A100 with mergekit.
- The merged model contains [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) and [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
- We used the gradient slurp technique.
+ SauerkrautLM-7b-HerO was merged on 1 A100 with [mergekit](https://github.com/cg123/mergekit).
+ The merged model contains [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) and [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca).
+ We used the gradient SLERP method.


  - **Model Type:** SauerkrautLM-7b-HerO is an auto-regressive language model based on the transformer architecture
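For readers unfamiliar with the merge method named in the hunk above: gradient SLERP interpolates each pair of parent weight tensors along the unit hypersphere, with the interpolation factor t varying gradually across layers (favouring one parent early, the other late). Below is a minimal Python sketch of that arithmetic, assuming PyTorch state dicts and a hypothetical linear layer-wise schedule; the exact mergekit configuration used for SauerkrautLM-7b-HerO is not given in this card.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Falls back to plain linear interpolation when the tensors are
    (nearly) colinear, where the spherical formula is ill-conditioned.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_unit @ b_unit, -1.0, 1.0)
    omega = torch.arccos(dot)
    if omega.abs() < 1e-4:            # nearly colinear -> plain lerp
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        so = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / so) * a_flat \
               + (torch.sin(t * omega) / so) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

def gradient_schedule(layer_idx: int, num_layers: int) -> float:
    """Hypothetical 'gradient' part: t ramps from 0 (first parent)
    to 1 (second parent) across the transformer layers."""
    return layer_idx / max(num_layers - 1, 1)

def merge_state_dicts(sd_a: dict, sd_b: dict, num_layers: int = 32) -> dict:
    """Illustrative layer-by-layer merge of two matching state dicts."""
    merged = {}
    for name, tensor_a in sd_a.items():
        tensor_b = sd_b[name]
        # crude layer-index lookup for names like "model.layers.17.mlp..."
        parts = name.split(".")
        layer = int(parts[2]) if len(parts) > 2 and parts[2].isdigit() else 0
        t = gradient_schedule(layer, num_layers)
        merged[name] = slerp(t, tensor_a, tensor_b)
    return merged
```

In practice mergekit expresses the same idea through a declarative config for its slerp merge method rather than hand-written code; the sketch above only illustrates the underlying arithmetic.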
@@ -111,8 +111,33 @@ Please tell me about how merged models can benefit from existent top-models.<|im
  | | |ter | 0.6463|± |0.0039|
  |xnli_de | 0|acc | 0.4547|± |0.0070|
  |xnli_en | 0|acc | 0.5595|± |0.0070|
+ ```
+ **BBH**
+ ```
+ | Task |Version| Metric |Value | |Stderr|
+ |------------------------------------------------|------:|---------------------|-----:|---|-----:|
+ |bigbench_causal_judgement | 0|multiple_choice_grade|0.6053|± |0.0356|
+ |bigbench_date_understanding | 0|multiple_choice_grade|0.6992|± |0.0239|
+ |bigbench_disambiguation_qa | 0|multiple_choice_grade|0.3721|± |0.0302|
+ |bigbench_geometric_shapes | 0|multiple_choice_grade|0.1671|± |0.0197|
+ | | |exact_str_match |0.1003|± |0.0159|
+ |bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|0.2540|± |0.0195|
+ |bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|0.2043|± |0.0152|
+ |bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|0.4667|± |0.0289|
+ |bigbench_movie_recommendation | 0|multiple_choice_grade|0.3700|± |0.0216|
+ |bigbench_navigate | 0|multiple_choice_grade|0.4970|± |0.0158|
+ |bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|0.6965|± |0.0103|
+ |bigbench_ruin_names | 0|multiple_choice_grade|0.4152|± |0.0233|
+ |bigbench_salient_translation_error_detection | 0|multiple_choice_grade|0.1443|± |0.0111|
+ |bigbench_snarks | 0|multiple_choice_grade|0.6464|± |0.0356|
+ |bigbench_sports_understanding | 0|multiple_choice_grade|0.6846|± |0.0148|
+ |bigbench_temporal_sequences | 0|multiple_choice_grade|0.3150|± |0.0147|
+ |bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|0.2168|± |0.0117|
+ |bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|0.1537|± |0.0086|
+ |bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|0.4667|± |0.0289|
  ```

+
  ## Disclaimer
  We must inform users that despite our best efforts in data cleansing, the possibility of some such content slipping through cannot be entirely ruled out.
  However, we cannot guarantee consistently appropriate behavior. Therefore, if you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided.
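The card advertises pipeline_tag: text-generation, and the prompt fragment in the last hunk header ("Please tell me about how merged models can benefit from existent top-models.<|im…") suggests a ChatML-style chat format. The following is a minimal inference sketch with the transformers library; the repository id and the exact chat template are assumptions, not taken from this diff.

```python
# Minimal sketch, assuming the merged model is published under the
# hypothetical repo id "VAGOsolutions/SauerkrautLM-7b-HerO" and responds to a
# ChatML-style prompt, as the "<|im..." fragment in the hunk context suggests.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "VAGOsolutions/SauerkrautLM-7b-HerO"  # assumption, not stated in the diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 keeps the 7B weights around ~14 GB
    device_map="auto",           # requires the accelerate package
)

# ChatML-style prompt (assumed format), reusing the example question from the card.
prompt = (
    "<|im_start|>user\n"
    "Please tell me about how merged models can benefit from existent top-models.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(
        **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
    )

# Strip the prompt tokens and print only the generated continuation.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```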