nold committed on
Commit
94fb1c1
1 Parent(s): cebe13e

Files changed (2)
  1. README.md +259 -0
  2. test.log +12 -0
README.md ADDED
@@ -0,0 +1,259 @@
+ ---
+ license: apache-2.0
+ tags:
+ - Solar Moe
+ - Solar
+ - Lumosia
+ pipeline_tag: text-generation
+ model-index:
+ - name: Lumosia-v2-MoE-4x10.7
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 70.39
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Lumosia-v2-MoE-4x10.7
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 87.87
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Lumosia-v2-MoE-4x10.7
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 66.45
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Lumosia-v2-MoE-4x10.7
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 68.48
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Lumosia-v2-MoE-4x10.7
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 84.21
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Lumosia-v2-MoE-4x10.7
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 65.13
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Steelskull/Lumosia-v2-MoE-4x10.7
+       name: Open LLM Leaderboard
+ ---
+ # Lumosia-v2-MoE-4x10.7
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/fKdOLTQNerr2fYYnWOiQD.png)
+
+ The Lumosia Series has been upgraded with Lumosia V2.
+
+ # What's New in Lumosia V2?
+
+ Lumosia V2 takes the original vision of an "all-rounder" model and refines it with more nuanced capabilities.
+
+ Topic/Prompt-Based Approach:
+
+ Expert routing is based on topics and prompts, diverging from the keyword-based approach of its counterpart, Umbra.
+
+ Context and Coherence:
+
+ The model has a base context of 8k with a scrolling window, and it can maintain coherence up to 16k.
+
+ Balanced and Versatile:
+
+ The core ethos of Lumosia V2 is balance. It is designed to be your go-to assistant.
+
+ Experimentation and User-Centric Development:
+
+ Lumosia V2 remains an experimental model: a mosaic of the best-performing Solar models, selected based on user experience. This version is a testament to the idea that innovation is a journey, not a destination.
+
+ Come join the Discord:
+ [ConvexAI](https://discord.gg/yYqmNmg7Wj)
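
The scrolling-window behavior described above can be sketched as simple token-list truncation. This is only an illustration of the idea; `scroll_context` and the flat token list are hypothetical and not the model's actual attention mechanism:

```python
def scroll_context(tokens, max_len=8192):
    """Keep only the most recent max_len tokens, discarding the
    oldest ones as the conversation grows (a scrolling window)."""
    # Python slicing handles the short case: lists shorter than
    # max_len are returned unchanged.
    return tokens[-max_len:]

# With a tiny window of 4, only the newest 4 tokens survive.
recent = scroll_context(list(range(10)), max_len=4)
```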
+
+
+ Template:
+ ```
+ ### System:
+
+ ### USER:{prompt}
+
+ ### Assistant:
+ ```
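
In code, the template above can be rendered with a small helper. This is a hedged sketch: the helper name and the exact blank-line/trailing-newline placement are assumptions, not part of the model card:

```python
def build_prompt(user_prompt, system=""):
    """Render the card's prompt template for a single user turn."""
    return f"### System:\n{system}\n\n### USER:{user_prompt}\n\n### Assistant:\n"

prompt = build_prompt("What is a Mixture of Experts?")
```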
+
+
+ Settings:
+ ```
+ Temp: 1.0
+ min-p: 0.02-0.1
+ ```
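
min-p sampling keeps only tokens whose probability is at least `min-p` times the most likely token's probability, then renormalizes. A minimal sketch of the idea (illustrative only; real samplers operate on logits tensors, not dicts):

```python
def min_p_filter(probs, min_p=0.05):
    """Drop tokens below min_p * (top probability), renormalize the rest."""
    cutoff = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= cutoff}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# "zebra" falls below 0.05 * 0.5 = 0.025 and is removed.
filtered = min_p_filter({"the": 0.5, "a": 0.3, "zebra": 0.01}, min_p=0.05)
```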
+
+ ## Evals:
+
+ * Avg: 73.75
+ * ARC: 70.39
+ * HellaSwag: 87.87
+ * MMLU: 66.45
+ * T-QA: 68.48
+ * Winogrande: 84.21
+ * GSM8K: 65.13
+
+ ## Examples:
+ ```
+ Example 1:
+
+ User:
+
+ Lumosia:
+
+ ```
+ ```
+ Example 2:
+
+ User:
+
+ Lumosia:
+
+ ```
+
+ ## 🧩 Configuration
+
+ ```yaml
+ base_model: DopeorNope/SOLARC-M-10.7B
+ gate_mode: hidden
+ dtype: bfloat16
+
+ experts:
+   - source_model: DopeorNope/SOLARC-M-10.7B
+     positive_prompts:
+
+     negative_prompts:
+
+   - source_model: Sao10K/Fimbulvetr-10.7B-v1 [Updated]
+     positive_prompts:
+
+     negative_prompts:
+
+   - source_model: jeonsworld/CarbonVillain-en-10.7B-v4 [Updated]
+     positive_prompts:
+
+     negative_prompts:
+
+   - source_model: kyujinpy/Sakura-SOLAR-Instruct
+     positive_prompts:
+
+     negative_prompts:
+ ```
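
`gate_mode: hidden` means the router derives expert weights from hidden states. The top-k selection step itself can be sketched in plain Python. This is an illustrative sketch of generic MoE top-k gating, not mergekit's or this model's exact implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(router_logits, k=2):
    """Select the k highest-probability experts and renormalize
    their gate weights so they sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return {i: probs[i] / total for i in top}

# Four experts, as in the config above; the logits are made up.
gates = top_k_gate([2.0, 0.5, 1.5, -1.0], k=2)
```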
216
+
217
+ ## 💻 Usage
218
+
219
+ ```
220
+ python
221
+ !pip install -qU transformers bitsandbytes accelerate
222
+
223
+ from transformers import AutoTokenizer
224
+ import transformers
225
+ import torch
226
+
227
+ model = "Steelskull/Lumosia-v2-MoE-4x10.7"
228
+
229
+ tokenizer = AutoTokenizer.from_pretrained(model)
230
+ pipeline = transformers.pipeline(
231
+ "text-generation",
232
+ model=model,
233
+ model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
234
+ )
235
+
236
+ messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
237
+ prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
238
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
239
+ print(outputs[0]["generated_text"])
240
+ ```
241
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
242
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Steelskull__Lumosia-v2-MoE-4x10.7)
243
+
244
+ | Metric |Value|
245
+ |---------------------------------|----:|
246
+ |Avg. |73.75|
247
+ |AI2 Reasoning Challenge (25-Shot)|70.39|
248
+ |HellaSwag (10-Shot) |87.87|
249
+ |MMLU (5-Shot) |66.45|
250
+ |TruthfulQA (0-shot) |68.48|
251
+ |Winogrande (5-shot) |84.21|
252
+ |GSM8k (5-shot) |65.13|
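
The reported average is simply the mean of the six benchmark scores; a quick check (values copied from the table above):

```python
scores = {
    "ARC": 70.39,
    "HellaSwag": 87.87,
    "MMLU": 66.45,
    "TruthfulQA": 68.48,
    "Winogrande": 84.21,
    "GSM8k": 65.13,
}

# Mean of the six benchmarks; about 73.75, matching the table.
average = sum(scores.values()) / len(scores)
```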
+
+
+ ***
+
+ Quantization of model [Steelskull/Lumosia-v2-MoE-4x10.7](https://huggingface.co/Steelskull/Lumosia-v2-MoE-4x10.7).
+ Created using the [llm-quantizer](https://github.com/Nold360/llm-quantizer) pipeline.
test.log ADDED
@@ -0,0 +1,12 @@
+ What is a Large Language Model?
+ Large language models (LLMs) are AI systems that use deep learning techniques to generate human-like text, speech, or images. They are trained on large datasets of text, and,
+ sometimes called generative pre- ...
+ Question:
+ are increasingly powerful tools in
+ ,
+ naturally generated content,
+
+
+ ai text or voice andquot;language models that
+ ­—text, —or other multimprose to
+ 207820 Comments are used fora type of ­–––––such as a large-­—and can generate human- /******/