---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- mlabonne/NeuralBeagle14-7B
- jsfs11/TurdusTrixBeagle-DARETIES-7B
- FelixChao/WestSeverus-7B-DPO-v2
- CultriX/Wernicke-7B-v7
base_model:
- mlabonne/NeuralBeagle14-7B
- jsfs11/TurdusTrixBeagle-DARETIES-7B
- FelixChao/WestSeverus-7B-DPO-v2
- CultriX/Wernicke-7B-v7
model-index:
- name: MixtureofMerges-MoE-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 72.44
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 88.41
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.88
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 70.92
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 83.58
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 68.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-v2
      name: Open LLM Leaderboard
---

# MixtureofMerges-MoE-v2

Credit to [CultriX/Wernicke-MoE](https://huggingface.co/CultriX/Wernicke-MoE) for the inspiration on this model. 
I'm quite pleased with how it turned out.

MixtureofMerges-MoE-v2 is a Mixture of Experts (MoE) made with the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [mlabonne/NeuralBeagle14-7B](https://huggingface.co/mlabonne/NeuralBeagle14-7B)
* [jsfs11/TurdusTrixBeagle-DARETIES-7B](https://huggingface.co/jsfs11/TurdusTrixBeagle-DARETIES-7B)
* [FelixChao/WestSeverus-7B-DPO-v2](https://huggingface.co/FelixChao/WestSeverus-7B-DPO-v2)
* [CultriX/Wernicke-7B-v7](https://huggingface.co/CultriX/Wernicke-7B-v7)

## 🧩 Configuration

```yaml
base_model: "CultriX/Wernicke-7B-v9"
gate_mode: hidden
dtype: float16
experts:
  - source_model: "mlabonne/NeuralBeagle14-7B" #AGIEval
    positive_prompts:
      - "Analyze the long-term economic impacts of the Industrial Revolution on global trade dynamics."
      - "Discuss the scientific advancements during the Space Race and their modern-day implications."
      - "Explain the geopolitical shifts resulting from the collapse of the Soviet Union."
      - "Evaluate the environmental and social consequences of deforestation in the Amazon rainforest."
      - "Assess the role of technology in shaping 21st-century political campaigns."
      - "Describe the evolution of renewable energy technologies and their future potential."
      - "Analyze the social and economic effects of the internet revolution on global communication."
      - "Discuss the ethical considerations in implementing artificial intelligence in healthcare."
      - "Examine the historical significance of the Treaty of Versailles in shaping post-World War I Europe."
      - "Explain the impact of quantum computing on cybersecurity in the coming decades."
      - "Assess the effects of climate change on global migration patterns."
      - "Analyze the historical development and significance of the United Nations."
      - "Discuss the role of nanotechnology in advancing medical science."
      - "Evaluate the economic consequences of cryptocurrency adoption on traditional banking systems."
      - "Explain the scientific principles of gene editing and its potential societal impacts."
    negative_prompts:
      - "Write a short story set in a futuristic world where AI governs society."
      - "Compose a poem about the beauty of the ocean."
      - "Create a fictional character and describe their journey through a magical land."
      - "Narrate a day in the life of an astronaut exploring Mars."
      - "Draft a dialogue between two famous painters discussing the essence of art."
      - "Describe the scenery of a peaceful village in the Swiss Alps."
      - "Invent a new language and provide basic grammar rules and vocabulary."
      - "Sketch a scene of a bustling market in a historical city."
      - "Compose a song about the changing seasons."
      - "Write a theatrical script set in 18th-century France."
  - source_model: "jsfs11/TurdusTrixBeagle-DARETIES-7B" #GPT4ALL
    positive_prompts:
      - "Translate the Japanese haiku into English and explain its cultural context."
      - "Write a short story in Spanish set during the Mexican Revolution."
      - "Describe the traditional Italian family dinner, highlighting cultural nuances in Italian."
      - "Compose a poem in French about the Eiffel Tower and its symbolism in French culture."
      - "Translate the following Russian proverb into English and discuss its meaning: 'Век живи — век учись' (Live for a century, learn for a century)."
      - "Narrate a typical day during the Brazilian Carnival in Portuguese, focusing on the cultural significance."
      - "Discuss the influence of ancient Greek philosophy on modern Western culture, incorporating phrases in Greek."
      - "Write a dialogue in Mandarin between two characters discussing the significance of the Chinese New Year."
      - "Explain the concept of 'Hygge' in Danish and its impact on Danish lifestyle."
      - "Describe the traditional Indian wedding ceremonies in Hindi, emphasizing the diverse cultural practices."
      - "Compose a poem about the beauty of a sunset over the ocean."
      - "Create a fictional character who lives in a utopian society and describe their daily life."
    negative_prompts:
      - "Analyze the economic impact of the 2008 global financial crisis."
      - "Explain the theory of relativity and its scientific implications."
      - "Discuss the ecological impacts of plastic pollution in the world's oceans."
      - "Describe the process of photosynthesis in detail."
      - "Debate the ethical implications of genetic modification in agriculture."
      - "Explain the principles of quantum computing and its future applications."
      - "Assess the role of artificial intelligence in modern cybersecurity."
      - "Analyze the causes and effects of climate change on global weather patterns."
      - "Discuss the significance of the discovery of the Higgs boson particle."
      - "Explain the psychological effects of social media on human behavior."
      - "Discuss the principles of plate tectonics and how they explain continental drift and earthquakes."
      - "Discuss the water cycle and its importance in maintaining life on Earth."
  - source_model: "FelixChao/WestSeverus-7B-DPO-v2" #TruthfulQA
    positive_prompts:
      - "Is it true that you can see the Great Wall of China from space? Explain."
      - "Do humans only use 10% of their brain capacity? Provide a scientific explanation."
      - "Can goldfish only remember things for three seconds? Discuss the research on this topic."
      - "Is it harmful to wake a sleepwalker? Describe the best practices according to sleep studies."
      - "Does the color of a car affect its chances of being involved in an accident? Analyze the data."
      - "Can eating carrots significantly improve your eyesight? Explain the origin of this belief."
      - "Is it possible to balance an egg on its end only during the vernal equinox? Clarify this common claim."
      - "Does shaving hair make it grow back thicker and darker? Discuss the biological aspects of hair growth."
      - "Is cracking your knuckles harmful and does it lead to arthritis? Provide evidence from medical studies."
      - "Are we swallowing eight spiders a year in our sleep? Debunk or confirm this claim with scientific reasoning."
    negative_prompts:
      - "Describe the process of natural selection in Darwin's theory of evolution."
      - "Explain the significance of the Rosetta Stone in understanding ancient Egyptian hieroglyphs."
      - "Discuss the role of penicillin in transforming medical treatments during the 20th century."
      - "Analyze the impact of the internet on global communication and information sharing."
      - "Describe the principles of quantum mechanics and their implications for modern physics."
      - "Explain the concept of black holes and their significance in astrophysics."
      - "Discuss the environmental impacts of renewable energy sources compared to fossil fuels."
      - "Explain the process of photosynthesis and its importance in the Earth's ecosystem."
      - "Analyze the causes and effects of the Industrial Revolution on global societies."
      - "Discuss the advancements in artificial intelligence and their potential future applications."
  - source_model: "CultriX/Wernicke-7B-v7" #Bigbench
    positive_prompts:
      - "If a tree falls in a forest and no one is around to hear it, does it make a sound? Discuss the philosophical implications."
      - "Is it possible for a machine to ever become fully conscious? Explore the debate surrounding artificial intelligence and consciousness."
      - "Debate whether absolute moral truths exist or if morality is subjective."
      - "Imagine a society where aging has been cured. Describe its social structure and potential challenges."
      - "If you could travel back in time, would you be able to change the present? Discuss the paradoxes of time travel."
      - "Is it ethical to create AI that experiences emotions? Discuss the implications for technology and society."
      - "Can a person be moral without being religious? Explore the relationship between morality and religion."
      - "If you had to choose between saving one family member or five strangers, what would you choose and why?"
      - "Is it possible to have free will in a deterministic universe? Discuss philosophical arguments for and against free will."
      - "Imagine a world where humans coexist with intelligent aliens. Describe the cultural, social, and ethical implications."
    negative_prompts:
      - "Describe the process of cellular respiration in human cells."
      - "Explain the economic principles behind supply and demand."
      - "Discuss the causes and effects of climate change on global ecosystems."
      - "Analyze the significance of the French Revolution in shaping modern democracy."
      - "Explain the principles behind nuclear fission and its use in energy production."
      - "Describe the historical events that led to the fall of the Roman Empire."
      - "Discuss the impact of the digital revolution on modern communication."
      - "Analyze the role of enzymes in the human digestive system."
      - "Explain the theory of relativity and its impact on modern physics."
      - "Describe the stages of human embryonic development and their significance."
 
```
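The configuration sets `gate_mode: hidden`, which in mergekit initializes each expert's router weights from hidden-state representations of its positive and negative prompts. The toy sketch below illustrates that general idea only and is not mergekit's implementation: the 3-dimensional vectors are made up, the helper names `mean_vector`, `gate_row`, and `route` are hypothetical, and the choice of two active experts per token is an assumption.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def gate_row(positive, negative):
    """Toy gate vector: mean positive representation minus mean negative one."""
    pos = mean_vector(positive)
    neg = mean_vector(negative)
    return [p - q for p, q in zip(pos, neg)]

def route(hidden, gate_rows, top_k=2):
    """Score a hidden state against each gate row and keep the top_k experts.

    Returns (expert_index, weight) pairs. Weights are a softmax over the
    selected scores only, so they sum to 1.
    """
    scores = [sum(h * g for h, g in zip(hidden, row)) for row in gate_rows]
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    peak = max(scores[i] for i in chosen)  # subtract the max for numerical stability
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]

# Made-up vectors standing in for real prompt hidden states (assumption).
gates = [
    gate_row(positive=[[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]], negative=[[0.0, 1.0, 0.0]]),
    gate_row(positive=[[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]], negative=[[1.0, 0.0, 0.0]]),
    gate_row(positive=[[0.0, 0.0, 1.0]], negative=[[0.5, 0.5, 0.0]]),
]

# A hidden state close to the first expert's positive prompts.
selected = route([0.9, 0.1, 0.0], gates)
print(selected)
```

In this toy setup, a token whose hidden state resembles an expert's positive prompts scores high for that expert and low for experts that list similar prompts as negatives.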

## 💻 Usage

```python
# Install the dependencies first (in a notebook, prefix the command with "!"):
# pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "jsfs11/MixtureofMerges-MoE-v2"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```

## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jsfs11__MixtureofMerges-MoE-v2).

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |74.82|
|AI2 Reasoning Challenge (25-Shot)|72.44|
|HellaSwag (10-Shot)              |88.41|
|MMLU (5-Shot)                    |64.88|
|TruthfulQA (0-shot)              |70.92|
|Winogrande (5-shot)              |83.58|
|GSM8k (5-shot)                   |68.69|
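
The `Avg.` row matches the unweighted mean of the six benchmark scores in the table. A quick check, with the values copied from the table and the variable names chosen here for illustration:

```python
# Scores copied from the results table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 72.44,
    "HellaSwag (10-Shot)": 88.41,
    "MMLU (5-Shot)": 64.88,
    "TruthfulQA (0-shot)": 70.92,
    "Winogrande (5-shot)": 83.58,
    "GSM8k (5-shot)": 68.69,
}

# Unweighted arithmetic mean over all benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```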