mlabonne committed
Commit 606e579 • 1 Parent(s): 02a7893

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +37 -3
  2. beyonder-4x7b-v3.Q5_K_M.gguf +2 -2
README.md CHANGED
@@ -15,14 +15,45 @@ base_model:
  - mlabonne/NeuralDaredevil-7B
  ---
 
- # Beyonder-4x7B-v3
-
- Beyonder-4x7B-v3 is a Mixture of Experts (MoE) made with the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/61b8e2ba285851687028d395/9XVgxKyuXTQVO5mO-EOd4.jpeg)
+
+ # 🔮 Beyonder-4x7B-v3
+
+ Beyonder-4x7B-v3 is an improvement over the popular [Beyonder-4x7B-v2](https://huggingface.co/mlabonne/Beyonder-4x7B-v2). It's a Mixture of Experts (MoE) made with the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
  * [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B)
  * [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B)
  * [SanjiWatsuki/Kunoichi-DPO-v2-7B](https://huggingface.co/SanjiWatsuki/Kunoichi-DPO-v2-7B)
  * [mlabonne/NeuralDaredevil-7B](https://huggingface.co/mlabonne/NeuralDaredevil-7B)
 
+ ## 🔍 Applications
+
+ This model uses a context window of 8k. I recommend using it with the Mistral Instruct chat template (it works perfectly with LM Studio).
+
+ If you use SillyTavern, you might want to tweak the inference parameters. Here's what LM Studio uses as a reference: `temp` 0.8, `top_k` 40, `top_p` 0.95, `min_p` 0.05, `repeat_penalty` 1.1.
+
+ Thanks to its four experts, it's a well-rounded model capable of handling most tasks. Since two experts are always used to generate an answer, every task benefits from the other experts' capabilities, such as chat with role-play or math with code.
+
+ ## ⚡ Quantized models
+
+ * **GGUF**: https://huggingface.co/mlabonne/Beyonder-4x7B-v3-GGUF
+
+ ## 🏆 Evaluation
+
+ ### Nous
+
+ Beyonder-4x7B-v3 is one of the best models on Nous' benchmark suite (evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval)) and significantly outperforms the v2. See the entire leaderboard [here](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
+
+ | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
+ |---|---:|---:|---:|---:|---:|
+ | [mlabonne/AlphaMonarch-7B](https://huggingface.co/mlabonne/AlphaMonarch-7B) [📄](https://gist.github.com/mlabonne/1d33c86824b3a11d2308e36db1ba41c1) | 62.74 | 45.37 | 77.01 | 78.39 | 50.2 |
+ | [**mlabonne/Beyonder-4x7B-v3**](https://huggingface.co/mlabonne/Beyonder-4x7B-v3) [📄](https://gist.github.com/mlabonne/3740020807e559f7057c32e85ce42d92) | **61.91** | **45.85** | **76.67** | **74.98** | **50.12** |
+ | [mlabonne/NeuralDaredevil-7B](https://huggingface.co/mlabonne/NeuralDaredevil-7B) [📄](https://gist.github.com/mlabonne/cbeb077d1df71cb81c78f742f19f4155) | 59.39 | 45.23 | 76.2 | 67.61 | 48.52 |
+ | [mlabonne/Beyonder-4x7B-v2](https://huggingface.co/mlabonne/Beyonder-4x7B-v2) [📄](https://gist.github.com/mlabonne/f73baa140a510a676242f8a4496d05ca) | 57.13 | 45.29 | 75.95 | 60.86 | 46.4 |
+
+ ### Open LLM Leaderboard
+
+ Running...
+
  ## 🧩 Configuration
 
  ```yaml
@@ -80,4 +111,7 @@ messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in
  prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
  outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
  print(outputs[0]["generated_text"])
- ```
+ ```
+ Output:
+
+ > A Mixture of Experts (MoE) is a neural network architecture that tackles complex tasks by dividing them into simpler subtasks, delegating each to specialized expert modules. These experts learn to independently handle specific problem aspects. The MoE structure combines their outputs, leveraging their expertise for improved overall performance. This approach promotes modularity, adaptability, and scalability, allowing for better generalization in various applications.
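The new 🔍 Applications section above recommends the Mistral Instruct chat template, an 8k context window, and LM Studio-style sampling parameters for this GGUF build. As a rough illustration, here is a minimal sketch using llama-cpp-python; the library choice, the local file path, and the prompt are assumptions for illustration, not part of this commit.

```python
# Minimal sketch (assumption: llama-cpp-python is installed and the Q5_K_M file
# from this repo has already been downloaded locally). Parameter names follow
# the llama-cpp-python API; values mirror the LM Studio reference settings
# quoted in the README above.
from llama_cpp import Llama

llm = Llama(
    model_path="beyonder-4x7b-v3.Q5_K_M.gguf",  # local path to the file in this repo
    n_ctx=8192,                                  # README recommends an 8k context window
)

# Mistral Instruct chat template: the user turn is wrapped in [INST] ... [/INST]
prompt = "[INST] Explain what a Mixture of Experts is. [/INST]"

output = llm(
    prompt,
    max_tokens=256,
    temperature=0.8,     # LM Studio reference values from the README
    top_k=40,
    top_p=0.95,
    min_p=0.05,
    repeat_penalty=1.1,
)
print(output["choices"][0]["text"])
```

The same sampler values can be entered directly in LM Studio or SillyTavern if you prefer a UI.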
beyonder-4x7b-v3.Q5_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:2d77af454e2951893077e451f70319ae726edd9062267b5269d17fc0cb928c0f
- size 17133430752
+ oid sha256:efd38cc2385eff44b1597a198bda6396e9352e536d097ef34c53036a409501cc
+ size 17133430816
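This commit also replaces the Q5_K_M weights themselves: the LFS pointer above now lists a new oid and size. To confirm a local download matches the updated file, a minimal sketch along these lines works; it assumes huggingface_hub is installed and takes the repo id from the GGUF link in the README.

```python
# Sketch: download the updated Q5_K_M file and verify it against the new LFS
# pointer from this commit (oid sha256 and size, copied verbatim from the diff).
# Assumes huggingface_hub is installed; the repo id comes from the README's GGUF link.
import hashlib
import os

from huggingface_hub import hf_hub_download

EXPECTED_SHA256 = "efd38cc2385eff44b1597a198bda6396e9352e536d097ef34c53036a409501cc"
EXPECTED_SIZE = 17133430816  # bytes, from the updated pointer

path = hf_hub_download(
    repo_id="mlabonne/Beyonder-4x7B-v3-GGUF",
    filename="beyonder-4x7b-v3.Q5_K_M.gguf",
)

assert os.path.getsize(path) == EXPECTED_SIZE, "size does not match the new pointer"

sha256 = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha256.update(chunk)

assert sha256.hexdigest() == EXPECTED_SHA256, "checksum does not match the new pointer"
print("OK: file matches the updated LFS pointer")
```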