---
tags:
- moe
- llama
- '3'
- llama 3
- 4x8b
---
<img src="https://i.imgur.com/MlnauLb.jpeg" width="640"/>

# Llama-3-Peach-Instruct-4x8B-MoE

GGUF files are available here: [RDson/Llama-3-Peach-Instruct-4x8B-MoE-GGUF](https://huggingface.co/RDson/Llama-3-Peach-Instruct-4x8B-MoE-GGUF).

This is an experimental MoE model created with Mergekit from:

* [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
* [Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R](https://huggingface.co/Salesforce/SFR-Iterative-DPO-LLaMA-3-8B-R)
* [NousResearch/Hermes-2-Theta-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B)
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder)

Mergekit YAML file:

```yaml
base_model: Meta-Llama-3-8B-Instruct
experts:
  - source_model: Meta-Llama-3-8B-Instruct
    positive_prompts:
      - "explain"
      - "chat"
      - "assistant"
      - "think"
      - "roleplay"
      - "versatile"
      - "helpful"
      - "factual"
      - "integrated"
      - "adaptive"
      - "comprehensive"
      - "balanced"
    negative_prompts:
      - "specialized"
      - "narrow"
      - "focused"
      - "limited"
      - "specific"
  - source_model: Llama-3-8B-Instruct-Coder
    positive_prompts:
      - "python"
      - "math"
      - "solve"
      - "code"
      - "programming"
      - "javascript"
      - "algorithm"
      - "factual"
    negative_prompts:
      - "sorry"
      - "cannot"
      - "concise"
      - "imaginative"
      - "creative"
  - source_model: SFR-Iterative-DPO-LLaMA-3-8B-R
    positive_prompts:
      - "AI"
      - "instructive"
      - "chat"
      - "assistant"
      - "clear"
      - "directive"
      - "helpful"
      - "informative"
  - source_model: Hermes-2-Theta-Llama-3-8B
    positive_prompts:
      - "chat"
      - "assistant"
      - "analytical"
      - "accurate"
      - "code"
      - "logical"
      - "knowledgeable"
      - "precise"
      - "calculate"
      - "compute"
      - "solve"
      - "work"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
      - "tell me"
      - "assistant"
      - "factual"
    negative_prompts:
      - "abstract"
      - "artistic"
      - "emotional"
      - "mistake"
      - "inaccurate"
gate_mode: hidden
dtype: float16
```
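The positive/negative prompt lists drive expert routing: with `gate_mode: hidden`, Mergekit initializes each expert's router weights from hidden-state representations of that expert's prompts, so inputs resembling the positive list score the expert up and the negative list pushes it down. Below is a toy, pure-Python illustration of that idea only; the `embed` function is a hypothetical stand-in (letter frequencies) for real model hidden states, and none of this is Mergekit's actual code.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def embed(text):
    # Toy stand-in for a hidden-state embedding: normalized letter
    # frequencies over a fixed alphabet. Mergekit uses the base
    # model's real hidden states instead.
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    total = sum(counts.values()) or 1
    return [counts[ch] / total for ch in ALPHABET]

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def gate_vector(positive, negative=()):
    # Router-init idea: point the gate toward the positive prompts
    # and away from the negative ones.
    pos = mean([embed(p) for p in positive])
    if not negative:
        return pos
    neg = mean([embed(p) for p in negative])
    return [p - n for p, n in zip(pos, neg)]

# Gate for the coder expert, using a few prompts from the config above.
coder_gate = gate_vector(["python", "code", "algorithm"],
                         ["sorry", "cannot"])

# Score an input token sequence against this expert's gate (dot product).
score = sum(g * e for g, e in
            zip(coder_gate, embed("write some python code")))
print(f"coder expert score: {score:.4f}")
```

In the real merge each expert gets one such gate row per layer, and the four rows together form the router matrix.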
Some inspiration for the Mergekit YAML file came from [LoneStriker/Umbra-MoE-4x10.7-2.4bpw-h6-exl2](https://huggingface.co/LoneStriker/Umbra-MoE-4x10.7-2.4bpw-h6-exl2).
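At inference, a Mergekit 4x8B merge typically behaves like a Mixtral-style MoE: per token, the router scores the four experts and mixes the top two by softmax weight. A minimal numerical sketch of that top-2 routing step, with made-up logits rather than real router outputs:

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

experts = ["Meta-Llama-3-8B-Instruct", "Llama-3-8B-Instruct-Coder",
           "SFR-Iterative-DPO-LLaMA-3-8B-R", "Hermes-2-Theta-Llama-3-8B"]

# Hypothetical router logits for one token over the four experts.
logits = [1.2, 3.1, 0.4, 2.5]

# Top-2 routing: keep the two highest-scoring experts and
# renormalize their scores with softmax; the rest contribute nothing.
top2 = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
weights = softmax([logits[i] for i in top2])

for i, w in zip(top2, weights):
    print(f"{experts[i]}: {w:.3f}")
# → Llama-3-8B-Instruct-Coder: 0.646
# → Hermes-2-Theta-Llama-3-8B: 0.354
```

The token's output is then the weight-averaged sum of the two selected experts' MLP outputs, which is why only ~2 of the 4 expert MLPs run per token.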