RichardErkhov committed
Commit db622c6
1 Parent(s): 4dc579f

uploaded readme

Files changed (1):
  1. README.md +334 -0

README.md ADDED
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

CosmicBun-8B - GGUF
- Model creator: https://huggingface.co/aloobun/
- Original model: https://huggingface.co/aloobun/CosmicBun-8B/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [CosmicBun-8B.Q2_K.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q2_K.gguf) | Q2_K | 2.96GB |
| [CosmicBun-8B.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.IQ3_XS.gguf) | IQ3_XS | 3.28GB |
| [CosmicBun-8B.IQ3_S.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.IQ3_S.gguf) | IQ3_S | 3.43GB |
| [CosmicBun-8B.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q3_K_S.gguf) | Q3_K_S | 3.41GB |
| [CosmicBun-8B.IQ3_M.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.IQ3_M.gguf) | IQ3_M | 3.52GB |
| [CosmicBun-8B.Q3_K.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q3_K.gguf) | Q3_K | 3.74GB |
| [CosmicBun-8B.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q3_K_M.gguf) | Q3_K_M | 3.74GB |
| [CosmicBun-8B.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q3_K_L.gguf) | Q3_K_L | 4.03GB |
| [CosmicBun-8B.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.IQ4_XS.gguf) | IQ4_XS | 4.18GB |
| [CosmicBun-8B.Q4_0.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q4_0.gguf) | Q4_0 | 4.34GB |
| [CosmicBun-8B.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.IQ4_NL.gguf) | IQ4_NL | 4.38GB |
| [CosmicBun-8B.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q4_K_S.gguf) | Q4_K_S | 4.37GB |
| [CosmicBun-8B.Q4_K.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q4_K.gguf) | Q4_K | 4.58GB |
| [CosmicBun-8B.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q4_K_M.gguf) | Q4_K_M | 4.58GB |
| [CosmicBun-8B.Q4_1.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q4_1.gguf) | Q4_1 | 4.78GB |
| [CosmicBun-8B.Q5_0.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q5_0.gguf) | Q5_0 | 5.21GB |
| [CosmicBun-8B.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q5_K_S.gguf) | Q5_K_S | 5.21GB |
| [CosmicBun-8B.Q5_K.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q5_K.gguf) | Q5_K | 5.34GB |
| [CosmicBun-8B.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q5_K_M.gguf) | Q5_K_M | 5.34GB |
| [CosmicBun-8B.Q5_1.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q5_1.gguf) | Q5_1 | 5.65GB |
| [CosmicBun-8B.Q6_K.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q6_K.gguf) | Q6_K | 6.14GB |
| [CosmicBun-8B.Q8_0.gguf](https://huggingface.co/RichardErkhov/aloobun_-_CosmicBun-8B-gguf/blob/main/CosmicBun-8B.Q8_0.gguf) | Q8_0 | 7.95GB |
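
To try one of these quants, download the file and load it with any GGUF-capable runtime. A minimal sketch using `huggingface_hub` and `llama-cpp-python` (the chosen quant, prompt, and generation settings are illustrative, not tuned recommendations):

```python
# Minimal sketch: fetch one quant from this repo and run it locally.
# Requires: pip install huggingface_hub llama-cpp-python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download a single GGUF file (pick any filename from the table above).
model_path = hf_hub_download(
    repo_id="RichardErkhov/aloobun_-_CosmicBun-8B-gguf",
    filename="CosmicBun-8B.Q4_K_M.gguf",
)

llm = Llama(model_path=model_path, n_ctx=4096)  # context size is illustrative
out = llm("Q: Why is the sky blue?\nA:", max_tokens=128, stop=["Q:"])
print(out["choices"][0]["text"])
```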

Original model description:
---
license: mit
library_name: transformers
tags:
- mergekit
- merge
- math
- llama3
- physics
- chemistry
- biology
- dolphin
base_model:
- cognitivecomputations/dolphin-2.9-llama3-8b
- Weyaxi/Einstein-v6.1-Llama3-8B
- Locutusque/llama-3-neural-chat-v1-8b
model-index:
- name: CosmicBun-8B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 61.86
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/CosmicBun-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.29
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/CosmicBun-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.53
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/CosmicBun-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 54.08
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/CosmicBun-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.85
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/CosmicBun-8B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 68.23
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/CosmicBun-8B
      name: Open LLM Leaderboard
---
# CosmicBun-8B

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with [Locutusque/llama-3-neural-chat-v1-8b](https://huggingface.co/Locutusque/llama-3-neural-chat-v1-8b) as the base model.
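
In DARE-TIES, each fine-tune contributes a task vector (its weights minus the base); DARE randomly drops entries of that vector with probability 1 - density and rescales the survivors by 1/density, and TIES then resolves sign disagreements across models before the weighted sum is added back to the base weights. A toy NumPy sketch of that idea for a single tensor, using the density/weight values from the first slice of the config below (illustration only; mergekit's actual implementation differs in detail, and `dare_ties_merge` and its toy inputs are invented here):

```python
# Toy illustration of DARE-TIES for one weight tensor. Not mergekit's code.
import numpy as np

rng = np.random.default_rng(0)

def dare_ties_merge(base, tuned, densities, weights):
    deltas = []
    for model, density, weight in zip(tuned, densities, weights):
        delta = model - base                      # task vector
        keep = rng.random(delta.shape) < density  # DARE: drop with prob 1 - density
        deltas.append(weight * np.where(keep, delta / density, 0.0))  # rescale survivors
    stacked = np.stack(deltas)
    sign = np.sign(stacked.sum(axis=0))           # TIES: elect dominant sign per entry
    agree = np.sign(stacked) == sign
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0)  # drop disagreeing terms
    return base + merged_delta

# Toy inputs: one 4x4 "layer" from three fine-tunes; densities/weights
# taken from the [0, 4] slice of the config below.
base = rng.normal(size=(4, 4))
tuned = [base + rng.normal(scale=0.1, size=(4, 4)) for _ in range(3)]
merged = dare_ties_merge(base, tuned, densities=[1.0, 0.6, 1.0], weights=[0.6, 0.5, 0.5])
print(merged.round(3))
```

Because the config below sets `normalize: 0.0`, the weighted contributions are summed as-is rather than divided by the total weight, which the sketch above mirrors.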

### Models Merged

The following models were included in the merge:
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [Weyaxi/Einstein-v6.1-Llama3-8B](https://huggingface.co/Weyaxi/Einstein-v6.1-Llama3-8B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: Locutusque/llama-3-neural-chat-v1-8b
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 0.0
slices:
- sources:
  - layer_range: [0, 4]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.6
  - layer_range: [0, 4]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.6
      weight: 0.5
  - layer_range: [0, 4]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.5
- sources:
  - layer_range: [4, 8]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.8
      weight: 0.1
  - layer_range: [4, 8]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 1.0
      weight: 0.2
  - layer_range: [4, 8]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.7
- sources:
  - layer_range: [8, 12]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.7
      weight: 0.1
  - layer_range: [8, 12]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.7
      weight: 0.2
  - layer_range: [8, 12]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.7
      weight: 0.6
- sources:
  - layer_range: [12, 16]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.9
      weight: 0.2
  - layer_range: [12, 16]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.6
      weight: 0.6
  - layer_range: [12, 16]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.7
      weight: 0.3
- sources:
  - layer_range: [16, 20]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.2
  - layer_range: [16, 20]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 1.0
      weight: 0.2
  - layer_range: [16, 20]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.9
      weight: 0.4
- sources:
  - layer_range: [20, 24]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.7
      weight: 0.2
  - layer_range: [20, 24]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.9
      weight: 0.3
  - layer_range: [20, 24]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.4
- sources:
  - layer_range: [24, 28]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.4
  - layer_range: [24, 28]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.8
      weight: 0.2
  - layer_range: [24, 28]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.9
      weight: 0.4
- sources:
  - layer_range: [28, 32]
    model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 1.0
      weight: 0.3
  - layer_range: [28, 32]
    model: Weyaxi/Einstein-v6.1-Llama3-8B
    parameters:
      density: 0.9
      weight: 0.2
  - layer_range: [28, 32]
    model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 1.0
      weight: 0.3
```
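
To reproduce a merge from a config like this, it is handed to mergekit. A sketch along the lines of mergekit's example notebook (the Python API may change between versions; the documented CLI equivalent is `mergekit-yaml config.yml ./output-dir`):

```python
# Sketch of driving mergekit from Python, following its example notebook.
# File and output names are hypothetical; verify these API names against
# the current mergekit documentation before relying on them.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("cosmicbun.yml") as f:       # the YAML shown above, saved to disk
    config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    config,
    out_path="./CosmicBun-8B-merged",  # hypothetical output directory
    options=MergeOptions(cuda=False, copy_tokenizer=True),
)
```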

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_aloobun__CosmicBun-8B).

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |68.81|
|AI2 Reasoning Challenge (25-Shot)|61.86|
|HellaSwag (10-Shot)              |84.29|
|MMLU (5-Shot)                    |65.53|
|TruthfulQA (0-shot)              |54.08|
|Winogrande (5-shot)              |78.85|
|GSM8k (5-shot)                   |68.23|