Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


SOLAR-tail-10.7B-Merge-v1.0 - GGUF
- Model creator: https://huggingface.co/PracticeLLM/
- Original model: https://huggingface.co/PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0/


| Name | Quant method | Size |
| ---- | ---- | ---- |
| [SOLAR-tail-10.7B-Merge-v1.0.Q2_K.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q2_K.gguf) | Q2_K | 3.73GB |
| [SOLAR-tail-10.7B-Merge-v1.0.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.IQ3_XS.gguf) | IQ3_XS | 4.14GB |
| [SOLAR-tail-10.7B-Merge-v1.0.IQ3_S.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.IQ3_S.gguf) | IQ3_S | 4.37GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q3_K_S.gguf) | Q3_K_S | 4.34GB |
| [SOLAR-tail-10.7B-Merge-v1.0.IQ3_M.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.IQ3_M.gguf) | IQ3_M | 4.51GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q3_K.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q3_K.gguf) | Q3_K | 4.84GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q3_K_M.gguf) | Q3_K_M | 4.84GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q3_K_L.gguf) | Q3_K_L | 5.26GB |
| [SOLAR-tail-10.7B-Merge-v1.0.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.IQ4_XS.gguf) | IQ4_XS | 5.43GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q4_0.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q4_0.gguf) | Q4_0 | 5.66GB |
| [SOLAR-tail-10.7B-Merge-v1.0.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.IQ4_NL.gguf) | IQ4_NL | 5.72GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q4_K_S.gguf) | Q4_K_S | 5.7GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q4_K.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q4_K.gguf) | Q4_K | 6.02GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q4_K_M.gguf) | Q4_K_M | 6.02GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q4_1.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q4_1.gguf) | Q4_1 | 6.27GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q5_0.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q5_0.gguf) | Q5_0 | 6.89GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q5_K_S.gguf) | Q5_K_S | 6.89GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q5_K.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q5_K.gguf) | Q5_K | 7.08GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q5_K_M.gguf) | Q5_K_M | 7.08GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q5_1.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q5_1.gguf) | Q5_1 | 7.51GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q6_K.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q6_K.gguf) | Q6_K | 8.2GB |
| [SOLAR-tail-10.7B-Merge-v1.0.Q8_0.gguf](https://huggingface.co/RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf/blob/main/SOLAR-tail-10.7B-Merge-v1.0.Q8_0.gguf) | Q8_0 | 10.62GB |
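
To try one of these quants locally, here is a minimal sketch using the `huggingface_hub` and `llama-cpp-python` packages (both must be installed first; the Q4_K_M file and the prompt are illustrative choices, not recommendations from this repo):

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download a single quant from this repo (Q4_K_M is a common size/quality trade-off)
gguf_path = hf_hub_download(
    repo_id="RichardErkhov/PracticeLLM_-_SOLAR-tail-10.7B-Merge-v1.0-gguf",
    filename="SOLAR-tail-10.7B-Merge-v1.0.Q4_K_M.gguf",
)

# Load the GGUF file and run a short completion
llm = Llama(model_path=gguf_path, n_ctx=4096)
out = llm("Tell me about the SOLAR model family.", max_tokens=128)
print(out["choices"][0]["text"])
```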


Original model description:
---
language:
- en
- ko
license: cc-by-nc-sa-4.0
pipeline_tag: text-generation
model-index:
- name: SOLAR-tail-10.7B-Merge-v1.0
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 66.13
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 86.54
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.52
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 60.57
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 84.77
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 65.58
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0
      name: Open LLM Leaderboard
---

# **SOLAR-tail-10.7B-Merge-v1.0**

## Model Details

**Model Developers** Kyujin Han (kyujinpy)

**Method**
Merged using [Mergekit](https://github.com/cg123/mergekit) from the following models:
- [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0)
- [Yhyu13/LMCocktail-10.7B-v1](https://huggingface.co/Yhyu13/LMCocktail-10.7B-v1)

**Merge config**
```yaml
slices:
  - sources:
      - model: upstage/SOLAR-10.7B-v1.0
        layer_range: [0, 48]
      - model: Yhyu13/LMCocktail-10.7B-v1
        layer_range: [0, 48]

merge_method: slerp
base_model: upstage/SOLAR-10.7B-v1.0

parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: union

dtype: float16
```
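
To reproduce the merge from this config, the sketch below uses mergekit's Python entry points. This is an assumption-laden illustration: the config filename and output path are placeholders, and `MergeConfiguration`/`run_merge`/`MergeOptions` follow mergekit's documented example, so verify them against the current mergekit README.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the YAML config shown above (assumed saved as solar-tail-merge.yml)
with open("solar-tail-merge.yml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the slerp merge and write the merged weights plus tokenizer to disk
run_merge(
    merge_config,
    "./SOLAR-tail-10.7B-Merge-v1.0",  # hypothetical output directory
    options=MergeOptions(copy_tokenizer=True),
)
```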

# **Model Benchmark**

## Open Ko leaderboard
- Results from the [Open Ko-LLM Leaderboard](https://huggingface.co/spaces/upstage/open-ko-llm-leaderboard):

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Ko-CommonGenV2 |
| --- | --- | --- | --- | --- | --- | --- |
| PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0 | 48.32 | 45.73 | 56.97 | 38.77 | 38.75 | 61.16 |
| jjourney1125/M-SOLAR-10.7B-v1.0 | 55.15 | 49.57 | 60.12 | 54.60 | 49.23 | 62.22 |

- Results from the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard):

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0 | 71.68 | 66.13 | 86.54 | **66.52** | 60.57 | **84.77** | **65.58** |
| kyujinpy/Sakura-SOLAR-Instruct | **74.40** | **70.99** | **88.42** | 66.33 | **71.79** | 83.66 | 65.20 |


## lm-evaluation-harness
```
gpt2 (pretrained=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0), limit: None, provide_description: False, num_fewshot: 0, batch_size: None
| Task |Version| Metric |Value | |Stderr|
|----------------|------:|--------|-----:|---|-----:|
|kobest_boolq | 0|acc |0.5021|± |0.0133|
| | |macro_f1|0.3343|± |0.0059|
|kobest_copa | 0|acc |0.6220|± |0.0153|
| | |macro_f1|0.6217|± |0.0154|
|kobest_hellaswag| 0|acc |0.4380|± |0.0222|
| | |acc_norm|0.5380|± |0.0223|
| | |macro_f1|0.4366|± |0.0222|
|kobest_sentineg | 0|acc |0.4962|± |0.0251|
| | |macro_f1|0.3316|± |0.0113|
```
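
The `gpt2 (pretrained=...)` header in the log suggests these KoBEST scores came from an older lm-evaluation-harness CLI. The following is a hedged sketch of an equivalent zero-shot invocation; the flags and KoBEST task names should be checked against the harness version you actually have installed:

```python
import subprocess

# Zero-shot KoBEST evaluation via the older lm-evaluation-harness CLI
# (run from a checkout of the harness; newer releases use the `lm_eval` command).
subprocess.run(
    [
        "python", "main.py",
        "--model", "gpt2",
        "--model_args", "pretrained=PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0",
        "--tasks", "kobest_boolq,kobest_copa,kobest_hellaswag,kobest_sentineg",
        "--num_fewshot", "0",
    ],
    check=True,
)
```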


# Implementation Code
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "PracticeLLM/SOLAR-tail-10.7B-Merge-v1.0"

# Load the merged model in fp16, sharding layers across available devices
model = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map='auto'
)
tokenizer = AutoTokenizer.from_pretrained(repo)
```
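
A short usage sketch continuing from the snippet above; the prompt and sampling settings are illustrative and not part of the original card:

```python
# Generate a short completion with the loaded model and tokenizer
prompt = "Tell me about the SOLAR 10.7B architecture."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```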

---
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_PracticeLLM__SOLAR-tail-10.7B-Merge-v1.0).

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 71.68 |
| AI2 Reasoning Challenge (25-Shot) | 66.13 |
| HellaSwag (10-Shot)               | 86.54 |
| MMLU (5-Shot)                     | 66.52 |
| TruthfulQA (0-shot)               | 60.57 |
| Winogrande (5-shot)               | 84.77 |
| GSM8k (5-shot)                    | 65.58 |