Suparious committed 9625d5b (1 parent: 3e1e03c)

Add model card

Files changed: README.md (+235 -0)
---
tags:
- finetuned
- quantized
- 4-bit
- AWQ
- transformers
- pytorch
- mistral
- instruct
- text-generation
- conversational
- license:apache-2.0
- autotrain_compatible
- endpoints_compatible
- text-generation-inference
- finetune
- chatml
model-index:
- name: OpenHercules-2.5-Mistral-7B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 64.25
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/OpenHercules-2.5-Mistral-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.84
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/OpenHercules-2.5-Mistral-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 64.21
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/OpenHercules-2.5-Mistral-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 47.84
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/OpenHercules-2.5-Mistral-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 78.93
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/OpenHercules-2.5-Mistral-7B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 59.21
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/OpenHercules-2.5-Mistral-7B
      name: Open LLM Leaderboard
base_model:
- Locutusque/Hercules-2.5-Mistral-7B
- teknium/OpenHermes-2.5-Mistral-7B
license: apache-2.0
language:
- en
library_name: transformers
model_creator: hydra-project
model_name: OpenHercules-2.5-Mistral-7B
model_type: mistral
pipeline_tag: text-generation
inference: false
prompt_template: '<|im_start|>system

  {system_message}<|im_end|>

  <|im_start|>user

  {prompt}<|im_end|>

  <|im_start|>assistant

  '
quantized_by: Suparious
---
# hydra-project/OpenHercules-2.5-Mistral-7B AWQ

- Model creator: [hydra-project](https://huggingface.co/hydra-project)
- Original model: [OpenHercules-2.5-Mistral-7B](https://huggingface.co/hydra-project/OpenHercules-2.5-Mistral-7B)

## Model Summary

OpenHercules-2.5-Mistral-7B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):

* [Locutusque/Hercules-2.5-Mistral-7B](https://huggingface.co/Locutusque/Hercules-2.5-Mistral-7B)
* [teknium/OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B)

## How to use

### Install the necessary packages

```bash
pip install --upgrade autoawq autoawq-kernels
```

### Example Python code

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

model_path = "solidrust/OpenHercules-2.5-Mistral-7B-AWQ"
system_message = "You are Senzu, incarnated as a powerful AI."

# Load the quantized model and its tokenizer
model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# ChatML prompt template used by this model
prompt_template = """\
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant"""

prompt = "You're standing on the surface of the Earth. " \
    "You walk one mile south, one mile west and one mile north. " \
    "You end up exactly where you started. Where are you?"

# Convert the prompt to tokens and move them to the GPU
tokens = tokenizer(prompt_template.format(system_message=system_message, prompt=prompt),
                   return_tensors='pt').input_ids.cuda()

# Generate output, streamed to stdout as it is produced
generation_output = model.generate(tokens, streamer=streamer, max_new_tokens=512)
```

### About AWQ

AWQ is an efficient, accurate, and fast low-bit weight quantization method, currently supporting 4-bit quantization. It offers faster Transformers-based inference than GPTQ, with quality equivalent to or better than the most commonly used GPTQ settings.

AWQ models are currently supported on Linux and Windows with NVIDIA GPUs only. macOS users should use GGUF models instead.

It is supported by:

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
- [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later, with support for all model types
- [Hugging Face Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference)
- [Transformers](https://huggingface.co/docs/transformers) version 4.35.0 and later, from any code or client that supports Transformers
- [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) - for use from Python code
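
As one concrete example, serving this AWQ repo with TGI could look roughly like the following. This is a sketch, not an official recipe: the image tag, port, and volume path are illustrative assumptions, so check the TGI documentation for current flags.

```shell
# Illustrative TGI launch; image tag, port and volume path are assumptions
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$PWD/data:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id solidrust/OpenHercules-2.5-Mistral-7B-AWQ \
    --quantize awq
```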

## Prompt template: ChatML

```plaintext
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
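
Because ChatML is plain text, the template can be filled in and inspected without downloading the model. A minimal standalone sketch (the helper name is illustrative, not part of the model's or any library's API):

```python
# Minimal sketch: assemble a ChatML prompt by hand.
# build_chatml_prompt is an illustrative helper, not a library function.
def build_chatml_prompt(system_message: str, prompt: str) -> str:
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

text = build_chatml_prompt("You are a helpful assistant.", "What is AWQ?")
print(text)
```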

## Other Quant formats

exl2 and GGUF quants by Bartowski:

- [OpenHercules-2.5-Mistral-7B-exl2](https://huggingface.co/bartowski/OpenHercules-2.5-Mistral-7B-exl2)
- [OpenHercules-2.5-Mistral-7B-GGUF](https://huggingface.co/bartowski/OpenHercules-2.5-Mistral-7B-GGUF)