aashish1904 committed on
Commit c38554a • 1 Parent(s): 688bfd6

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +329 -0

README.md ADDED
@@ -0,0 +1,329 @@
---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- merge
- mergekit
- lazymergekit
- bfloat16
- roleplay
- creative
- instruct
- anvita
- qwen
- nerd
- homer
- Qandora
base_model:
- bunnycore/Qandora-2.5-7B-Creative
- allknowingroger/HomerSlerp1-7B
- sethuiyer/Qwen2.5-7B-Anvita
- fblgit/cybertron-v4-qw7B-MGS
- jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
- newsbang/Homer-v0.5-Qwen2.5-7B
pipeline_tag: text-generation
model-index:
- name: Qwen2.5-7B-HomerAnvita-NerdMix
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 77.08
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 36.58
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 29.53
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 9.28
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 14.41
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 38.13
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix
      name: Open LLM Leaderboard
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Qwen2.5-7B-HomerAnvita-NerdMix-GGUF
This is a quantized version of [ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix](https://huggingface.co/ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix) created using llama.cpp.

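Since this repo ships GGUF files, the most direct way to run them is llama.cpp or one of its bindings. Below is a minimal sketch using the `llama-cpp-python` binding; the quant filename pattern is an assumption, so check the repository's file list for the exact quantization level you want.

```python
# Minimal sketch: running a GGUF quant from this repo with llama-cpp-python.
# pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

# NOTE: the filename glob below is an assumption -- match it to an actual
# file in the repo (e.g. Q4_K_M, Q5_K_M, Q8_0).
llm = Llama.from_pretrained(
    repo_id="QuantFactory/Qwen2.5-7B-HomerAnvita-NerdMix-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=4096,  # context window; raise it if you have the memory
)

# Qwen2.5-based GGUFs typically embed a chat template, so chat completion
# should work out of the box.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what a model merge is."}],
    max_tokens=200,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```
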
# Original Model Card

# ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix

**ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix** is an advanced language model crafted by merging five pre-trained models onto a shared base using the [mergekit](https://github.com/cg123/mergekit) framework. The fusion uses the **Model Stock** merge method to combine the creative prowess of **Qandora**, the instruction-following strength of **Anvita**, the smooth weight blending of **HomerSlerp1**, the mathematical precision of **Cybertron-MGS**, and the uncensored technical depth of **Qwen-Nerd**, all anchored on the **Homer** base model. The resulting model excels at creative text generation, contextual understanding, technical reasoning, and dynamic conversational interaction.

## 🚀 Merged Models

This model merge incorporates the following:

- [**bunnycore/Qandora-2.5-7B-Creative**](https://huggingface.co/bunnycore/Qandora-2.5-7B-Creative): Specializes in creative text generation, enhancing the model's ability to produce imaginative and diverse content.

- [**allknowingroger/HomerSlerp1-7B**](https://huggingface.co/allknowingroger/HomerSlerp1-7B): Utilizes spherical linear interpolation (SLERP) to blend model weights smoothly, ensuring a harmonious integration of different model attributes.

- [**sethuiyer/Qwen2.5-7B-Anvita**](https://huggingface.co/sethuiyer/Qwen2.5-7B-Anvita): Focuses on instruction-following capabilities, improving the model's performance in understanding and executing user commands.

- [**fblgit/cybertron-v4-qw7B-MGS**](https://huggingface.co/fblgit/cybertron-v4-qw7B-MGS): Enhances mathematical reasoning and precision, enabling the model to handle complex computational tasks effectively.

- [**jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0**](https://huggingface.co/jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0): Provides uncensored expertise and robust technical knowledge, making the model suitable for specialized technical support and information retrieval.

- [**newsbang/Homer-v0.5-Qwen2.5-7B**](https://huggingface.co/newsbang/Homer-v0.5-Qwen2.5-7B): Acts as the foundational conversational model, providing robust language comprehension and generation capabilities.

## 🧩 Merge Configuration

The configuration below outlines how the models are merged using the **Model Stock** method. This approach ensures a balanced and effective integration of the unique strengths from each source model.

```yaml
# Merge configuration for ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix using Model Stock

models:
  - model: bunnycore/Qandora-2.5-7B-Creative
  - model: allknowingroger/HomerSlerp1-7B
  - model: sethuiyer/Qwen2.5-7B-Anvita
  - model: fblgit/cybertron-v4-qw7B-MGS
  - model: jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0
merge_method: model_stock
base_model: newsbang/Homer-v0.5-Qwen2.5-7B
normalize: false
int8_mask: true
dtype: bfloat16
```

### Key Parameters

- **Merge Method (`merge_method`):** Uses the **Model Stock** method, as described in [Model Stock](https://arxiv.org/abs/2403.19522), to combine multiple models effectively by leveraging their strengths.

- **Models (`models`):** Specifies the list of models to be merged:
  - **bunnycore/Qandora-2.5-7B-Creative:** Enhances creative text generation.
  - **allknowingroger/HomerSlerp1-7B:** Facilitates smooth blending of model weights using SLERP.
  - **sethuiyer/Qwen2.5-7B-Anvita:** Improves instruction-following capabilities.
  - **fblgit/cybertron-v4-qw7B-MGS:** Enhances mathematical reasoning and precision.
  - **jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0:** Provides uncensored technical expertise.

- **Base Model (`base_model`):** Defines the foundational model for the merge: **newsbang/Homer-v0.5-Qwen2.5-7B**.

- **Normalization (`normalize`):** Set to `false` to retain the original scaling of the model weights during the merge.

- **INT8 Mask (`int8_mask`):** Enabled (`true`) to store mergekit's internal masks in INT8, reducing memory usage during the merge itself; it does not quantize the merged weights.

- **Data Type (`dtype`):** Uses `bfloat16` to maintain computational efficiency while preserving high precision.

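To reproduce the merge locally, the YAML above can be fed to mergekit's `mergekit-yaml` command. The snippet below is a small sketch that drives that CLI from Python; the config and output paths are illustrative assumptions.

```python
# Minimal sketch: reproducing the merge via mergekit's `mergekit-yaml` CLI.
# pip install mergekit
# Paths are illustrative assumptions -- adjust to your environment.
import subprocess
from pathlib import Path

CONFIG_PATH = Path("config.yaml")  # the Model Stock YAML shown above
OUTPUT_DIR = Path("./Qwen2.5-7B-HomerAnvita-NerdMix")

subprocess.run(
    [
        "mergekit-yaml",
        str(CONFIG_PATH),
        str(OUTPUT_DIR),
        "--lazy-unpickle",  # stream weights to keep peak RAM low
        # add "--cuda" to run the merge arithmetic on a GPU
    ],
    check=True,
)
print(f"Merged model written to {OUTPUT_DIR.resolve()}")
```

Note that merging six 7B checkpoints requires downloading all of them, so budget roughly tens of gigabytes of disk for the sources plus the output.
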
## 🏆 Performance Highlights

- **Creative Text Generation:** Enhanced ability to produce imaginative and diverse content suitable for creative writing, storytelling, and content creation.

- **Instruction Following:** Improved performance in understanding and executing user instructions, making the model more responsive and accurate in task execution.

- **Mathematical Reasoning:** Enhanced capability to handle complex computational tasks with high precision, suitable for technical and analytical applications.

- **Uncensored Technical Expertise:** Provides robust technical knowledge without content restrictions, making it ideal for specialized technical support and information retrieval.

- **Efficient Inference:** `bfloat16` weights keep memory use and latency moderate for a 7B model without compromising output quality.

## 🎯 Use Cases & Applications

**ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix** is designed to excel in environments that demand a combination of creative generation, precise instruction following, mathematical reasoning, and technical expertise. Ideal applications include:

- **Creative Writing Assistance:** Aiding authors and content creators in generating imaginative narratives, dialogues, and descriptive text.

- **Interactive Storytelling and Role-Playing:** Enhancing dynamic and engaging interactions in role-playing games and interactive storytelling platforms.

- **Educational Tools and Tutoring Systems:** Providing detailed explanations, answering questions, and assisting in educational content creation with contextual understanding.

- **Technical Support and Customer Service:** Offering accurate and contextually relevant responses in technical support scenarios, improving user satisfaction.

- **Content Generation for Marketing:** Creating compelling and diverse marketing copy, social media posts, and promotional material with creative flair.

- **Mathematical Problem Solving:** Assisting in solving complex mathematical problems and providing step-by-step explanations for educational purposes.

- **Technical Documentation and Analysis:** Generating detailed technical documents, reports, and analyses with high precision and clarity.

## 📝 Usage

To use **ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix**, follow the steps below.

### Installation

First, install the necessary libraries:

```bash
pip install -qU transformers accelerate
```

### Example Code

Below is an example of how to load and use the model for text generation:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model in bfloat16, spread across available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Initialize the pipeline; dtype and device placement are inherited from
# the already-loaded model, so they are not passed again here
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)

# Print the generated text
print(outputs[0]["generated_text"])
```

### Notes

- **Fine-Tuning:** This merged model may require fine-tuning to optimize performance for specific applications or domains.

- **Resource Requirements:** Ensure that your environment has sufficient computational resources, especially GPU-enabled hardware, to handle the model efficiently during inference.

- **Customization:** Users can adjust parameters such as `temperature`, `top_k`, and `top_p` to control the creativity and diversity of the generated text; the chat-style sketch below shows these knobs in context.

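For chat-style prompting, the tokenizer should inherit the Qwen2.5 chat template from the base models; that is an assumption worth verifying against the repo's `tokenizer_config.json`. A minimal sketch:

```python
# Minimal chat-style sketch. Assumes the merged tokenizer inherits the
# Qwen2.5 chat template -- verify against the repo's tokenizer_config.json.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful, creative assistant."},
    {"role": "user", "content": "Outline a short mystery story set on a space station."},
]

# Render the conversation with the model's chat template
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The sampling knobs from the Notes above: temperature, top_k, top_p
output_ids = model.generate(
    inputs,
    max_new_tokens=300,
    do_sample=True,
    temperature=0.8,  # higher = more creative
    top_k=50,
    top_p=0.95,
)

# Decode only the newly generated tokens
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```
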
## 📜 License

This model is open-sourced under the **Apache-2.0 License**.

## 💡 Tags

- `merge`
- `mergekit`
- `model_stock`
- `Qwen`
- `Homer`
- `Anvita`
- `Nerd`
- `ZeroXClem/Qwen2.5-7B-HomerAnvita-NerdMix`
- `bunnycore/Qandora-2.5-7B-Creative`
- `allknowingroger/HomerSlerp1-7B`
- `sethuiyer/Qwen2.5-7B-Anvita`
- `fblgit/cybertron-v4-qw7B-MGS`
- `jeffmeloy/Qwen2.5-7B-nerd-uncensored-v1.0`
- `newsbang/Homer-v0.5-Qwen2.5-7B`

---
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ZeroXClem__Qwen2.5-7B-HomerAnvita-NerdMix).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 34.17 |
| IFEval (0-Shot)     | 77.08 |
| BBH (3-Shot)        | 36.58 |
| MATH Lvl 5 (4-Shot) | 29.53 |
| GPQA (0-shot)       |  9.28 |
| MuSR (0-shot)       | 14.41 |
| MMLU-PRO (5-shot)   | 38.13 |
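
The reported average is just the unweighted mean of the six benchmark scores, as a quick check confirms:

```python
# Sanity check: "Avg." is the unweighted mean of the six benchmark scores.
scores = [77.08, 36.58, 29.53, 9.28, 14.41, 38.13]
print(round(sum(scores) / len(scores), 2))  # -> 34.17
```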