munish0838 committed
Commit da41e63 (1 parent: dba71ed)

Upload README.md with huggingface_hub

Files changed (1): README.md (+206, -0)

README.md ADDED
---
license: other
tags:
- merge
- mergekit
- lazymergekit
base_model:
- NousResearch/Meta-Llama-3-8B-Instruct
- mlabonne/OrpoLlama-3-8B
- cognitivecomputations/dolphin-2.9-llama3-8b
- Locutusque/llama-3-neural-chat-v1-8b
- cloudyu/Meta-Llama-3-8B-Instruct-DPO
- vicgalle/Configurable-Llama-3-8B-v0.3
model-index:
- name: ChimeraLlama-3-8B-v2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 44.69
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 28.48
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 8.31
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 4.7
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.25
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 28.54
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v2
      name: Open LLM Leaderboard
---

![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)

# QuantFactory/ChimeraLlama-3-8B-v2-GGUF
This is a quantized version of [mlabonne/ChimeraLlama-3-8B-v2](https://huggingface.co/mlabonne/ChimeraLlama-3-8B-v2), created using llama.cpp.
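
To run one of these GGUF files locally, you can use the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings. The snippet below is a minimal sketch; the quant filename (`ChimeraLlama-3-8B-v2.Q4_K_M.gguf`) is an assumed placeholder, so substitute whichever quant file you download from this repo.

```python
# pip install llama-cpp-python

from llama_cpp import Llama

# NOTE: assumed example filename; replace it with the quant file you downloaded
llm = Llama(
    model_path="./ChimeraLlama-3-8B-v2.Q4_K_M.gguf",
    n_ctx=2048,  # context window size
)

# Chat-style completion; the chat template stored in the GGUF is applied automatically
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a large language model?"}],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```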

# Original Model Card

# ChimeraLlama-3-8B-v2

ChimeraLlama-3-8B-v2 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [Locutusque/llama-3-neural-chat-v1-8b](https://huggingface.co/Locutusque/llama-3-neural-chat-v1-8b)
* [cloudyu/Meta-Llama-3-8B-Instruct-DPO](https://huggingface.co/cloudyu/Meta-Llama-3-8B-Instruct-DPO)
* [vicgalle/Configurable-Llama-3-8B-v0.3](https://huggingface.co/vicgalle/Configurable-Llama-3-8B-v0.3)

## 🧩 Configuration

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.6
      weight: 0.55
  - model: mlabonne/OrpoLlama-3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.1
  - model: Locutusque/llama-3-neural-chat-v1-8b
    parameters:
      density: 0.55
      weight: 0.05
  - model: cloudyu/Meta-Llama-3-8B-Instruct-DPO
    parameters:
      density: 0.55
      weight: 0.15
  - model: vicgalle/Configurable-Llama-3-8B-v0.3
    parameters:
      density: 0.55
      weight: 0.1
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: float16
```
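
To reproduce the merge yourself, the config above can be passed to mergekit's CLI. A minimal sketch, assuming the YAML is saved as `config.yaml` (notebook-style `!` commands, as in the usage snippet below; exact flags may vary between mergekit versions):

```python
!pip install -qU mergekit

# Run the DARE-TIES merge defined in config.yaml; the merged model is written to ./merge
!mergekit-yaml config.yaml merge --copy-tokenizer --lazy-unpickle
```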

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/ChimeraLlama-3-8B-v2"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline in fp16, spread across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a response
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__ChimeraLlama-3-8B-v2)

| Metric              | Value |
|---------------------|------:|
| Avg.                | 19.99 |
| IFEval (0-Shot)     | 44.69 |
| BBH (3-Shot)        | 28.48 |
| MATH Lvl 5 (4-Shot) |  8.31 |
| GPQA (0-shot)       |  4.70 |
| MuSR (0-shot)       |  5.25 |
| MMLU-PRO (5-shot)   | 28.54 |