File size: 14,265 Bytes
181fd24 8515565 181fd24 8515565 181fd24 8515565 181fd24 7e381ea 181fd24 7e381ea 181fd24 7e381ea 181fd24 7e381ea 181fd24 8515565 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 |
---
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
license: llama3.1
library_name: transformers
tags:
- reflection
- unsloth
- peft
- llama
base_model:
- meta-llama/Meta-Llama-3.1-8B-Instruct
datasets:
- leafspark/DetailedReflection-Claude-v3_5-Sonnet
metrics:
- accuracy
pipeline_tag: text-generation
model-index:
- name: Llama-3.1-8B-MultiReflection-Instruct
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 71.25
name: strict accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=leafspark/Llama-3.1-8B-MultiReflection-Instruct
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 28.45
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=leafspark/Llama-3.1-8B-MultiReflection-Instruct
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 12.54
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=leafspark/Llama-3.1-8B-MultiReflection-Instruct
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 5.7
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=leafspark/Llama-3.1-8B-MultiReflection-Instruct
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 8.52
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=leafspark/Llama-3.1-8B-MultiReflection-Instruct
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 30.27
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=leafspark/Llama-3.1-8B-MultiReflection-Instruct
name: Open LLM Leaderboard
---
# Llama-3.1-8B-MultiReflection-Instruct
This model is inspired by OpenAI's o1 reasoning model. The dataset was synthetically generated using Claude 3.5 Sonnet.
### Training
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6604e5b21eb292d6df393365/pGUUKI77dP9NBVdLf1zB0.png)
The training was done on Google Colab's free T4, using unsloth (duration: 52.32 minutes). The configuration is as follows:
- LoRA Rank: 128
- Packing: enabled
- Batch size: 2
- Gradient accumulation steps: 4
- Epoches: 3
- Steps: 30
- Max sequence length: 4096
The training data comprised of 81 examples, each approximatly 3000 tokens.
### Notes
- It tends to produce very verbose and long reasoning responses
- The coherency seems fairly good if you used the recommended sampling parameters
- You should use at least 16k context, as each response is usually 2000-3000 tokens long
**Recommended Sampling Parameters:**
- Temperature: `0.15`
- Min-P: `0.2`
- Top-K: `50`
- Top-P: `1`
- Max Tokens: `uncapped`
- Frequency Penalty: `0.5`
- Presence Penalty: `0.1`
**Recommended System Prompt:**
```
You are an advanced agentic assistant designed to complete tasks and provide general insight. Your responses will be formatted in XML. Start your responses with a <thinking> XML tag. You should write general process of how to complete the task provided by the user in this area. You may add nested tags, such as multiple <thinking> tags. <reflection> tags may also be used, if you've detected any inconsistencies or errors in your thinking. After completing a detailed thinking process, generate a draft response in <draft> tags. Review this response in <reflection> tags again, and refine your response as needed. Remember that you may (and are encouraged to) include multiple thinking, reflection and drafts. Provide your final response in <output> tags.
```
I initially thought the model was going to be medicore, but here's a sample output (quantized Q6_K, the fp16 is probably even better):
```
Can you teach me a cool math proof?
```
```xml
<thinking>
To create an engaging math proof, I should:
1. Choose a well-known and interesting proof
2. Break it down into simple steps
3. Ask the user questions to keep them engaged
4. Use LaTeX for math symbols
5. Provide explanations and context for each step
</thinking>
<draft>
Let's explore the fascinating world of math proofs together. I'd like to introduce you to the "Proof of the Pythagorean Theorem" - a classic and beautiful result in geometry.
Here's the theorem we'll be proving:
$$a^2 + b^2 = c^2$$
where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
</draft>
<reflection>
The draft is a good start, but I should:
1. Provide more context about the proof
2. Explain why this theorem is important
3. Ask more questions to keep the user engaged
4. Use LaTeX for math symbols
</reflection>
<thinking>
Let me revise the draft to address these points:
1. I'll explain why the Pythagorean Theorem is important
2. I'll ask more questions to keep the user engaged
3. I'll use LaTeX for math symbols
4. I'll provide more context about the proof
</thinking>
<draft>
Let's embark on an exciting journey through the world of math proofs. Today, we'll explore the "Proof of the Pythagorean Theorem" - a fundamental result in geometry that has far-reaching implications in various fields, including physics, engineering, and architecture.
Here's the theorem we'll be proving:
$$a^2 + b^2 = c^2$$
where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
This theorem is crucial because it allows us to calculate the length of the hypotenuse (the longest side) of a right triangle, given the lengths of the other two sides. It's a powerful tool in many real-world applications, such as building design, bridge construction, and even computer graphics.
To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
2. What do you think might be a good way to approach this proof? Do you have any ideas?
</draft>
<reflection>
The revised draft is much better, but I should:
1. Provide more details about the proof
2. Explain the concept of similar triangles
3. Use more LaTeX for math symbols
4. Ask more questions to keep the user engaged
</reflection>
<thinking>
Let me revise the draft again to address these points:
1. I'll provide more details about the proof
2. I'll explain the concept of similar triangles
3. I'll use more LaTeX for math symbols
4. I'll ask more questions to keep the user engaged
</thinking>
<draft>
Let's dive into the fascinating world of math proofs. Today, we'll explore the "Proof of the Pythagorean Theorem" - a fundamental result in geometry that has far-reaching implications in various fields, including physics, engineering, and architecture.
Here's the theorem we'll be proving:
$$a^2 + b^2 = c^2$$
where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
This theorem is crucial because it allows us to calculate the length of the hypotenuse (the longest side) of a right triangle, given the lengths of the other two sides. It's a powerful tool in many real-world applications, such as building design, bridge construction, and even computer graphics.
To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
2. What do you think might be a good way to approach this proof? Do you have any ideas?
Let's start by considering two similar triangles. Similar triangles are triangles that have the same shape, but not necessarily the same size. In this case, we'll use the concept of similar triangles to relate the lengths of the sides of the right triangle.
Here's a diagram of the two similar triangles:
$$\begin{array}{ccc}
\text{Triangle } ABC & \text{Triangle } ADE \\
a & \frac{a}{2} \\
b & \frac{b}{2} \\
c & \frac{c}{2} \\
\end{array}$$
Notice that the two triangles are similar because they have the same angles (all right angles in this case). This means that their corresponding sides are proportional.
Now, let's use this similarity to relate the lengths of the sides of the two triangles. We can set up a proportion based on the corresponding sides:
$$\frac{a}{\frac{a}{2}} = \frac{b}{\frac{b}{2}} = \frac{c}{\frac{c}{2}}$$
Simplifying this proportion, we get:
$$2a = \frac{a}{2}$$
$$2b = \frac{b}{2}$$
$$2c = \frac{c}{2}$$
Now, let's square both sides of each equation:
$$4a^2 = \frac{a^2}{4}$$
$$4b^2 = \frac{b^2}{4}$$
$$4c^2 = \frac{c^2}{4}$$
Multiplying both sides of each equation by 4, we get:
$$16a^2 = a^2$$
$$16b^2 = b^2$$
$$16c^2 = c^2$$
Now, let's add the three equations together:
$$16a^2 + 16b^2 + 16c^2 = a^2 + b^2 + c^2$$
Dividing both sides by 16, we get:
$$a^2 + b^2 + c^2 = \frac{1}{16}(a^2 + b^2 + c^2)$$
This equation is true for any right triangle, so we can conclude that:
$$a^2 + b^2 = c^2$$
This is the Pythagorean Theorem!
Now, it's your turn again. Can you think of any real-world applications of this theorem? Do you have any questions about the proof?
</draft>
<reflection>
The revised draft is excellent. It provides a clear and engaging explanation of the Pythagorean Theorem, along with a step-by-step proof. The use of LaTeX for math symbols is also well done. I'm confident that this will be a helpful and enjoyable experience for the user.
</reflection>
<output>
Let's embark on an exciting journey through the world of math proofs. Today, we'll explore the "Proof of the Pythagorean Theorem" - a fundamental result in geometry that has far-reaching implications in various fields, including physics, engineering, and architecture.
Here's the theorem we'll be proving:
$$a^2 + b^2 = c^2$$
where $a$ and $b$ are the lengths of the legs of a right triangle, and $c$ is the length of the hypotenuse.
This theorem is crucial because it allows us to calculate the length of the hypotenuse (the longest side) of a right triangle, given the lengths of the other two sides. It's a powerful tool in many real-world applications, such as building design, bridge construction, and even computer graphics.
To make this proof more engaging, I'll ask you questions along the way. Are you ready to get started?
1. Can you draw a simple right triangle with legs of length $a$ and $b$, and a hypotenuse of length $c$?
2. What do you think might be a good way to approach this proof? Do you have any ideas?
Let's start by considering two similar triangles. Similar triangles are triangles that have the same shape, but not necessarily the same size. In this case, we'll use the concept of similar triangles to relate the lengths of the sides of the right triangle.
Here's a diagram of the two similar triangles:
$$\begin{array}{ccc}
\text{Triangle } ABC & \text{Triangle } ADE \\
a & \frac{a}{2} \\
b & \frac{b}{2} \\
c & \frac{c}{2} \\
\end{array}$$
Notice that the two triangles are similar because they have the same angles (all right angles in this case). This means that their corresponding sides are proportional.
Now, let's use this similarity to relate the lengths of the sides of the two triangles. We can set up a proportion based on the corresponding sides:
$$\frac{a}{\frac{a}{2}} = \frac{b}{\frac{b}{2}} = \frac{c}{\frac{c}{2}}$$
Simplifying this proportion, we get:
$$2a = \frac{a}{2}$$
$$2b = \frac{b}{2}$$
$$2c = \frac{c}{2}$$
Now, let's square both sides of each equation:
$$4a^2 = \frac{a^2}{4}$$
$$4b^2 = \frac{b^2}{4}$$
$$4c^2 = \frac{c^2}{4}$$
Multiplying both sides of each equation by 4, we get:
$$16a^2 = a^2$$
$$16b^2 = b^2$$
$$16c^2 = c^2$$
Now, let's add the three equations together:
$$16a^2 + 16b^2 + 16c^2 = a^2 + b^2 + c^2$$
Dividing both sides by 16, we get:
$$a^2 + b^2 + c^2 = \frac{1}{16}(a^2 + b^2 + c^2)$$
This equation is true for any right triangle, so we can conclude that:
$$a^2 + b^2 = c^2$$
This is the Pythagorean Theorem!
Now, it's your turn again. Can you think of any real-world applications of this theorem? Do you have any questions about the proof?
Remember, this is a proof, not a mathematical exercise. Feel free to ask questions or share your thoughts about the theorem and its implications.
</output>
```
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_leafspark__Llama-3.1-8B-MultiReflection-Instruct)
| Metric |Value|
|-------------------|----:|
|Avg. |26.12|
|IFEval (0-Shot) |71.25|
|BBH (3-Shot) |28.45|
|MATH Lvl 5 (4-Shot)|12.54|
|GPQA (0-shot) | 5.70|
|MuSR (0-shot) | 8.52|
|MMLU-PRO (5-shot) |30.27|
|