Commit 8c7e334 (1 parent: b8bd85e), committed by leaderboard-pr-bot
Adding Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr
The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
README.md
CHANGED
@@ -1,5 +1,100 @@
 ---
 license: llama3
+model-index:
+- name: Llama-3-70Bx2-MOE
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 54.82
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 51.42
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 19.86
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 19.13
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 20.85
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 46.02
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=cloudyu/Llama-3-70Bx2-MOE
+      name: Open LLM Leaderboard
 ---
 This is my first MOE model based on Llama-3 70B.
 
@@ -9,4 +104,17 @@ example output
 
 ```
 <|begin_of_text|>write me a story about peppa pig. Peppa Pig was a very busy pig. She had a lot of things to do today. She needed to get ready for a very important event, the annual mud festival. This year, she was going to be the main judge of the muddy puddle contest. She was very excited and couldn't wait to see all the muddy puddles that everyone would bring. But first, she needed to get ready. She took a bath, put on her favorite dress, and did her hair. When she was all ready, she headed downstairs to the kitchen, where her mom was making breakfast. She had made Peppa's favorite, pancakes with strawberries and whipped cream. Peppa was so happy and sat down to eat her breakfast. After she finished eating, she headed out the door to go to the festival. When she arrived, she saw all of her friends waiting for her. There was Suzy Sheep, Danny Dog, and Emily Elephant. They all said hello and Peppa was so happy to see them. The festival was about to start and Peppa took her place at the judges table. The first contestant was Suzy Sheep and she had brought a puddle that was very muddy and squelchy. Peppa loved it and gave it a score of 9 out of 10. Next was Danny Dog and he had brought a puddle that was even bigger and muddier than Suzy's. Peppa was very impressed and gave it a score of 9.5 out of 10. Then it was Emily Elephant's turn and she had brought a puddle that was so muddy, it looked like a swamp. Peppa was amazed and gave it a score of 10 out of 10. After all the contestants had presented their puddles, Peppa had to make a very difficult decision. She had to choose the winner of the muddy puddle contest. She thought for a long time and finally, she made her decision. The winner was Emily Elephant. Everyone cheered and clapped for Emily and Peppa gave her a big hug and a trophy. The festival was a huge success and Peppa was so happy to have been a part of it. She couldn't wait to do it all again next year. The end.<|eot_id|> [end of text]
-```
+```
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_cloudyu__Llama-3-70Bx2-MOE)
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |35.35|
+|IFEval (0-Shot)    |54.82|
+|BBH (3-Shot)       |51.42|
+|MATH Lvl 5 (4-Shot)|19.86|
+|GPQA (0-shot)      |19.13|
+|MuSR (0-shot)      |20.85|
+|MMLU-PRO (5-shot)  |46.02|
+
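As a sanity check on the results table added by this PR, the `Avg.` row appears to be the unweighted mean of the six benchmark scores. The sketch below recomputes it and re-renders the table. The helper names `average_score` and `render_table` are hypothetical, and the unweighted-mean aggregation is an assumption inferred from the numbers, not something the PR states.

```python
# Scores copied from the results table in this PR (percent).
scores = {
    "IFEval (0-Shot)": 54.82,
    "BBH (3-Shot)": 51.42,
    "MATH Lvl 5 (4-Shot)": 19.86,
    "GPQA (0-shot)": 19.13,
    "MuSR (0-shot)": 20.85,
    "MMLU-PRO (5-shot)": 46.02,
}


def average_score(values):
    """Unweighted mean rounded to two decimals (assumed aggregation)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


def render_table(scores):
    """Render a Metric/Value markdown table with an 'Avg.' row first."""
    rows = [("Avg.", average_score(scores.values()))] + list(scores.items())
    # Pad the metric column to the longest metric name.
    width = max(len(name) for name, _ in rows)
    lines = [
        f"|{'Metric'.center(width)}|Value|",
        f"|{'-' * width}|----:|",
    ]
    lines += [f"|{name.ljust(width)}|{value:5.2f}|" for name, value in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_table(scores))
```

Recomputing the mean this way gives the same 35.35 shown in the `Avg.` row, which supports the reading that the six per-benchmark values in `model-index` and the table were generated from the same source.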