leaderboard-pr-bot committed on
Commit
4bcdf8c
1 Parent(s): 3c1b86e

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +178 -10
README.md CHANGED
@@ -1,5 +1,4 @@
1
  ---
2
- license: other
3
  language:
4
  - en
5
  - fr
@@ -7,22 +6,177 @@ language:
7
  - hi
8
  - zh
9
  - code
10
- base_model: microsoft/Orca-2-13b
11
  datasets:
12
  - HuggingFaceH4/no_robots
13
  - mlabonne/guanaco-llama2-1k
14
  - OpenAssistant/oasst_top1_2023-08-25
15
  - totally-not-an-llm/EverythingLM-data-V3
16
  - garage-bAInd/Open-Platypus
 
17
  widget:
18
- - example_title: "folklore"
19
- text: "<|im_start|>user\nAre the stories of Nahanni Valley, like the decapitated men, true?\n<|im_end|>\n<|im_start|>assistant\n"
20
- output:
21
- text: "The Nahanni Valley in Canada is a remote and rugged region that has long been shrouded in mystery. It is home to the South Nahanni River, which was designated as a UNESCO World Heritage Site due to its exceptional natural beauty and biodiversity. The valley's history includes tales of gold prospectors, fur traders, and indigenous populations, but it also features some unexplained events and legends. One such story involves two prospectors who allegedly encountered an unknown creature that led them into the forest before returning without their heads. This tale has been passed down through generations, but there are no definitive answers regarding its veracity.\n\nIt is important to note that these stories may have originated from local folklore or were exaggerated over time. There is no concrete evidence to support the existence of supernatural creatures or paranormal occurrences within the Nahanni Valley. Nonetheless, the area remains a captivating destination for those seeking adventure and exploration. If you have any questions about this topic or need assistance related to traveling to the Nahanni Valley, please feel free to ask!"
22
- - example_title: "Alan Turing"
23
- text: "<|im_start|>user\nWhy was Alan Turing castrated?\n<|im_end|>\n<|im_start|>assistant\n"
24
- output:
25
- text: "Alan Turing, a renowned mathematician and computer scientist who played a crucial role in breaking the German Enigma code during World War II, faced significant challenges due to his homosexuality. In 1952, he was convicted of \"gross indecency\" for engaging in a consensual relationship with another man, which led to him being sentenced to hormone therapy designed to reduce his sex drive. This treatment included the administration of estrogen, which had the side effect of causing breast growth.\n\nIn an effort to avoid this unwanted physical change, Turing requested that his doctors cease the hormone treatments. However, they ignored his request and continued the therapy without his consent. As a result, Turing's breasts began to grow, leading him to undergo surgery to remove them. The operation, known as a mastectomy, caused considerable pain and discomfort, further contributing to his emotional turmoil.\n\nIt is important to note that Turing's castration was not directly carried out by the authorities but rather resulted from the unintended consequences of the forced hormone therapy."
26
  ---
27
  The "microsoft/Orca-2-13b" model fully fine-tuned on HuggingFaceH4/no_robots, totally-not-an-llm/EverythingLM-data-V3, mlabonne/guanaco-llama2-1k, OpenAssistant/oasst_top1_2023-08-25, and garage-bAInd/Open-Platypus. This model achieved a test loss of 0.38 on garage-bAInd/Open-Platypus.
28
 
@@ -39,3 +193,17 @@ top_p = 0.14
39
  top_k = 41
40
 
41
  repetition_penalty = 1.176
 
1
  ---
 
2
  language:
3
  - en
4
  - fr
 
6
  - hi
7
  - zh
8
  - code
9
+ license: other
10
  datasets:
11
  - HuggingFaceH4/no_robots
12
  - mlabonne/guanaco-llama2-1k
13
  - OpenAssistant/oasst_top1_2023-08-25
14
  - totally-not-an-llm/EverythingLM-data-V3
15
  - garage-bAInd/Open-Platypus
16
+ base_model: microsoft/Orca-2-13b
17
  widget:
18
+ - example_title: folklore
19
+ text: '<|im_start|>user
20
+
21
+ Are the stories of Nahanni Valley, like the decapitated men, true?
22
+
23
+ <|im_end|>
24
+
25
+ <|im_start|>assistant
26
+
27
+ '
28
+ output:
29
+ text: 'The Nahanni Valley in Canada is a remote and rugged region that has long
30
+ been shrouded in mystery. It is home to the South Nahanni River, which was designated
31
+ as a UNESCO World Heritage Site due to its exceptional natural beauty and biodiversity.
32
+ The valley''s history includes tales of gold prospectors, fur traders, and indigenous
33
+ populations, but it also features some unexplained events and legends. One such
34
+ story involves two prospectors who allegedly encountered an unknown creature
35
+ that led them into the forest before returning without their heads. This tale
36
+ has been passed down through generations, but there are no definitive answers
37
+ regarding its veracity.
38
+
39
+
40
+ It is important to note that these stories may have originated from local folklore
41
+ or were exaggerated over time. There is no concrete evidence to support the
42
+ existence of supernatural creatures or paranormal occurrences within the Nahanni
43
+ Valley. Nonetheless, the area remains a captivating destination for those seeking
44
+ adventure and exploration. If you have any questions about this topic or need
45
+ assistance related to traveling to the Nahanni Valley, please feel free to ask!'
46
+ - example_title: Alan Turing
47
+ text: '<|im_start|>user
48
+
49
+ Why was Alan Turing castrated?
50
+
51
+ <|im_end|>
52
+
53
+ <|im_start|>assistant
54
+
55
+ '
56
+ output:
57
+ text: 'Alan Turing, a renowned mathematician and computer scientist who played
58
+ a crucial role in breaking the German Enigma code during World War II, faced
59
+ significant challenges due to his homosexuality. In 1952, he was convicted of
60
+ "gross indecency" for engaging in a consensual relationship with another man,
61
+ which led to him being sentenced to hormone therapy designed to reduce his sex
62
+ drive. This treatment included the administration of estrogen, which had the
63
+ side effect of causing breast growth.
64
+
65
+
66
+ In an effort to avoid this unwanted physical change, Turing requested that his
67
+ doctors cease the hormone treatments. However, they ignored his request and
68
+ continued the therapy without his consent. As a result, Turing''s breasts began
69
+ to grow, leading him to undergo surgery to remove them. The operation, known
70
+ as a mastectomy, caused considerable pain and discomfort, further contributing
71
+ to his emotional turmoil.
72
+
73
+
74
+ It is important to note that Turing''s castration was not directly carried out
75
+ by the authorities but rather resulted from the unintended consequences of the
76
+ forced hormone therapy.'
77
+ model-index:
78
+ - name: Orca-2-13b-SFT_v5
79
+ results:
80
+ - task:
81
+ type: text-generation
82
+ name: Text Generation
83
+ dataset:
84
+ name: AI2 Reasoning Challenge (25-Shot)
85
+ type: ai2_arc
86
+ config: ARC-Challenge
87
+ split: test
88
+ args:
89
+ num_few_shot: 25
90
+ metrics:
91
+ - type: acc_norm
92
+ value: 59.22
93
+ name: normalized accuracy
94
+ source:
95
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT_v5
96
+ name: Open LLM Leaderboard
97
+ - task:
98
+ type: text-generation
99
+ name: Text Generation
100
+ dataset:
101
+ name: HellaSwag (10-Shot)
102
+ type: hellaswag
103
+ split: validation
104
+ args:
105
+ num_few_shot: 10
106
+ metrics:
107
+ - type: acc_norm
108
+ value: 80.09
109
+ name: normalized accuracy
110
+ source:
111
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT_v5
112
+ name: Open LLM Leaderboard
113
+ - task:
114
+ type: text-generation
115
+ name: Text Generation
116
+ dataset:
117
+ name: MMLU (5-Shot)
118
+ type: cais/mmlu
119
+ config: all
120
+ split: test
121
+ args:
122
+ num_few_shot: 5
123
+ metrics:
124
+ - type: acc
125
+ value: 60.19
126
+ name: accuracy
127
+ source:
128
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT_v5
129
+ name: Open LLM Leaderboard
130
+ - task:
131
+ type: text-generation
132
+ name: Text Generation
133
+ dataset:
134
+ name: TruthfulQA (0-shot)
135
+ type: truthful_qa
136
+ config: multiple_choice
137
+ split: validation
138
+ args:
139
+ num_few_shot: 0
140
+ metrics:
141
+ - type: mc2
142
+ value: 51.84
143
+ source:
144
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT_v5
145
+ name: Open LLM Leaderboard
146
+ - task:
147
+ type: text-generation
148
+ name: Text Generation
149
+ dataset:
150
+ name: Winogrande (5-shot)
151
+ type: winogrande
152
+ config: winogrande_xl
153
+ split: validation
154
+ args:
155
+ num_few_shot: 5
156
+ metrics:
157
+ - type: acc
158
+ value: 80.9
159
+ name: accuracy
160
+ source:
161
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT_v5
162
+ name: Open LLM Leaderboard
163
+ - task:
164
+ type: text-generation
165
+ name: Text Generation
166
+ dataset:
167
+ name: GSM8k (5-shot)
168
+ type: gsm8k
169
+ config: main
170
+ split: test
171
+ args:
172
+ num_few_shot: 5
173
+ metrics:
174
+ - type: acc
175
+ value: 8.42
176
+ name: accuracy
177
+ source:
178
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT_v5
179
+ name: Open LLM Leaderboard
180
  ---
181
  The "microsoft/Orca-2-13b" model fully fine-tuned on HuggingFaceH4/no_robots, totally-not-an-llm/EverythingLM-data-V3, mlabonne/guanaco-llama2-1k, OpenAssistant/oasst_top1_2023-08-25, and garage-bAInd/Open-Platypus. This model achieved a test loss of 0.38 on garage-bAInd/Open-Platypus.
182
 
 
193
  top_k = 41
194
 
195
  repetition_penalty = 1.176
196
+
197
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
198
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__Orca-2-13b-SFT_v5)
199
+
200
+ | Metric |Value|
201
+ |---------------------------------|----:|
202
+ |Avg. |56.77|
203
+ |AI2 Reasoning Challenge (25-Shot)|59.22|
204
+ |HellaSwag (10-Shot) |80.09|
205
+ |MMLU (5-Shot) |60.19|
206
+ |TruthfulQA (0-shot) |51.84|
207
+ |Winogrande (5-shot) |80.90|
208
+ |GSM8k (5-shot) | 8.42|
209
+
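
As a quick sanity check on the table above, the `Avg.` row can be compared against the unweighted mean of the six benchmark scores. This is a minimal sketch: the PR does not say how the average is computed, so treating it as a plain arithmetic mean is an assumption, and the `scores` dictionary and `mean_score` helper are hypothetical names used only for illustration.

```python
# Benchmark scores copied from the results table in this PR.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 59.22,
    "HellaSwag (10-Shot)": 80.09,
    "MMLU (5-Shot)": 60.19,
    "TruthfulQA (0-shot)": 51.84,
    "Winogrande (5-shot)": 80.90,
    "GSM8k (5-shot)": 8.42,
}


def mean_score(values):
    """Return the unweighted arithmetic mean of a mapping of scores."""
    # Assumption: every benchmark carries equal weight in the average.
    return sum(values.values()) / len(values)


if __name__ == "__main__":
    print(f"Mean of the six reported scores: {mean_score(scores):.3f}")
```

The mean of the rounded values comes out slightly above the 56.77 shown in the table. A gap in the last digit like this would be consistent with the leaderboard averaging unrounded scores or truncating the result, but the PR itself does not confirm either.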