leaderboard-pr-bot committed on
Commit
9f7c81b
1 Parent(s): f3491a4

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +178 -11
README.md CHANGED
@@ -1,5 +1,4 @@
1
  ---
2
- license: other
3
  language:
4
  - en
5
  - fr
@@ -7,21 +6,176 @@ language:
7
  - hi
8
  - zh
9
  - code
10
- base_model: microsoft/Orca-2-13b
11
  datasets:
12
  - HuggingFaceH4/no_robots
13
  - mlabonne/guanaco-llama2-1k
14
  - OpenAssistant/oasst_top1_2023-08-25
15
  - totally-not-an-llm/EverythingLM-data-V3
 
16
  widget:
17
- - example_title: "folklore"
18
- text: "<|im_start|>user\nAre the stories of Nahanni Valley, like the decapitated men, true?\n<|im_end|>\n<|im_start|>assistant\n"
19
- output:
20
- text: "The Nahanni Valley in Canada is a remote and rugged region that has long been shrouded in mystery. It is home to the South Nahanni River, which was designated as a UNESCO World Heritage Site due to its exceptional natural beauty and biodiversity. The valley's history includes tales of gold prospectors, fur traders, and indigenous populations, but it also features some unexplained events and legends. One such story involves two prospectors who allegedly encountered an unknown creature that led them into the forest before returning without their heads. This tale has been passed down through generations, but there are no definitive answers regarding its veracity.\n\nIt is important to note that these stories may have originated from local folklore or were exaggerated over time. There is no concrete evidence to support the existence of supernatural creatures or paranormal occurrences within the Nahanni Valley. Nonetheless, the area remains a captivating destination for those seeking adventure and exploration. If you have any questions about this topic or need assistance related to traveling to the Nahanni Valley, please feel free to ask!"
21
- - example_title: "Alan Turing"
22
- text: "<|im_start|>user\nWhy was Alan Turing castrated?\n<|im_end|>\n<|im_start|>assistant\n"
23
- output:
24
- text: "Alan Turing, a renowned mathematician and computer scientist who played a crucial role in breaking the German Enigma code during World War II, faced significant challenges due to his homosexuality. In 1952, he was convicted of \"gross indecency\" for engaging in a consensual relationship with another man, which led to him being sentenced to hormone therapy designed to reduce his sex drive. This treatment included the administration of estrogen, which had the side effect of causing breast growth.\n\nIn an effort to avoid this unwanted physical change, Turing requested that his doctors cease the hormone treatments. However, they ignored his request and continued the therapy without his consent. As a result, Turing's breasts began to grow, leading him to undergo surgery to remove them. The operation, known as a mastectomy, caused considerable pain and discomfort, further contributing to his emotional turmoil.\n\nIt is important to note that Turing's castration was not directly carried out by the authorities but rather resulted from the unintended consequences of the forced hormone therapy."
25
  ---
26
  The "microsoft/Orca-2-13b" model fully fine-tuned on HuggingFaceH4/no_robots, totally-not-an-llm/EverythingLM-data-V3, mlabonne/guanaco-llama2-1k, and OpenAssistant/oasst_top1_2023-08-25. This model achieved a test loss of 0.18.
27
 
@@ -41,4 +195,17 @@ repetition_penalty = 1.176
41
 
42
  # Updates:
43
 
44
- 12/18/23 - 🔥 **This model holds the #5 position on the Open LLM Leaderboard among llama2-13b models.** 🔥
 
1
  ---
 
2
  language:
3
  - en
4
  - fr
 
6
  - hi
7
  - zh
8
  - code
9
+ license: other
10
  datasets:
11
  - HuggingFaceH4/no_robots
12
  - mlabonne/guanaco-llama2-1k
13
  - OpenAssistant/oasst_top1_2023-08-25
14
  - totally-not-an-llm/EverythingLM-data-V3
15
+ base_model: microsoft/Orca-2-13b
16
  widget:
17
+ - example_title: folklore
18
+ text: '<|im_start|>user
19
+
20
+ Are the stories of Nahanni Valley, like the decapitated men, true?
21
+
22
+ <|im_end|>
23
+
24
+ <|im_start|>assistant
25
+
26
+ '
27
+ output:
28
+ text: 'The Nahanni Valley in Canada is a remote and rugged region that has long
29
+ been shrouded in mystery. It is home to the South Nahanni River, which was designated
30
+ as a UNESCO World Heritage Site due to its exceptional natural beauty and biodiversity.
31
+ The valley''s history includes tales of gold prospectors, fur traders, and indigenous
32
+ populations, but it also features some unexplained events and legends. One such
33
+ story involves two prospectors who allegedly encountered an unknown creature
34
+ that led them into the forest before returning without their heads. This tale
35
+ has been passed down through generations, but there are no definitive answers
36
+ regarding its veracity.
37
+
38
+
39
+ It is important to note that these stories may have originated from local folklore
40
+ or were exaggerated over time. There is no concrete evidence to support the
41
+ existence of supernatural creatures or paranormal occurrences within the Nahanni
42
+ Valley. Nonetheless, the area remains a captivating destination for those seeking
43
+ adventure and exploration. If you have any questions about this topic or need
44
+ assistance related to traveling to the Nahanni Valley, please feel free to ask!'
45
+ - example_title: Alan Turing
46
+ text: '<|im_start|>user
47
+
48
+ Why was Alan Turing castrated?
49
+
50
+ <|im_end|>
51
+
52
+ <|im_start|>assistant
53
+
54
+ '
55
+ output:
56
+ text: 'Alan Turing, a renowned mathematician and computer scientist who played
57
+ a crucial role in breaking the German Enigma code during World War II, faced
58
+ significant challenges due to his homosexuality. In 1952, he was convicted of
59
+ "gross indecency" for engaging in a consensual relationship with another man,
60
+ which led to him being sentenced to hormone therapy designed to reduce his sex
61
+ drive. This treatment included the administration of estrogen, which had the
62
+ side effect of causing breast growth.
63
+
64
+
65
+ In an effort to avoid this unwanted physical change, Turing requested that his
66
+ doctors cease the hormone treatments. However, they ignored his request and
67
+ continued the therapy without his consent. As a result, Turing''s breasts began
68
+ to grow, leading him to undergo surgery to remove them. The operation, known
69
+ as a mastectomy, caused considerable pain and discomfort, further contributing
70
+ to his emotional turmoil.
71
+
72
+
73
+ It is important to note that Turing''s castration was not directly carried out
74
+ by the authorities but rather resulted from the unintended consequences of the
75
+ forced hormone therapy.'
76
+ model-index:
77
+ - name: Orca-2-13b-SFT-v4
78
+ results:
79
+ - task:
80
+ type: text-generation
81
+ name: Text Generation
82
+ dataset:
83
+ name: AI2 Reasoning Challenge (25-Shot)
84
+ type: ai2_arc
85
+ config: ARC-Challenge
86
+ split: test
87
+ args:
88
+ num_few_shot: 25
89
+ metrics:
90
+ - type: acc_norm
91
+ value: 59.22
92
+ name: normalized accuracy
93
+ source:
94
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v4
95
+ name: Open LLM Leaderboard
96
+ - task:
97
+ type: text-generation
98
+ name: Text Generation
99
+ dataset:
100
+ name: HellaSwag (10-Shot)
101
+ type: hellaswag
102
+ split: validation
103
+ args:
104
+ num_few_shot: 10
105
+ metrics:
106
+ - type: acc_norm
107
+ value: 79.58
108
+ name: normalized accuracy
109
+ source:
110
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v4
111
+ name: Open LLM Leaderboard
112
+ - task:
113
+ type: text-generation
114
+ name: Text Generation
115
+ dataset:
116
+ name: MMLU (5-Shot)
117
+ type: cais/mmlu
118
+ config: all
119
+ split: test
120
+ args:
121
+ num_few_shot: 5
122
+ metrics:
123
+ - type: acc
124
+ value: 60.23
125
+ name: accuracy
126
+ source:
127
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v4
128
+ name: Open LLM Leaderboard
129
+ - task:
130
+ type: text-generation
131
+ name: Text Generation
132
+ dataset:
133
+ name: TruthfulQA (0-shot)
134
+ type: truthful_qa
135
+ config: multiple_choice
136
+ split: validation
137
+ args:
138
+ num_few_shot: 0
139
+ metrics:
140
+ - type: mc2
141
+ value: 51.15
142
+ source:
143
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v4
144
+ name: Open LLM Leaderboard
145
+ - task:
146
+ type: text-generation
147
+ name: Text Generation
148
+ dataset:
149
+ name: Winogrande (5-shot)
150
+ type: winogrande
151
+ config: winogrande_xl
152
+ split: validation
153
+ args:
154
+ num_few_shot: 5
155
+ metrics:
156
+ - type: acc
157
+ value: 80.03
158
+ name: accuracy
159
+ source:
160
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v4
161
+ name: Open LLM Leaderboard
162
+ - task:
163
+ type: text-generation
164
+ name: Text Generation
165
+ dataset:
166
+ name: GSM8k (5-shot)
167
+ type: gsm8k
168
+ config: main
169
+ split: test
170
+ args:
171
+ num_few_shot: 5
172
+ metrics:
173
+ - type: acc
174
+ value: 28.28
175
+ name: accuracy
176
+ source:
177
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Locutusque/Orca-2-13b-SFT-v4
178
+ name: Open LLM Leaderboard
179
  ---
180
  The "microsoft/Orca-2-13b" model fully fine-tuned on HuggingFaceH4/no_robots, totally-not-an-llm/EverythingLM-data-V3, mlabonne/guanaco-llama2-1k, and OpenAssistant/oasst_top1_2023-08-25. This model achieved a test loss of 0.18.
181
 
 
195
 
196
  # Updates:
197
 
198
+ 12/18/23 - 🔥 **This model holds the #5 position on the Open LLM Leaderboard among llama2-13b models.** 🔥
199
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
200
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Locutusque__Orca-2-13b-SFT-v4)
201
+
202
+ | Metric |Value|
203
+ |---------------------------------|----:|
204
+ |Avg. |59.75|
205
+ |AI2 Reasoning Challenge (25-Shot)|59.22|
206
+ |HellaSwag (10-Shot) |79.58|
207
+ |MMLU (5-Shot) |60.23|
208
+ |TruthfulQA (0-shot) |51.15|
209
+ |Winogrande (5-shot) |80.03|
210
+ |GSM8k (5-shot) |28.28|
211
+
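
The `Avg.` row in the results table is consistent with an unweighted arithmetic mean of the six benchmark scores. The sketch below checks that arithmetic, assuming a plain mean rounded to two decimals; the names `SCORES` and `leaderboard_average` are illustrative and are not part of any leaderboard tooling.

```python
# Benchmark scores copied from the results table in this PR.
SCORES = {
    "AI2 Reasoning Challenge (25-Shot)": 59.22,
    "HellaSwag (10-Shot)": 79.58,
    "MMLU (5-Shot)": 60.23,
    "TruthfulQA (0-shot)": 51.15,
    "Winogrande (5-shot)": 80.03,
    "GSM8k (5-shot)": 28.28,
}


def leaderboard_average(scores):
    """Return the unweighted mean of the scores, rounded to two decimals.

    Assumption: the leaderboard's "Avg." is a plain mean of the six
    benchmarks; this helper is hypothetical and only verifies the table.
    """
    return round(sum(scores.values()) / len(scores), 2)


if __name__ == "__main__":
    print(leaderboard_average(SCORES))  # 59.75, matching the Avg. row
```

The six values sum to 358.49, and 358.49 / 6 ≈ 59.748, which rounds to the 59.75 reported in the table.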