leaderboard-pr-bot committed on
Commit
0feafe0
1 Parent(s): 9d34981

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
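Once merged, the `model-index` block added below becomes machine-readable card metadata on the Hub. As a minimal sketch of reading those results back programmatically, assuming a recent `huggingface_hub` release with `ModelCard` parsing:

```python
from huggingface_hub import ModelCard

# Load the model card (YAML front matter included) from the Hub.
card = ModelCard.load("aloobun/Cypher-Mini-1.8B")

# eval_results is populated from the model-index block; it is None when the
# card carries no model-index metadata, hence the `or []` guard.
for result in card.data.eval_results or []:
    print(f"{result.dataset_name}: {result.metric_type} = {result.metric_value}")
```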

Files changed (1)
1. README.md +120 -3
README.md CHANGED
@@ -1,14 +1,117 @@
  ---
- library_name: transformers
  license: apache-2.0
- datasets:
- - Locutusque/Hercules-v3.0
+ library_name: transformers
  tags:
  - finetune
  - gpt4
  - synthetic data
  - custom_code
  - h2oai
+ datasets:
+ - Locutusque/Hercules-v3.0
+ model-index:
+ - name: Cypher-Mini-1.8B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 39.59
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/Cypher-Mini-1.8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 67.45
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/Cypher-Mini-1.8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 31.14
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/Cypher-Mini-1.8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 40.44
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/Cypher-Mini-1.8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 65.19
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/Cypher-Mini-1.8B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 14.48
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=aloobun/Cypher-Mini-1.8B
+       name: Open LLM Leaderboard
  ---
  
  ![Cypher aloobun h2oai1.8B](https://i.imgur.com/2R6f4EX.jpeg)
@@ -87,3 +190,17 @@ op = model.generate(
  >The consequence of using the appeal to authority fallacy is that it often leads to hasty conclusions and misinformation. It can be difficult to separate fact from fiction, especially when people rely on authority figures to make decisions. As a result, individuals may end up making poor choices based on incomplete information. This can lead to unintended consequences, such as harming oneself or others.
  >
  >To avoid falling prey to the appeal to authority fallacy, it is important to seek out multiple sources of information and consider all available evidence before making a decision. This can help individuals make more informed choices and reduce the likelihood of being swayed by unsubstantiated claims.</s>
+ 
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_aloobun__Cypher-Mini-1.8B)
+ 
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |43.05|
+ |AI2 Reasoning Challenge (25-Shot)|39.59|
+ |HellaSwag (10-Shot) |67.45|
+ |MMLU (5-Shot) |31.14|
+ |TruthfulQA (0-shot) |40.44|
+ |Winogrande (5-shot) |65.19|
+ |GSM8k (5-shot) |14.48|
+ 
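As a sanity check, the `Avg.` value added above is the unweighted arithmetic mean of the six benchmark scores. A minimal sketch reproducing it; the `datasets` lookup at the end is an assumption about how the linked details repository is organized (config names vary across harness versions):

```python
# Reproduce the Avg. row from the six per-benchmark scores reported above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 39.59,
    "HellaSwag (10-Shot)": 67.45,
    "MMLU (5-Shot)": 31.14,
    "TruthfulQA (0-shot)": 40.44,
    "Winogrande (5-shot)": 65.19,
    "GSM8k (5-shot)": 14.48,
}
print(round(sum(scores.values()) / len(scores), 2))  # -> 43.05

# Per-sample predictions live in the linked details dataset; list its configs
# first, since their exact names depend on the evaluation harness version.
from datasets import get_dataset_config_names

print(get_dataset_config_names(
    "open-llm-leaderboard/details_aloobun__Cypher-Mini-1.8B"
))
```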