Commit 19561b0
1 Parent(s): 8eb300d

Adding Evaluation Results (#1)

- Adding Evaluation Results (269fe96ca7b06676c4787b14fa0023a7e65f28b5)


Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +158 -46
README.md CHANGED
@@ -1,58 +1,157 @@
1
  ---
2
- base_model: vihangd/shearedplats-2.7b-v2
3
  datasets:
4
  - mwitiderrick/OpenPlatypus
 
5
  inference: true
6
  model_type: llama
7
- prompt_template: |
8
- ### Instruction:\n
9
  {prompt}
 
10
  ### Response:
 
 
11
  created_by: mwitiderrick
12
- tags:
13
- - transformers
14
- license: apache-2.0
15
- language:
16
- - en
17
- library_name: transformers
18
  pipeline_tag: text-generation
19
-
20
  model-index:
21
- - name: mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
22
- results:
23
- - task:
24
- type: text-generation
25
- dataset:
26
- name: hellaswag
27
- type: hellaswag
28
- metrics:
29
- - name: hellaswag(0-Shot)
30
- type: hellaswag (0-Shot)
31
- value: 0.5283
32
- - task:
33
- type: text-generation
34
- dataset:
35
- name: winogrande
36
- type: winogrande
37
- metrics:
38
- - name: winogrande(0-Shot)
39
- type: winogrande (0-Shot)
40
- value: 0.6464
41
-
42
- - task:
43
- type: text-generation
44
- dataset:
45
- name: arc_challenge
46
- type: arc_challenge
47
- metrics:
48
- - name: arc_challenge(0-Shot)
49
- type: arc_challenge (0-Shot)
50
- value: 0.3652
51
- source:
52
- name: shearedplats-2.7b-v2-instruct-v0.1 model card
53
- url: https://huggingface.co/mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
54
-
55
-
56
  ---
57
  # ShearedPlats-7b Instruct
58
 
@@ -138,4 +237,17 @@ Enjoy your sweet chicken buggers!
138
  |-------------|-------|------|-----:|--------|-----:|---|-----:|
139
  |arc_challenge|Yaml |none | 0|acc |0.3652|± |0.0141|
140
  | | |none | 0|acc_norm|0.3908|± |0.0143|
141
- ```
1
  ---
2
+ language:
3
+ - en
4
+ license: apache-2.0
5
+ library_name: transformers
6
+ tags:
7
+ - transformers
8
  datasets:
9
  - mwitiderrick/OpenPlatypus
10
+ base_model: vihangd/shearedplats-2.7b-v2
11
  inference: true
12
  model_type: llama
13
+ prompt_template: '### Instruction:\n
14
+
15
  {prompt}
16
+
17
  ### Response:
18
+
19
+ '
20
  created_by: mwitiderrick
21
  pipeline_tag: text-generation
 
22
  model-index:
23
+ - name: mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
24
+ results:
25
+ - task:
26
+ type: text-generation
27
+ dataset:
28
+ name: hellaswag
29
+ type: hellaswag
30
+ metrics:
31
+ - type: hellaswag (0-Shot)
32
+ value: 0.5283
33
+ name: hellaswag(0-Shot)
34
+ - task:
35
+ type: text-generation
36
+ dataset:
37
+ name: winogrande
38
+ type: winogrande
39
+ metrics:
40
+ - type: winogrande (0-Shot)
41
+ value: 0.6464
42
+ name: winogrande(0-Shot)
43
+ - task:
44
+ type: text-generation
45
+ dataset:
46
+ name: arc_challenge
47
+ type: arc_challenge
48
+ metrics:
49
+ - type: arc_challenge (0-Shot)
50
+ value: 0.3652
51
+ name: arc_challenge(0-Shot)
52
+ source:
53
+ url: https://huggingface.co/mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
54
+ name: shearedplats-2.7b-v2-instruct-v0.1 model card
55
+ - task:
56
+ type: text-generation
57
+ name: Text Generation
58
+ dataset:
59
+ name: AI2 Reasoning Challenge (25-Shot)
60
+ type: ai2_arc
61
+ config: ARC-Challenge
62
+ split: test
63
+ args:
64
+ num_few_shot: 25
65
+ metrics:
66
+ - type: acc_norm
67
+ value: 40.19
68
+ name: normalized accuracy
69
+ source:
70
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
71
+ name: Open LLM Leaderboard
72
+ - task:
73
+ type: text-generation
74
+ name: Text Generation
75
+ dataset:
76
+ name: HellaSwag (10-Shot)
77
+ type: hellaswag
78
+ split: validation
79
+ args:
80
+ num_few_shot: 10
81
+ metrics:
82
+ - type: acc_norm
83
+ value: 70.08
84
+ name: normalized accuracy
85
+ source:
86
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
87
+ name: Open LLM Leaderboard
88
+ - task:
89
+ type: text-generation
90
+ name: Text Generation
91
+ dataset:
92
+ name: MMLU (5-Shot)
93
+ type: cais/mmlu
94
+ config: all
95
+ split: test
96
+ args:
97
+ num_few_shot: 5
98
+ metrics:
99
+ - type: acc
100
+ value: 28.12
101
+ name: accuracy
102
+ source:
103
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
104
+ name: Open LLM Leaderboard
105
+ - task:
106
+ type: text-generation
107
+ name: Text Generation
108
+ dataset:
109
+ name: TruthfulQA (0-shot)
110
+ type: truthful_qa
111
+ config: multiple_choice
112
+ split: validation
113
+ args:
114
+ num_few_shot: 0
115
+ metrics:
116
+ - type: mc2
117
+ value: 41.23
118
+ source:
119
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
120
+ name: Open LLM Leaderboard
121
+ - task:
122
+ type: text-generation
123
+ name: Text Generation
124
+ dataset:
125
+ name: Winogrande (5-shot)
126
+ type: winogrande
127
+ config: winogrande_xl
128
+ split: validation
129
+ args:
130
+ num_few_shot: 5
131
+ metrics:
132
+ - type: acc
133
+ value: 65.04
134
+ name: accuracy
135
+ source:
136
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
137
+ name: Open LLM Leaderboard
138
+ - task:
139
+ type: text-generation
140
+ name: Text Generation
141
+ dataset:
142
+ name: GSM8k (5-shot)
143
+ type: gsm8k
144
+ config: main
145
+ split: test
146
+ args:
147
+ num_few_shot: 5
148
+ metrics:
149
+ - type: acc
150
+ value: 2.12
151
+ name: accuracy
152
+ source:
153
+ url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=mwitiderrick/shearedplats-2.7b-v2-instruct-v0.1
154
+ name: Open LLM Leaderboard
155
  ---
156
  # ShearedPlats-7b Instruct
157
 
 
237
  |-------------|-------|------|-----:|--------|-----:|---|-----:|
238
  |arc_challenge|Yaml |none | 0|acc |0.3652|± |0.0141|
239
  | | |none | 0|acc_norm|0.3908|± |0.0143|
240
+ ```
241
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
242
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mwitiderrick__shearedplats-2.7b-v2-instruct-v0.1)
243
+
244
+ | Metric |Value|
245
+ |---------------------------------|----:|
246
+ |Avg. |41.13|
247
+ |AI2 Reasoning Challenge (25-Shot)|40.19|
248
+ |HellaSwag (10-Shot) |70.08|
249
+ |MMLU (5-Shot) |28.12|
250
+ |TruthfulQA (0-shot) |41.23|
251
+ |Winogrande (5-shot) |65.04|
252
+ |GSM8k (5-shot) | 2.12|
253
+
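
The `Avg.` row in the table added above can be sanity-checked against the six benchmark rows. The sketch below assumes the average is a plain unweighted mean of those six scores; the diff does not state how it was computed, and `leaderboard_average` is a hypothetical helper name used only for illustration.

```python
# Scores copied from the leaderboard table in this diff (percentages).
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 40.19,
    "HellaSwag (10-Shot)": 70.08,
    "MMLU (5-Shot)": 28.12,
    "TruthfulQA (0-shot)": 41.23,
    "Winogrande (5-shot)": 65.04,
    "GSM8k (5-shot)": 2.12,
}


def leaderboard_average(results):
    """Unweighted mean of the scores, rounded to two decimals.

    Assumption: the table's "Avg." row is a simple mean of the six rows.
    """
    return round(sum(results.values()) / len(results), 2)


average = leaderboard_average(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 41.13"
```

Under that assumption the result matches the 41.13 reported in the table. Note that these leaderboard values are percentages from few-shot runs, while the 0-shot values kept in the `model-index` (0.5283, 0.6464, 0.3652) are fractions, so the two sets are not directly comparable.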