Files changed (1)
  1. README.md +120 -7
README.md CHANGED
@@ -1,16 +1,16 @@
  ---
+ license: bigcode-openrail-m
+ library_name: transformers
+ tags:
+ - code
+ datasets:
+ - bigcode/the-stack-v2-train
  pipeline_tag: text-generation
  inference: true
  widget:
  - text: 'def print_hello_world():'
    example_title: Hello world
    group: Python
- datasets:
- - bigcode/the-stack-v2-train
- license: bigcode-openrail-m
- library_name: transformers
- tags:
- - code
  model-index:
  - name: starcoder2-3b
    results:
@@ -62,6 +62,106 @@ model-index:
      metrics:
      - type: edit-smiliarity
        value: 71.19
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 34.56
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bigcode/starcoder2-3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 47.62
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bigcode/starcoder2-3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 38.65
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bigcode/starcoder2-3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 40.49
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bigcode/starcoder2-3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 54.54
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bigcode/starcoder2-3b
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 19.64
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=bigcode/starcoder2-3b
+       name: Open LLM Leaderboard
  ---

  # StarCoder2
@@ -211,4 +311,17 @@ The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can
  archivePrefix={arXiv},
  primaryClass={cs.SE}
  }
- ```
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_bigcode__starcoder2-3b)
+ 
+ | Metric |Value|
+ |---------------------------------|----:|
+ |Avg. |39.25|
+ |AI2 Reasoning Challenge (25-Shot)|34.56|
+ |HellaSwag (10-Shot) |47.62|
+ |MMLU (5-Shot) |38.65|
+ |TruthfulQA (0-shot) |40.49|
+ |Winogrande (5-shot) |54.54|
+ |GSM8k (5-shot) |19.64|
+ 
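As a sanity check on the diff above, the `Avg.` row added to the table is the unweighted mean of the six benchmark scores. A minimal sketch, using the values from the table and assuming the leaderboard's two-decimal rounding convention:

```python
# Recompute the "Avg." row from the six per-benchmark scores
# introduced in this diff (values copied from the table above).
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 34.56,
    "HellaSwag (10-Shot)": 47.62,
    "MMLU (5-Shot)": 38.65,
    "TruthfulQA (0-shot)": 40.49,
    "Winogrande (5-shot)": 54.54,
    "GSM8k (5-shot)": 19.64,
}

# Unweighted mean over the six benchmarks, rounded to two decimals.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # 39.25
```

The sum is 235.50, so the mean comes out to exactly 39.25, matching the `Avg.` row.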