---
datasets:
- cerebras/SlimPajama-627B
language:
- en
tags:
- llama
---

A model with roughly 200M parameters, with the token embedding and language modelling head of Llama2-70b attached through linear transformations from Llama2-70b's 8192-dimensional space down to this model's 1024-dimensional space. I think the parameter count in the graphic below is wrong, but the benchmark values are correct.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6079949388160e14e4e2e499/PhqViTuOrE7s65WyVRpNX.png)
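
As a rough illustration of the layout described above, the following sketch counts the weights in the borrowed Llama2-70b components and the linear maps between the two widths. The 32,000-token vocabulary and the 8192 hidden width are Llama 2's published values; the bias-free projections, the separate up-projection in front of the head, and every name in the code are assumptions made for illustration, not details confirmed by this card. The card also does not say whether the "200M-ish" figure includes any of these weights.

```python
# Back-of-the-envelope weight counts for the borrowed components.
# VOCAB_SIZE and LLAMA_70B_DIM are Llama 2's published values; SMALL_DIM is
# the 1024d width stated in this card. Bias-free projections and a separate
# up-projection before the head are ASSUMPTIONS, not confirmed by the card.
VOCAB_SIZE = 32_000
LLAMA_70B_DIM = 8192
SMALL_DIM = 1024


def borrowed_component_params(vocab=VOCAB_SIZE, big=LLAMA_70B_DIM, small=SMALL_DIM):
    """Return weight counts per component (hypothetical component names)."""
    return {
        "token_embedding": vocab * big,   # Llama2-70b embedding matrix
        "down_projection": big * small,   # 8192d -> 1024d linear map
        "up_projection": small * big,     # 1024d -> 8192d linear map (assumed)
        "lm_head": big * vocab,           # Llama2-70b output head
    }


counts = borrowed_component_params()
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name:>16}: {n:>13,}")
print(f"{'total':>16}: {total:>13,}")
```

Under these assumptions the embedding and head alone account for about 262M weights each, so the roughly 200M figure most plausibly describes the 1024d trunk rather than the full checkpoint.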

|    Tasks     |Version|Filter|n-shot| Metric |Value |   |Stderr|
|--------------|-------|------|-----:|--------|-----:|---|-----:|
|arc_challenge |Yaml   |none  |    25|acc     |0.1775|±  |0.0112|
|              |       |none  |    25|acc_norm|0.2133|±  |0.0120|
|truthfulqa_mc2|Yaml   |none  |     0|acc     |0.4457|±  |0.0152|
|winogrande    |Yaml   |none  |     5|acc     |0.5154|±  |0.0140|
|hellaswag     |Yaml   |none  |    10|acc     |0.2832|±  |0.0045|
|              |       |none  |    10|acc_norm|0.3024|±  |0.0046|

### MMLU

(unweighted average accuracy over the 57 subjects below: 26.17%)

|               Tasks               |Version|Filter|n-shot|Metric|Value |   |Stderr|
|-----------------------------------|-------|------|-----:|------|-----:|---|-----:|
|abstract_algebra                   |Yaml   |none  |     5|acc   |0.2200|±  |0.0416|
|anatomy                            |Yaml   |none  |     5|acc   |0.2222|±  |0.0359|
|astronomy                          |Yaml   |none  |     5|acc   |0.1776|±  |0.0311|
|business_ethics                    |Yaml   |none  |     5|acc   |0.2300|±  |0.0423|
|clinical_knowledge                 |Yaml   |none  |     5|acc   |0.2415|±  |0.0263|
|college_biology                    |Yaml   |none  |     5|acc   |0.3194|±  |0.0390|
|college_chemistry                  |Yaml   |none  |     5|acc   |0.2000|±  |0.0402|
|college_computer_science           |Yaml   |none  |     5|acc   |0.2800|±  |0.0451|
|college_mathematics                |Yaml   |none  |     5|acc   |0.2800|±  |0.0451|
|college_medicine                   |Yaml   |none  |     5|acc   |0.2254|±  |0.0319|
|college_physics                    |Yaml   |none  |     5|acc   |0.2157|±  |0.0409|
|computer_security                  |Yaml   |none  |     5|acc   |0.2200|±  |0.0416|
|conceptual_physics                 |Yaml   |none  |     5|acc   |0.2553|±  |0.0285|
|econometrics                       |Yaml   |none  |     5|acc   |0.2368|±  |0.0400|
|electrical_engineering             |Yaml   |none  |     5|acc   |0.2345|±  |0.0353|
|elementary_mathematics             |Yaml   |none  |     5|acc   |0.2646|±  |0.0227|
|formal_logic                       |Yaml   |none  |     5|acc   |0.2302|±  |0.0376|
|global_facts                       |Yaml   |none  |     5|acc   |0.1700|±  |0.0378|
|high_school_biology                |Yaml   |none  |     5|acc   |0.2903|±  |0.0258|
|high_school_chemistry              |Yaml   |none  |     5|acc   |0.2611|±  |0.0309|
|high_school_computer_science       |Yaml   |none  |     5|acc   |0.2300|±  |0.0423|
|high_school_european_history       |Yaml   |none  |     5|acc   |0.2788|±  |0.0350|
|high_school_geography              |Yaml   |none  |     5|acc   |0.3081|±  |0.0329|
|high_school_government_and_politics|Yaml   |none  |     5|acc   |0.3731|±  |0.0349|
|high_school_macroeconomics         |Yaml   |none  |     5|acc   |0.2923|±  |0.0231|
|high_school_mathematics            |Yaml   |none  |     5|acc   |0.2630|±  |0.0268|
|high_school_microeconomics         |Yaml   |none  |     5|acc   |0.3403|±  |0.0308|
|high_school_physics                |Yaml   |none  |     5|acc   |0.2715|±  |0.0363|
|high_school_psychology             |Yaml   |none  |     5|acc   |0.2881|±  |0.0194|
|high_school_statistics             |Yaml   |none  |     5|acc   |0.4722|±  |0.0340|
|high_school_us_history             |Yaml   |none  |     5|acc   |0.3529|±  |0.0335|
|high_school_world_history          |Yaml   |none  |     5|acc   |0.2532|±  |0.0283|
|human_aging                        |Yaml   |none  |     5|acc   |0.2108|±  |0.0274|
|human_sexuality                    |Yaml   |none  |     5|acc   |0.2672|±  |0.0388|
|international_law                  |Yaml   |none  |     5|acc   |0.2479|±  |0.0394|
|jurisprudence                      |Yaml   |none  |     5|acc   |0.2500|±  |0.0419|
|logical_fallacies                  |Yaml   |none  |     5|acc   |0.2393|±  |0.0335|
|machine_learning                   |Yaml   |none  |     5|acc   |0.2946|±  |0.0433|
|management                         |Yaml   |none  |     5|acc   |0.1650|±  |0.0368|
|marketing                          |Yaml   |none  |     5|acc   |0.1923|±  |0.0258|
|medical_genetics                   |Yaml   |none  |     5|acc   |0.3000|±  |0.0461|
|miscellaneous                      |Yaml   |none  |     5|acc   |0.2720|±  |0.0159|
|moral_disputes                     |Yaml   |none  |     5|acc   |0.1936|±  |0.0213|
|moral_scenarios                    |Yaml   |none  |     5|acc   |0.2380|±  |0.0142|
|nutrition                          |Yaml   |none  |     5|acc   |0.2484|±  |0.0247|
|philosophy                         |Yaml   |none  |     5|acc   |0.2283|±  |0.0238|
|prehistory                         |Yaml   |none  |     5|acc   |0.2346|±  |0.0236|
|professional_accounting            |Yaml   |none  |     5|acc   |0.2589|±  |0.0261|
|professional_law                   |Yaml   |none  |     5|acc   |0.2445|±  |0.0110|
|professional_medicine              |Yaml   |none  |     5|acc   |0.4485|±  |0.0302|
|professional_psychology            |Yaml   |none  |     5|acc   |0.2614|±  |0.0178|
|public_relations                   |Yaml   |none  |     5|acc   |0.2364|±  |0.0407|
|security_studies                   |Yaml   |none  |     5|acc   |0.4000|±  |0.0314|
|sociology                          |Yaml   |none  |     5|acc   |0.3035|±  |0.0325|
|us_foreign_policy                  |Yaml   |none  |     5|acc   |0.2800|±  |0.0451|
|virology                           |Yaml   |none  |     5|acc   |0.2048|±  |0.0314|
|world_religions                    |Yaml   |none  |     5|acc   |0.1988|±  |0.0306|
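
The 26.17% average reported for MMLU matches the unweighted (macro) mean of the 57 per-subject accuracies in the table above. A minimal sketch of that check follows; the values are copied from the table in row order, and the variable names are illustrative only.

```python
from statistics import fmean

# Per-subject MMLU accuracies, copied from the table in row order
# (abstract_algebra first, world_religions last).
MMLU_ACC = [
    0.2200, 0.2222, 0.1776, 0.2300, 0.2415, 0.3194, 0.2000, 0.2800,
    0.2800, 0.2254, 0.2157, 0.2200, 0.2553, 0.2368, 0.2345, 0.2646,
    0.2302, 0.1700, 0.2903, 0.2611, 0.2300, 0.2788, 0.3081, 0.3731,
    0.2923, 0.2630, 0.3403, 0.2715, 0.2881, 0.4722, 0.3529, 0.2532,
    0.2108, 0.2672, 0.2479, 0.2500, 0.2393, 0.2946, 0.1650, 0.1923,
    0.3000, 0.2720, 0.1936, 0.2380, 0.2484, 0.2283, 0.2346, 0.2589,
    0.2445, 0.4485, 0.2614, 0.2364, 0.4000, 0.3035, 0.2800, 0.2048,
    0.1988,
]

# Macro average: every subject counts equally, regardless of how many
# questions it contains.
macro_avg = fmean(MMLU_ACC)

print(f"{len(MMLU_ACC)} subjects, macro average = {macro_avg:.2%}")
# prints "57 subjects, macro average = 26.17%"
```

Because each subject is weighted equally, this figure can differ from a per-question (micro) average, which would weight larger subjects such as professional_law more heavily. MMLU questions have four answer options, so chance level is 25%.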