OpenNLPLab commited on
Commit
d58788d
1 Parent(s): a29ea53

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -13
README.md CHANGED
@@ -89,18 +89,6 @@ In the general domain, we conducted 5-shot tests on the following datasets:
89
 
90
  | Model | PS | T | BoolQ | PIQA | HS | WG | ARC-e | ARC-c | OBQA | MMLU | CMMLU | C-Eval |
91
  |-------------|------|------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
92
- | OPT | 0.35 | 0.30 | 57.74 | 64.58 | 36.69 | 52.49 | 44.02 | 23.89 | 28.20 | 26.02 | 25.34 | 25.71 |
93
- | Pythia | 0.40 | 0.30 | 60.40 | 67.08 | 40.52 | 53.59 | 51.81 | 24.15 | 29.40 | 25.99 | 25.16 | 24.81 |
94
- | BLOOM | 0.56 | 0.35 | 55.14 | 64.09 | 36.97 | 52.80 | 47.35 | 23.98 | 28.20 | 24.80 | 25.35 | 27.14 |
95
- | RWKV | 0.43 | - | - | 67.52 | 40.90 | 51.14 | 52.86 | 25.17 | 32.40 | 24.85 | - | - |
96
- | **Ours** | 0.39 | 1.0 | 62.14 | 66.70 | 46.27 | 54.46 | 55.43 | 27.99 | 32.40 | 25.90 | 25.05 | 25.24 |
97
- | GPT-Neo | 1.3 | 0.3 | 61.99 | 71.11 | 48.93 | 54.93 | 56.19 | 25.85 | 33.60 | 24.82 | 26.03 | 23.94 |
98
- | OPT | 1.3 | 0.3 | 57.77 | 71.71 | 53.70 | 59.35 | 57.24 | 29.69 | 33.20 | 24.96 | 24.97 | 25.32 |
99
- | Pythia | 1.4 | 0.3 | 60.73 | 70.67 | 47.18 | 53.51 | 56.99 | 26.88 | 31.40 | 26.55 | 25.13 | 24.25 |
100
- | BLOOM | 1.1 | 0.35 | 59.08 | 67.14 | 42.98 | 54.93 | 51.47 | 25.68 | 29.40 | 27.30 | 25.09 | 26.50 |
101
- | RWKV | 1.5 | - | - | 72.36 | 52.48 | 54.62 | 60.48 | 29.44 | 34.00 | 25.77 | - | - |
102
- | Falcon | 1.0 | 0.35 | 61.38 | 75.14 | 61.50 | 60.30 | 63.38 | 32.17 | 35.60 | 25.28 | 24.88 | 25.66 |
103
- | **Ours** | 1.0 | 1.2 | 63.27 | 72.09 | 56.49 | 60.38 | 63.68 | 35.24 | 36.60 | 27.10 | 25.88 | 26.01 |
104
  | GPT-J | 6.9 | 0.3 | 65.44 | 75.41 | 66.25 | 64.09 | 66.92 | 36.60 | 38.20 | 25.40 | 26.47 | 23.39 |
105
  | OPT | 6.7 | 0.3 | 66.18 | 76.22 | 67.21 | 65.19 | 65.66 | 34.64 | 37.20 | 24.57 | 25.36 | 25.32 |
106
  | Pythia | 6.9 | 0.3 | 63.46 | 75.14 | 63.92 | 60.77 | 67.34 | 35.41 | 37.00 | 24.64 | 25.56 | 26.40 |
@@ -116,7 +104,7 @@ In the general domain, we conducted 5-shot tests on the following datasets:
116
  | OpenLLaMAv2 | 6.7 | 1.0 | 72.20 | 78.84 | 74.51 | 65.67 | 72.39 | 41.30 | 41.00 | 41.29 | 29.58 | 30.01 |
117
  | LLaMA1 | 6.7 | 1.0 | 76.50 | 79.80 | 76.10 | 70.10 | 72.80 | 47.60 | 57.20 | 35.10 | 25.62 | 25.72 |
118
  | LLaMA2 | 6.7 | 2.0 | 77.68 | 78.07 | 76.02 | 68.98 | 76.30 | 46.33 | 44.20 | 45.30 | 32.96 | 33.20 |
119
- | **Ours** | 6.8 | 1.4 | 75.87 | 80.09 | 75.21 | 66.06 | 75.42 | 44.40 | 63.40 | 43.10 | 47.99 | 43.18 |
120
 
121
 
122
  # Inference and Deployment
 
89
 
90
  | Model | PS | T | BoolQ | PIQA | HS | WG | ARC-e | ARC-c | OBQA | MMLU | CMMLU | C-Eval |
91
  |-------------|------|------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
 
 
 
 
 
 
 
 
 
 
 
 
92
  | GPT-J | 6.9 | 0.3 | 65.44 | 75.41 | 66.25 | 64.09 | 66.92 | 36.60 | 38.20 | 25.40 | 26.47 | 23.39 |
93
  | OPT | 6.7 | 0.3 | 66.18 | 76.22 | 67.21 | 65.19 | 65.66 | 34.64 | 37.20 | 24.57 | 25.36 | 25.32 |
94
  | Pythia | 6.9 | 0.3 | 63.46 | 75.14 | 63.92 | 60.77 | 67.34 | 35.41 | 37.00 | 24.64 | 25.56 | 26.40 |
 
104
  | OpenLLaMAv2 | 6.7 | 1.0 | 72.20 | 78.84 | 74.51 | 65.67 | 72.39 | 41.30 | 41.00 | 41.29 | 29.58 | 30.01 |
105
  | LLaMA1 | 6.7 | 1.0 | 76.50 | 79.80 | 76.10 | 70.10 | 72.80 | 47.60 | 57.20 | 35.10 | 25.62 | 25.72 |
106
  | LLaMA2 | 6.7 | 2.0 | 77.68 | 78.07 | 76.02 | 68.98 | 76.30 | 46.33 | 44.20 | 45.30 | 32.96 | 33.20 |
107
+ | **Ours** | 6.8 | 1.4 | 75.11 | 85.47 | 78.61 | 66.93 | 73.11 | 52.99 | 61.60 | 44.90 | 49.32 | 45.01 |
108
 
109
 
110
  # Inference and Deployment