angelahzyuan committed on
Commit
e75b851
1 Parent(s): 3829d99

Update README.md

Files changed (1)
  1. README.md +15 -12
README.md CHANGED
@@ -142,6 +142,21 @@ Results are reported by using [lm-evaluation-harness](https://github.com/Eleuthe
  |[Llama-3-8B-SPPO Iter2](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter2) | 64.93 | 56.48 | 76.87 | 75.13 | 80.39 | 65.67 | 69.91
  |[Llama-3-8B-SPPO Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3) | 65.19 | 58.04 | 77.11 | 74.91 | 80.86 | 65.60 | **70.29**
 
  ### Training hyperparameters
  The following hyperparameters were used during training:
 
@@ -171,16 +186,4 @@ The following hyperparameters were used during training:
  primaryClass={cs.LG}
  }
  ```
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_UCLA-AGI__Llama-3-Instruct-8B-SPPO-Iter3)
-
- | Metric |Value|
- |-------------------|----:|
- |Avg. |23.68|
- |IFEval (0-Shot) |68.28|
- |BBH (3-Shot) |29.74|
- |MATH Lvl 5 (4-Shot)| 7.33|
- |GPQA (0-shot) | 2.01|
- |MuSR (0-shot) | 3.09|
- |MMLU-PRO (5-shot) |29.38|
 
 
  |[Llama-3-8B-SPPO Iter2](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter2) | 64.93 | 56.48 | 76.87 | 75.13 | 80.39 | 65.67 | 69.91
  |[Llama-3-8B-SPPO Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3) | 65.19 | 58.04 | 77.11 | 74.91 | 80.86 | 65.60 | **70.29**
 
+
+ # [Open LLM Leaderboard 2 Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_UCLA-AGI__Llama-3-Instruct-8B-SPPO-Iter3)
+
+ | Metric |Value|
+ |-------------------|----:|
+ |Avg. |23.68|
+ |IFEval (0-Shot) |68.28|
+ |BBH (3-Shot) |29.74|
+ |MATH Lvl 5 (4-Shot)| 7.33|
+ |GPQA (0-shot) | 2.01|
+ |MuSR (0-shot) | 3.09|
+ |MMLU-PRO (5-shot) |29.38|
+
+
  ### Training hyperparameters
  The following hyperparameters were used during training:
 
 
  primaryClass={cs.LG}
  }
  ```