jaspercatapang committed on
Commit
78102fc
1 Parent(s): a7e4cce

Update README.md

Files changed (1)
  1. README.md +4 -1
README.md CHANGED
@@ -27,7 +27,10 @@ GodziLLa 2 70B is an experimental combination of various proprietary LoRAs from
  | Winogrande (5-shot) | 83.19 |
  | GSM8K (5-shot) | 43.21 |
  | DROP (3-shot) | 52.31 |
- | Average | 67.01 |
+ | Average (w/ DROP) | 67.01 |
+ | Average (w/o DROP) | 69.46 |
+
+ Note: As of December 1, 2023, [DROP](https://arxiv.org/abs/1903.00161) is removed from the leaderboard benchmarks.

  According to the leaderboard description, here are the benchmarks used for the evaluation:
  - [MMLU](https://arxiv.org/abs/2009.03300) (5-shot) - a test to measure a text model’s multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more.