Update README.md
README.md
CHANGED
@@ -27,7 +27,10 @@ GodziLLa 2 70B is an experimental combination of various proprietary LoRAs from
 | Winogrande (5-shot) | 83.19 |
 | GSM8K (5-shot) | 43.21 |
 | DROP (3-shot) | 52.31 |
-| Average
+| Average (w/ DROP) | 67.01 |
+| Average (w/o DROP) | 69.46 |
+
+Note: As of December 1, 2023, [DROP](https://arxiv.org/abs/1903.00161) is removed from the leaderboard benchmarks.
 
 According to the leaderboard description, here are the benchmarks used for the evaluation:
 - [MMLU](https://arxiv.org/abs/2009.03300) (5-shot) - a test to measure a text model’s multitask accuracy. The test covers 57 tasks including elementary mathematics, US history, computer science, law, and more.
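
The two averages added in this change can be cross-checked against each other. The sketch below is only an illustration: it assumes the "w/ DROP" figure is an unweighted mean over seven benchmarks (only three of the individual scores appear in this hunk, so the count is an assumption and not stated in the diff), and the helper name `average_without` is hypothetical.

```python
def average_without(avg_with: float, removed_score: float, n_with: int) -> float:
    """Recompute an unweighted mean after dropping one score.

    avg_with      -- mean over all n_with scores
    removed_score -- the score being dropped
    n_with        -- number of scores in the original mean (assumed, see above)
    """
    total = avg_with * n_with          # recover the sum of all scores
    return (total - removed_score) / (n_with - 1)


# Values taken from the table rows in this change; n_with=7 is an assumption.
avg_without_drop = average_without(avg_with=67.01, removed_score=52.31, n_with=7)
print(round(avg_without_drop, 2))
```

Under that assumption the result agrees with the 69.46 reported in the "Average (w/o DROP)" row. Because the inputs are themselves rounded to two decimals, agreement to the last digit is not guaranteed in general.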