Please refer to our [Github page](https://github.com/GreenBitAI/low_bit_llama) for the code to run the model and more information.
## Model Description

- **Developed by:** [GreenBitAI](https://github.com/GreenBitAI)
- **Evaluated by:** 01-Yi official
- **Model type:** Causal language model (Llama 2 / Yi-6B architecture)
- **Language(s) (NLP):** English
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
## Few-Shot Evaluation

| Model | Yi-34B | Yi-34B | Yi-6B | Yi-6B |
| --- | --- | --- | --- | --- |
| Bit | 16 | 4 | 16 | 4 |
| GroupSize | - | 32 | - | 32 |
| Model Size (GB) | 68.79 | 19.89 | 12.12 | 4.04 |
| AVG | 70.64 | 69.7 | 60.11 | 59.14 |
| **Detailed Evaluation** | | | | |
| MMLU | 76.32 | 75.42 | 63.24 | 62.09 |
| CMMLU | 83.65 | 83.07 | 75.53 | 72.85 |
| ARC-e | 84.42 | 84.13 | 77.23 | 76.52 |
| ARC-c | 61.77 | 59.56 | 50.34 | 48.47 |
| GAOKAO | 82.8 | 81.37 | 72.2 | 72.87 |
| GSM8K | 67.24 | 63.61 | 32.52 | 28.05 |
| HumanEval | 25.6 | 25 | 15.85 | 15.85 |
| BBH | 54.3 | 52.3 | 42.8 | 41.47 |
| WinoGrande | 78.68 | 78.53 | 70.63 | 71.19 |
| PIQA | 82.86 | 82.75 | 78.56 | 79.05 |
| SIQA | 74.46 | 73.44 | 64.53 | 64.53 |
| HellaSwag | 83.64 | 83.02 | 74.91 | 73.27 |
| OBQA | 91.6 | 90.8 | 85.4 | 82.6 |
| CSQA | 83.37 | 83.05 | 76.9 | 75.43 |
| TriviaQA | 81.52 | 80.73 | 64.85 | 61.75 |
| SQuAD | 92.46 | 91.12 | 88.95 | 88.39 |
| BoolQ | 88.25 | 88.17 | 76.23 | 77.1 |
| MBPP | 41 | 39.68 | 26.32 | 25.13 |
| QuAC | 48.61 | 47.43 | 40.92 | 40.16 |
| LAMBADA | 73.18 | 73.39 | 67.74 | 67.8 |
| NaturalQuestions | 27.67 | 27.21 | 16.69 | 17.42 |
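For context on the "Model Size (GB)" row, here is a back-of-the-envelope sketch of where the 4-bit sizes come from, assuming a GPTQ-style layout with one fp16 scale and one packed 4-bit zero point per group of 32 weights (the per-group storage widths are our assumption, not something the card states):

```python
def quantized_size_gb(n_params, w_bits=4, group=32, scale_bits=16, zero_bits=4):
    """Rough on-disk size (decimal GB) of a group-quantized checkpoint.

    Besides the packed low-bit weights, each group of `group` weights
    stores one scale and one zero point.
    """
    bits_per_weight = w_bits + (scale_bits + zero_bits) / group
    return n_params * bits_per_weight / 8 / 1e9

# Parameter counts recovered from the fp16 sizes above (size_gb / 2 bytes per param):
print(quantized_size_gb(34.4e9))  # ~19.9 GB, matching the Yi-34B 4-bit column
print(quantized_size_gb(6.06e9))  # ~3.5 GB; the reported 4.04 GB presumably
                                  # also counts layers kept in fp16 (e.g. embeddings)
```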
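The AVG row appears to be the plain unweighted mean of the 21 detailed task scores; a quick sanity check against the Yi-34B fp16 column:

```python
# The 21 detailed Yi-34B fp16 scores, top to bottom of the table
yi_34b_fp16 = [76.32, 83.65, 84.42, 61.77, 82.8, 67.24, 25.6, 54.3, 78.68,
               82.86, 74.46, 83.64, 91.6, 83.37, 81.52, 92.46, 88.25, 41,
               48.61, 73.18, 27.67]
print(round(sum(yi_34b_fp16) / len(yi_34b_fp16), 2))  # 70.64, matching AVG
```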