NicoNico committed
Commit a42dd8a
1 parent: 341bdd5

Update README.md

Files changed (1)
  1. README.md +54 -29
README.md CHANGED
@@ -14,34 +14,59 @@ Please refer to our [Github page](https://github.com/GreenBitAI/low_bit_llama) f
  - **Language(s) (NLP):** English
  - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)
 
- ## Few Shot Evaluation
- | Model | Yi-34B | | Yi-6B | |
- | --- | --- | --- | --- | --- |
- | Bit | 16 | 4 | 16 | 4 |
- | GroupSize | - | 32 | 8 | - | 32 | 8 |
- | Model Size (GB) | 68.79 | 19.89 | 12.12 | 4.04 |
- | AVG | 70.64 | 69.7 | 60.11 | 59.14 |
- | Detailed Evaluation | | | | | | |
- | MMLU | 76.32 | 75.42 | 63.24 | 62.09 |
- | CMMLU | 83.65 | 83.07 | 75.53 | 72.85 |
- | ARC-e | 84.42 | 84.13 | 77.23 | 76.52 |
- | ARC-c | 61.77 | 59.56 | 50.34 | 48.47 |
- | GAOKAO | 82.8 | 81.37 | 72.2 | 72.87 |
- | GSM8K | 67.24 | 63.61 | 32.52 | 28.05 |
- | HumanEval | 25.6 | 25 | 15.85 | 15.85 |
- | BBH | 54.3 | 52.3 | 42.8 | 41.47 |
- | WinoGrande | 78.68 | 78.53 | 70.63 | 71.19 |
- | PIQA | 82.86 | 82.75 | 78.56 | 79.05 |
- | SIQA | 74.46 | 73.44 | 64.53 | 64.53 |
- | HellaSwag | 83.64 | 83.02 | 74.91 | 73.27 |
- | OBQA | 91.6 | 90.8 | 85.4 | 82.6 |
- | CSQA | 83.37 | 83.05 | 76.9 | 75.43 |
- | TriviaQA | 81.52 | 80.73 | 64.85 | 61.75 |
- | SquAD | 92.46 | 91.12 | 88.95 | 88.39 |
- | BoolQ | 88.25 | 88.17 | 76.23 | 77.1 |
- | MBPP | 41 | 39.68 | 26.32 | 25.13 |
- | QUAC | 48.61 | 47.43 | 40.92 | 40.16 |
- | Lambda | 73.18 | 73.39 | 67.74 | 67.8 |
- | NaturalQuestion | 27.67 | 27.21 | 16.69 | 17.42 |
 
+ ## Few Shot Evaluation (officially evaluated by 01.AI)
+ | Model | Yi-34B FP16 | [Yi-34B 4 bit](https://huggingface.co/GreenBitAI/yi-34b-w4a16g32) | Yi-6B FP16 | [Yi-6B 4 bit](https://huggingface.co/GreenBitAI/yi-6b-w4a16g32) |
+ |----------------|-----------|----------|----------|---------|
+ | GroupSize | - | 32 | - | 32 |
+ | Model Size (GB) | 68.79 | 19.89 | 12.12 | 4.04 |
+ | AVG | 70.64 | 69.7 | 60.11 | 59.14 |
+ | **Detailed Evaluation** | | | | |
+ | MMLU | 76.32 | 75.42 | 63.24 | 62.09 |
+ | CMMLU | 83.65 | 83.07 | 75.53 | 72.85 |
+ | ARC-e | 84.42 | 84.13 | 77.23 | 76.52 |
+ | ARC-c | 61.77 | 59.56 | 50.34 | 48.47 |
+ | GAOKAO | 82.8 | 81.37 | 72.2 | 72.87 |
+ | GSM8K | 67.24 | 63.61 | 32.52 | 28.05 |
+ | HumanEval | 25.6 | 25 | 15.85 | 15.85 |
+ | BBH | 54.3 | 52.3 | 42.8 | 41.47 |
+ | WinoGrande | 78.68 | 78.53 | 70.63 | 71.19 |
+ | PIQA | 82.86 | 82.75 | 78.56 | 79.05 |
+ | SIQA | 74.46 | 73.44 | 64.53 | 64.53 |
+ | HellaSwag | 83.64 | 83.02 | 74.91 | 73.27 |
+ | OBQA | 91.6 | 90.8 | 85.4 | 82.6 |
+ | CSQA | 83.37 | 83.05 | 76.9 | 75.43 |
+ | TriviaQA | 81.52 | 80.73 | 64.85 | 61.75 |
+ | SQuAD | 92.46 | 91.12 | 88.95 | 88.39 |
+ | BoolQ | 88.25 | 88.17 | 76.23 | 77.1 |
+ | MBPP | 41 | 39.68 | 26.32 | 25.13 |
+ | QuAC | 48.61 | 47.43 | 40.92 | 40.16 |
+ | Lambda | 73.18 | 73.39 | 67.74 | 67.8 |
+ | NaturalQuestion | 27.67 | 27.21 | 16.69 | 17.42 |
 
+ ## Zero Shot Evaluation
+ | Task | Metric | Yi-6B FP16 | [Yi-6B 4 bit](https://huggingface.co/GreenBitAI/yi-6b-w4a16g32) | [Yi-34B 4 bit](https://huggingface.co/GreenBitAI/yi-34b-w4a16g32) |
+ |---------------|--------|---------|-------------|--------------|
+ | openbookqa | acc | 0.314 | 0.324 | 0.344 |
+ | | acc_norm | 0.408 | 0.42 | 0.474 |
+ | arc_challenge | acc | 0.462 | 0.4573 | 0.569 |
+ | | acc_norm | 0.504 | 0.483 | 0.5964 |
+ | hellaswag | acc | 0.553 | 0.5447 | 0.628 |
+ | | acc_norm | 0.749 | 0.7327 | 0.83 |
+ | piqa | acc | 0.777 | 0.7709 | 0.8079 |
+ | | acc_norm | 0.787 | 0.7894 | 0.828 |
+ | arc_easy | acc | 0.777 | 0.7697 | 0.835 |
+ | | acc_norm | 0.774 | 0.7659 | 0.84 |
+ | winogrande | acc | 0.707 | 0.7095 | 0.7853 |
+ | boolq | acc | 0.755 | 0.7648 | 0.886 |
+ | truthfulqa_mc | mc1 | 0.29 | 0.2729 | 0.4026 |
+ | | mc2 | 0.419 | 0.4033 | 0.5528 |
+ | anli_r1 | acc | 0.423 | 0.416 | 0.554 |
+ | anli_r2 | acc | 0.409 | 0.409 | 0.518 |
+ | anli_r3 | acc | 0.411 | 0.393 | 0.4983 |
+ | wic | acc | 0.529 | 0.545 | 0.5376 |
+ | rte | acc | 0.685 | 0.7039 | 0.7617 |
+ | record | f1 | 0.904 | 0.9011 | 0.924 |
+ | | em | 0.8962 | 0.8927 | 0.916 |
+ | Average | | 0.596 | 0.5937 | 0.6708 |
+
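
As a rough plausibility check on the "Model Size (GB)" row added above: the FP16 sizes imply roughly 34.4B and 6.1B parameters at 2 bytes per weight, and the 4-bit sizes are broadly consistent with group-wise quantization overhead. The sketch below is a back-of-the-envelope estimate only; it assumes 4-bit weights with one fp16 scale and one fp16 zero-point per group of 32 weights (the "g32" suffix in the linked repo names), which may not match the checkpoints' actual packing.

```python
# Back-of-the-envelope check of the "Model Size (GB)" row. Assumes (not
# confirmed from the checkpoints) 4-bit weights plus one fp16 scale and
# one fp16 zero-point per group of 32 weights, per the "g32" repo suffix.

def quantized_size_gb(fp16_size_gb: float, w_bits: int = 4, group_size: int = 32) -> float:
    params = fp16_size_gb * 1e9 / 2           # fp16 stores 2 bytes per weight
    overhead_bits = (16 + 16) / group_size    # amortized scale + zero-point
    return params * (w_bits + overhead_bits) / 8 / 1e9

print(f"Yi-34B 4 bit: ~{quantized_size_gb(68.79):.1f} GB (table: 19.89)")
print(f"Yi-6B  4 bit: ~{quantized_size_gb(12.12):.1f} GB (table: 4.04)")
```

This lands at roughly 21.5 GB and 3.8 GB, the right ballpark for the reported 19.89 GB and 4.04 GB; the residual gap depends on layout details such as which tensors (embeddings, lm_head) stay unquantized and how the 4-bit values are packed.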