Update README.md
README.md CHANGED
@@ -1,11 +1,11 @@
----
-license: apache-2.0
-language:
-- en
-- zh
-tags:
-- moe
----
+---
+license: apache-2.0
+language:
+- en
+- zh
+tags:
+- moe
+---
 # AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies
 <p align="center">
 <br>
@@ -142,6 +142,10 @@ The performance of the AquilaMoE model series improves significantly across mult
 | mmlu-ppl | 59.93 |
 | winograd-ppl | 57.5 |
 
+| Model | GPT 3.5 Turbo (11/06) | GPT 3.5 Turbo (03/01) | AquilaMoE-SFT |
+|------------------|-----------------------|-----------------------|---------------|
+| AlpacaEval 2.0 | 19.3 | 18.1 | 21.1 |
+
 *Table: Performance of AquilaMoE-SFT (16\*8B) on various benchmarks.*
 
 ## License Agreement
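For reference, the eight lines touched by the first hunk form the YAML metadata header that the Hugging Face Hub parses to populate the model card (license display, language filters, and tags). The block below is simply those added lines reassembled as they sit at the top of README.md after this commit, with descriptive comments added; the comments are editorial, not part of the committed file:

```yaml
---
license: apache-2.0   # license shown on the model page
language:             # languages the model supports, used for Hub filtering
- en
- zh
tags:                 # free-form tags; "moe" marks this as a mixture-of-experts model
- moe
---
```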