Commit by thomas-yanxin: Update README.md

README.md CHANGED
@@ -1,8 +1,30 @@
 ---
-library_name: transformers
 license: other
-
-
--
-
+language:
+- zh
+- en
+datasets:
+- thomas-yanxin/MT-SFT-ShareGPT
 ---
+
+
+The main purpose of this model is to validate the usability of [thomas-yanxin/MT-SFT-ShareGPT](https://huggingface.co/datasets/thomas-yanxin/MT-SFT-ShareGPT), i.e., to test the claim that data quality is all you need. We found that when the data is carefully curated through better data governance, model results can improve substantially, even with SFT alone.
+
+Here are the results from our OpenCompass evaluation:
+
+| Classification | Benchmarks | Models |
+| :------------: | :--------: | :--------: |
+| | Name | XinYuan-Qwen2-7B |
+| English | MMLU | 73.72 |
+| | MMLU-Pro | / |
+| | Theorem QA | / |
+| | GPQA | / |
+| | BBH | 67.55 |
+| | IFEval (Prompt Strict-Acc.) | / |
+| | ARC-C | 91.19 |
+| Math | GSM8K | 82.94 |
+| | MATH | 41.06 |
+| Chinese | C-EVAL | 81.02 |
+| | CMMLU | 80.06 |
+| Code | MBPP | 50.6 |
+| | HumanEval | 83.99 |
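The scores in the table above can be summarized per category with a short script. This is a minimal sketch and not part of the commit: the `TABLE` string copies the reported rows, the helper name `category_means` is hypothetical, and reading `/` as "no score reported" is an assumption. Averaging across different benchmarks is only a rough convenience, since the benchmarks are not directly comparable.

```python
from collections import defaultdict

# Rows copied from the README table: category, benchmark, score.
# Assumption: "/" means that no score was reported for that benchmark.
TABLE = """
| English | MMLU | 73.72 |
| | MMLU-Pro | / |
| | Theorem QA | / |
| | GPQA | / |
| | BBH | 67.55 |
| | IFEval (Prompt Strict-Acc.) | / |
| | ARC-C | 91.19 |
| Math | GSM8K | 82.94 |
| | MATH | 41.06 |
| Chinese | C-EVAL | 81.02 |
| | CMMLU | 80.06 |
| Code | MBPP | 50.6 |
| | HumanEval | 83.99 |
"""


def category_means(table: str) -> dict:
    """Average the reported scores per category, skipping "/" entries."""
    scores = defaultdict(list)
    category = None
    for line in table.strip().splitlines():
        # Drop the outer pipes, then split the row into its three cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 3:
            continue
        if cells[0]:
            # A non-empty first cell starts a new category; blank cells inherit it.
            category = cells[0]
        if cells[2] == "/":
            continue
        scores[category].append(float(cells[2]))
    return {name: round(sum(vals) / len(vals), 2) for name, vals in scores.items()}


if __name__ == "__main__":
    for name, mean in category_means(TABLE).items():
        print(f"{name}: {mean}")
```

Rows with an empty first cell inherit the most recent category, which mirrors how the table groups benchmarks under a single label.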