Update README.md
README.md CHANGED
@@ -27,7 +27,7 @@ During the alignment phase, we initially trained our model using 1 million sampl
 
 We have evaluated Nanbeige2-8B-Chat's general question-answering capabilities and human preference alignments on several popular benchmark datasets. The model has achieved notable results in single-turn English QA ([AlpacaEval 2.0](https://tatsu-lab.github.io/alpaca_eval/)), single-turn Chinese QA ([AlignBench](https://github.com/THUDM/AlignBench)), and multi-turn English QA ([MT-Bench](https://arxiv.org/abs/2306.05685)).
 
-| AlpacaEval 2.0 | AlignBench | MT-Bench |
+| AlpacaEval 2.0 (LC Win Rate / Win Rate) | AlignBench | MT-Bench |
 |:--------------:|:----------:|:--------:|
 | 43.0%/40.4% | 7.62 | 8.60 |
 