Update README.md
README.md
CHANGED
@@ -21,12 +21,21 @@ Nxcode-CQ-7B-orpo is an ORPO fine-tune of Qwen/CodeQwen1.5-7B-Chat on 100k sampl
 * Supporting 92 coding languages
 * Excellent performance in text-to-SQL, bug fix, etc.
 
-## Evalplus(https://github.com/evalplus/evalplus)
+## [Evalplus](https://github.com/evalplus/evalplus)
 
-|
+| EvalPlus | pass@1 |
 | --- | --- |
-|
-|
+| HumanEval | 86.0 |
+| HumanEval+ | 81.1 |
+
+[Evalplus Leaderboard](https://evalplus.github.io/leaderboard.html)
+| Models | HumanEval | HumanEval+|
+|------ | ------ | ------ |
+| GPT-4-Turbo (April 2024)| 90.2| 86.6|
+| GPT-4 (May 2023)| 88.4| 81.17|
+| GPT-4-Turbo (Nov 2023)| 85.4| 79.3|
+| CodeQwen1.5-7B-Chat| 83.5| 78.7|
+| claude-3-opus (Mar 2024)| 82.9| 76.8|
 
 ## Quickstart
 