Update README.md
README.md
CHANGED
@@ -17,7 +17,7 @@ license: apache-2.0
 
 **Evaluate function calling on EN benchmark**
 
-Berkeley function-calling leaderboard
+[Berkeley function-calling leaderboard](https://gorilla.cs.berkeley.edu/blogs/8_berkeley_function_calling_leaderboard.html)
 
 | Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
 |-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
@@ -31,7 +31,7 @@ Berkeley function-calling leaderboard
 
 **Evaluate function calling on ZHTW benchmark**
 
-function-calling-leaderboard-for-zhtw
+[function-calling-leaderboard-for-zhtw](https://github.com/mtkresearch/function-calling-leaderboard-for-zhtw)
 
 | Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
 |-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
@@ -50,7 +50,7 @@ MT-Bench
 
 | | Win | Tie | Lose |
 |---|---|---|---|
-| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 |
+| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 29 (18.1%) | 55 (34.3%) | 76 (47.5%) |
 
 
 **Evaluate instrustion following on ZHTW benchmark**
@@ -59,7 +59,7 @@ MT-Bench-TC
 
 | | Win | Tie | Lose |
 |---|---|---|---|
-| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 |
+| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 35 (21.9%) | 73 (45.6%) | 52 (32.5%) |
 
 
 ## 👩‍💻 How to use