File size: 7,417 Bytes
40927c8 e0110ac 7c210e5 d8fd1db e0110ac e9cc961 c4eade6 e9cc961 a033654 d8fd1db 002a533 e0110ac 0aa06b3 215ecb7 64674b4 d8ed2fe 40927c8 64674b4 54384f9 d8fd1db 002a533 54384f9 0aa06b3 215ecb7 64674b4 693d098 64674b4 54384f9 4347362 d8fd1db 4347362 dfe4b31 4347362 d8fd1db 215ecb7 dfe4b31 ec897e6 d8fd1db ec897e6 d8fd1db ec897e6 568e672 ec897e6 568e672 57a1889 ec897e6 568e672 57a1889 568e672 57a1889 568e672 57a1889 568e672 57a1889 568e672 57a1889 568e672 57a1889 568e672 57a1889 568e672 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 |
---
license: apache-2.0
---
# Model Card for MediaTek Research Breeze-7B-FC-v1_0
## 🏆 Performance
| Models | #Parameters | Organization | License | 🧰 Function Calling? | 💬 Instrustion Following? |
|--------------------------------------------------------------------------------------------|-------------|------------|------------|-------------------|----------|
| [Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0)| 7B | MediaTek Research | Apache 2.0 | ❌ | ✅ |
| [**Breeze-7B-FC-v1_0**](https://huggingface.co/MediaTek-Research/Breeze-7B-FC-v1_0) | 7B | MediaTek Research | Apache 2.0 | ✅ | ✅ |
| [Gorilla-OpenFunctions-v2](https://huggingface.co/MediaTek-Research/Breeze-7B-FC-v1_0) | 7B | Gorilla LLM | Apache 2.0 | ✅ | ❌ |
| [GPT-3.5-Turbo-0125](https://openai.com) | | OpenAI | Proprietary| ✅ | ✅ |
**Evaluate function calling on EN benchmark**
Berkeley function-calling leaderboard
| Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
|-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
| **Breeze-7B-FC-v1_0 (FC)** | 86.89 | 76.25 | 90.00 | 93.00 | 84.00 | 84.00 | 100.00 | 92.00 | 88.00 | 77.50 |
| Gorilla-OpenFunctions-v2 (FC) | 85.95 | 60.00 | 94.25 | 95.50 | 86.50 | 86.00 | 97.00 | 96.00 | 80.00 | 75.00 |
| GPT-3.5-Turbo-0125 (FC) | 72.77 | 4.58 | 87.75 | 90.50 | 88.50 | 82.50 | 91.00 | 82.00 | 78.00 | 52.50 |
![](misc/radar_chart_en.png)
**Evaluate function calling on ZHTW benchmark**
function-calling-leaderboard-for-zhtw
| Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
|-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
| **Breeze-7B-FC-v1_0 (FC)** | 78.18 | 72.50 | 82.00 | 86.00 | 76.50|67.00|88.00|88.00|80.00|60.00|
| Gorilla-OpenFunctions-v2 (FC) | 75.68 | 53.75 | 84.75 | 86.50 | 72.50 | 68.00 | 92.00 | 92.00 | 62.00 | 72.50 |
| GPT-3.5-Turbo-0125 (FC) | 66.15 | 7.50 | 83.75 | 83.50 | 73.00 | 65.50 | 88.00 | 84.00 | 72.00 | 40.00 |
![](misc/radar_chart_zhtw.png)
**Evaluate instrustion following on EN benchmark**
MT-Bench
| | Win | Tie | Lose |
|---|---|---|---|
| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 27 (16.9%) | 63 (39.4%) | 70 (43.8%) |
**Evaluate instrustion following on ZHTW benchmark**
MT-Bench-TC
| | Win | Tie | Lose |
|---|---|---|---|
| **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 40 (25.0%) | 69 (43.1%) | 51 (31.9%) |
## 👩💻 How to use
**Dependiency**
Install `mtkresearch` package
```
git clone https://github.com/mtkresearch/mtkresearch.git
cd mtkresearch
pip install -e .
```
**Hosting by VLLM**
```python
from vllm import LLM, SamplingParams
llm = LLM(
model='MediaTek-Research/Breeze-7B-FC-v1_0',
tensor_parallel_size=num_gpu, # number of gpus
gpu_memory_utilization=0.7
)
instance_end_token_id = llm.get_tokenizer().convert_token_to_ids('<|im_end|>')
params = SamplingParams(
temperature=0.01,
top_p=0.01,
max_tokens=4096,
repetition_penalty=1.1,
stop_token_ids=[instance_end_token_id]
)
def _inference(prompt, llm, params):
return llm.generate(prompt, params)[0].outputs[0].text
```
**Instruction following**
```python
from mtkresearch.llm.prompt import MRPromptV2
sys_prompt = 'You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.'
prompt_engine = MRPromptV2()
conversations = [
{"role": "system", "content": sys_prompt},
{"role": "user", "content": "請問什麼是深度學習?"},
]
prompt = prompt_engine.get_prompt(conversations)
output_str = _inference(prompt, llm, params)
result = prompt_engine.parse_generated_str(output_str)
print(result)
# {'role': 'assistant',
# 'content': '深度學習(Deep Learning)是一種機器學習方法,它模仿人類大腦的神經網路結構來處理複雜的數據和任務。在深度學習中,模型由多層人工神經元組成,每個神經元之間有權重連接,並通過非線性轉換進行計算。這些層與層之間的相互作用使模型能夠學習複雜的函數關係或模式,從而解決各種問題,如圖像識別、自然語言理解、語音辨識等。深度學習通常需要大量的數據和強大的計算能力,因此經常使用圖形處理器(GPU)或特殊的加速器來執行。'}
```
**Function Calling**
```python
import json
from mtkresearch.llm.prompt import MRPromptV2
functions = [
{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
},
"unit": {
"type": "string",
"enum": ["celsius", "fahrenheit"]
}
},
"required": ["location"]
}
}
]
def faked_get_current_weather(location, unit=None):
return {'temperature': 30}
mapping = {
'get_current_weather': faked_get_current_weather
}
prompt_engine = MRPromptV2()
# stage 1: query
conversations = [
{"role": "user", "content": "台北目前溫度是攝氏幾度?"},
]
prompt = prompt_engine.get_prompt(conversations, functions=functions)
output_str = _inference(prompt, llm, params)
result = prompt_engine.parse_generated_str(output_str)
print(result)
# {'role': 'assistant',
# 'tool_calls': [
# {'id': 'call_U9bYCBRAbF639uUqfwehwSbw', 'type': 'function',
# 'function': {'name': 'get_current_weather', 'arguments': '{"location": "台北, 台灣", "unit": "攝氏"}'}}]}
# stage 2: execute called functions
conversations.append(result)
tool_call = result['tool_calls'][0]
func_name = tool_call['function']['name']
func = mapping[func_name]
arguments = json.loads(tool_call['function']['arguments'])
called_result = func(**arguments)
# stage 3: put executed results
conversations.append(
{
'role': 'tool',
'tool_call_id': tool_call['id'],
'name': func_name,
'content': json.dumps(called_result)
}
)
prompt = prompt_engine.get_prompt(conversations, functions=functions)
output_str2 = _inference(prompt, llm, params)
result2 = prompt_engine.parse_generated_str(output_str2)
print(result2)
# {'role': 'assistant', 'content': '台北目前的溫度是攝氏30度。'}
```
|