Safetensors
qwen2
qypeng committed on
Commit
37c1c7e
1 Parent(s): 52410f0

Update README.md

Files changed (1)
  1. README.md +7 -18
README.md CHANGED
@@ -2,36 +2,33 @@
  license: cc-by-4.0
  datasets:
  - Salesforce/xlam-function-calling-60k
- - MadeAgents/XLAM-7.5k-Irrelevance
  base_model:
  - Qwen/Qwen2.5-3B-Instruct
  ---

- # Hammer2.0-3b Function Calling Model
-
  ## Introduction
  We're excited to release lightweight Hammer 2.0 models ([0.5B](https://huggingface.co/MadeAgents/Hammer2.0-0.5b), [1.5B](https://huggingface.co/MadeAgents/Hammer2.0-1.5b), [3B](https://huggingface.co/MadeAgents/Hammer2.0-3b), and [7B](https://huggingface.co/MadeAgents/Hammer2.0-7b)) with strong function calling capability, which empower developers to build personalized, on-device agentic applications.

  ## Model Details
- Hammer2.0 is finetuned from the [Qwen 2.5 series](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e) and the [Qwen 2.5 coder series](https://huggingface.co/collections/Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f) using function masking techniques. It is trained on the [APIGen Function Calling Datasets](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) containing 60,000 samples, supplemented by the [XLAM-7.5k-Irrelevance](https://huggingface.co/datasets/MadeAgents/XLAM-7.5k-Irrelevance) data we generated. Hammer2.0 achieves exceptional performance across numerous function calling benchmarks. For detailed data construction, training methods, and evaluation strategies, please refer to our paper [Hammer: Robust Function-Calling for On-Device Language Models via Function Masking](https://arxiv.org/abs/2410.04587) and the [Hammer GitHub repository](https://github.com/MadeAgents/Hammer).
-

  ## Evaluation
- The evaluation results of the Hammer 2.0 series on the Berkeley Function-Calling Leaderboard (BFCL) are presented in the following table:
  <div style="text-align: center;">
  <img src="v2_figures/bfcl.PNG" alt="overview" width="1000" style="margin: auto;">
  </div>

- In addition, we evaluated Hammer2.0 on other academic benchmarks to further demonstrate our model's generalization ability:
  <div style="text-align: center;">
  <img src="v2_figures/others.PNG" alt="overview" width="1000" style="margin: auto;">
  </div>
-
- In comparison, Hammer 2.0 outperforms models of similar size and even surpasses many larger models overall.

  ## Requirements
- The code for Hammer2.0-7b is included in the latest Hugging Face transformers, and we advise you to install `transformers>=4.37.0`.

  ## How to Use
  This is a simple example of how to use our model.
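The Requirements section above pins only `transformers`; the snippet below also needs a PyTorch install, and `device_map="auto"` relies on accelerate. Those two extras are our assumption, not stated in the README; a typical setup might be:

```shell
# transformers pin comes from the README; torch and accelerate are assumed extras
pip install "transformers>=4.37.0" torch accelerate
```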
@@ -39,18 +36,15 @@ This is a simple example of how to use our model.
  import json
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer
-
  model_name = "MadeAgents/Hammer2.0-3b"
  model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
  tokenizer = AutoTokenizer.from_pretrained(model_name)
-
  # Please use our provided instruction prompt for best performance
  TASK_INSTRUCTION = """You are a tool calling assistant. In order to complete the user's request, you need to select one or more appropriate tools from the following tools and fill in the correct values for the tool parameters. Your specific tasks are:
  1. Make one or more function/tool calls to meet the request based on the question.
  2. If none of the function can be used, point it out and refuse to answer.
  3. If the given question lacks the parameters required by the function, also point it out.
  """
-
  FORMAT_INSTRUCTION = """
  The output MUST strictly adhere to the following JSON format, and NO other text MUST be included.
  The example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please directly output an empty list '[]'
@@ -61,10 +55,8 @@ The example format is as follows. Please make sure the parameter type is correct
  ]
  ```
  """
-
  # Define the input query and available tools
  query = "Where can I find live giveaways for beta access and games? And what's the weather like in New York, US?"
-
  live_giveaways_by_type = {
  "name": "live_giveaways_by_type",
  "description": "Retrieve live giveaways from the GamerPower API based on the specified type.",
@@ -108,7 +100,6 @@ get_stock_price={
  "required": ["ticker"]
  }
  }
-
  def convert_to_format_tool(tools):
  ''''''
  if isinstance(tools, dict):
@@ -141,12 +132,10 @@ def build_prompt(task_instruction: str, format_instruction: str, tools: list, qu
  openai_format_tools = [live_giveaways_by_type, get_current_weather, get_stock_price]
  format_tools = convert_to_format_tool(openai_format_tools)
  content = build_prompt(TASK_INSTRUCTION, FORMAT_INSTRUCTION, format_tools, query)
-
  messages = [
  { 'role': 'user', 'content': content}
  ]
  inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
-
  # tokenizer.eos_token_id is the id of <|EOT|> token
  outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
  print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
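Per FORMAT_INSTRUCTION, the decoded text above should be a bare JSON list of tool calls. A minimal sketch of validating and parsing such a response follows; the sample string and its argument values are illustrative, not real model output, and the `{"name": ..., "arguments": {...}}` item shape is our assumption based on the xLAM-style format the prompt describes:

```python
import json

def parse_tool_calls(text: str):
    """Parse the model's decoded output into a list of tool-call dicts.

    Returns [] for the empty-list answer and raises ValueError when the
    output is not the JSON list the format instruction demands.
    """
    calls = json.loads(text.strip())
    if not isinstance(calls, list):
        raise ValueError("expected a JSON list of tool calls")
    for call in calls:
        if not isinstance(call, dict) or not {"name", "arguments"} <= call.keys():
            raise ValueError(f"malformed tool call: {call!r}")
    return calls

# Illustrative response for the two-part query in the example above
sample = '''[
  {"name": "live_giveaways_by_type", "arguments": {"type": "beta"}},
  {"name": "get_current_weather", "arguments": {"location": "New York, US"}}
]'''
for call in parse_tool_calls(sample):
    print(call["name"], call["arguments"])
```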
 
 license: cc-by-4.0
 datasets:
 - Salesforce/xlam-function-calling-60k
+ - MadeAgents/xlam-irrelevance-7.5k
 base_model:
 - Qwen/Qwen2.5-3B-Instruct
 ---

 ## Introduction
 We're excited to release lightweight Hammer 2.0 models ([0.5B](https://huggingface.co/MadeAgents/Hammer2.0-0.5b), [1.5B](https://huggingface.co/MadeAgents/Hammer2.0-1.5b), [3B](https://huggingface.co/MadeAgents/Hammer2.0-3b), and [7B](https://huggingface.co/MadeAgents/Hammer2.0-7b)) with strong function calling capability, which empower developers to build personalized, on-device agentic applications.

 ## Model Details
+ Hammer2.0 is finetuned from the [Qwen 2.5 series](https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e) and the [Qwen 2.5 coder series](https://huggingface.co/collections/Qwen/qwen25-coder-66eaa22e6f99801bf65b0c2f) using function masking techniques. It is trained on the [APIGen Function Calling Datasets](https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k) containing 60,000 samples, supplemented by the [xlam-irrelevance-7.5k](https://huggingface.co/datasets/MadeAgents/xlam-irrelevance-7.5k) data we generated. Hammer2.0 achieves exceptional performance across numerous function calling benchmarks. For detailed data construction, training methods, and evaluation strategies, please refer to our paper [Hammer: Robust Function-Calling for On-Device Language Models via Function Masking](https://arxiv.org/abs/2410.04587) and the [Hammer GitHub repository](https://github.com/MadeAgents/Hammer).

 ## Evaluation
+ The evaluation results of the Hammer 2.0 models on the Berkeley Function-Calling Leaderboard (BFCL-v3) are presented in the following table:
 <div style="text-align: center;">
 <img src="v2_figures/bfcl.PNG" alt="overview" width="1000" style="margin: auto;">
 </div>
+ Our Hammer 2.0 series consistently achieves the best performance at comparable scales. The 7B model outperforms most function-calling-enhanced models, and the 1.5B model also delivers surprisingly strong performance.

+ In addition, we evaluated the Hammer 2.0 models on other academic benchmarks to further demonstrate the generalization ability of our models.

 <div style="text-align: center;">
 <img src="v2_figures/others.PNG" alt="overview" width="1000" style="margin: auto;">
 </div>
+ Hammer 2.0 models showcase highly stable performance, suggesting the robustness of the Hammer 2.0 series. In contrast, the baseline approaches display varying levels of effectiveness on these benchmarks.

 ## Requirements
+ The code for the Hammer 2.0 models is included in the latest Hugging Face transformers, and we advise you to install `transformers>=4.37.0`.

 ## How to Use
 This is a simple example of how to use our model.
 
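Once the model's JSON tool calls are parsed, they can be routed to local implementations by name. A minimal dispatch sketch follows; the registry and the stub functions are hypothetical stand-ins, since real handlers would call the GamerPower, weather, and stock APIs behind the tools declared above:

```python
import json

# Hypothetical local stand-ins for two of the tools declared above
def live_giveaways_by_type(type: str = "game"):
    return f"giveaways of type {type}"

def get_current_weather(location: str):
    return f"weather for {location}"

# Maps the tool name the model emits to the callable that implements it
TOOL_REGISTRY = {
    "live_giveaways_by_type": live_giveaways_by_type,
    "get_current_weather": get_current_weather,
}

def dispatch(model_output: str):
    """Run each JSON tool call against the registry, keeping call order."""
    results = []
    for call in json.loads(model_output):
        fn = TOOL_REGISTRY.get(call["name"])
        if fn is None:
            results.append((call["name"], "error: unknown tool"))
        else:
            results.append((call["name"], fn(**call["arguments"])))
    return results

print(dispatch('[{"name": "get_current_weather", "arguments": {"location": "New York, US"}}]'))
```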