qwen2.5-0.5b-funccall

A fine-tuned Qwen2.5-0.5B-Instruct that takes a user query plus a set of available tool/function schemas and outputs the function call(s) as clean, parseable JSON, with no prose and no markdown fences. It is intended as a cheap "router" model: given a natural-language request and a list of tools, it picks a tool and fills in the arguments, so you don't need to call a much larger model on every turn. Accuracy has not yet been measured (see Evaluation status).

Model details

Base model: unsloth/Qwen2.5-0.5B-Instruct
Method: LoRA fine-tuning via Unsloth, merged into the base weights
Training data: Salesforce/xlam-function-calling-60k — 60k function-calling examples, each verified through format checking, real function execution, and semantic verification
Task framing: system message lists available tools as JSON → user message is the natural-language query → assistant response is a JSON array of {"name": ..., "arguments": ...} objects, and only that
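
As a rough illustration of this framing, the sketch below turns one dataset-style record into the three chat messages. It assumes the record exposes `query`, `tools`, and `answers` fields, with the latter two stored as JSON-encoded strings; the `build_messages` helper and the instruction text are hypothetical and are not taken from the actual training script.

```python
import json


def build_messages(record, instructions):
    """Turn one dataset-style record into system/user/assistant chat messages."""
    tools = json.loads(record["tools"])      # assumption: JSON-encoded string
    answers = json.loads(record["answers"])  # assumption: JSON-encoded string
    system = f"{instructions}\n\nAvailable tools:\n{json.dumps(tools, indent=2)}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": record["query"]},
        # The target is the bare JSON array and nothing else.
        {"role": "assistant", "content": json.dumps(answers)},
    ]


# Hypothetical record, shaped like the assumption described above.
record = {
    "query": "What's the weather like in Harare right now?",
    "tools": json.dumps([{"name": "get_weather", "parameters": {"location": {"type": "string"}}}]),
    "answers": json.dumps([{"name": "get_weather", "arguments": {"location": "Harare"}}]),
}
messages = build_messages(record, "Respond with ONLY a JSON array of function calls.")
```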

Intended use

Drop-in tool/function router for agent loops, CLI dispatchers, or any system that needs to map a user request to a structured function call without paying for a large general-purpose model on every request.
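
A minimal sketch of what such a dispatcher could look like on the receiving side: it parses the model's JSON array and calls matching Python functions. The `REGISTRY` mapping, the `dispatch` helper, and the stub `get_weather` implementation are all hypothetical names introduced for illustration.

```python
import json


def get_weather(location):
    # Hypothetical tool implementation used only for this example.
    return f"(weather for {location})"


# Maps tool names (as advertised to the model) to Python callables.
REGISTRY = {"get_weather": get_weather}


def dispatch(raw_output, registry):
    """Parse the model's JSON array and run each call against the registry."""
    calls = json.loads(raw_output)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON array of calls")
    results = []
    for call in calls:
        func = registry.get(call["name"])
        if func is None:
            raise KeyError(f"unknown tool: {call['name']}")
        results.append(func(**call["arguments"]))
    return results


results = dispatch(
    '[{"name": "get_weather", "arguments": {"location": "Harare"}}]', REGISTRY
)
```

In a real loop, the string passed to `dispatch` would be the decoded model output rather than a literal.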

Why this model

Salesforce's own xLAM-1b-fc-r already showed that a sub-2B model can place competitively on the Berkeley Function-Calling Leaderboard (BFCL), outperforming several much larger general-purpose models. This model explores the same idea at an even smaller scale (0.5B), using Unsloth for fast LoRA fine-tuning.

Evaluation status

Not yet evaluated. Internal exact-match scoring (function name + argument match on a held-out split of the training data) is in progress. The model has not yet been benchmarked against Salesforce/xLAM-1b-fc-r or against larger zero-shot baselines (e.g. Qwen2.5-7B-Instruct) on the real BFCL harness. The table below will be filled in once those runs are complete; until then, treat any claim that this model matches or beats larger models as unverified.

Model	BFCL category	Accuracy
`nakue/qwen2.5-0.5b-funccall` (this model)	`simple`, `multiple`	pending
`Qwen2.5-7B-Instruct` (zero-shot)	`simple`, `multiple`	pending
`Salesforce/xLAM-1b-fc-r`	`simple`, `multiple`	pending
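
For readers who want to reproduce the internal check, exact-match scoring on function name and arguments might be sketched as follows. This is an assumption about the metric, not the scoring script used for this model: here unparseable output counts as a miss, argument key order is ignored, and the order of calls within the array is ignored as well.

```python
import json


def _normalize(calls):
    # Canonical JSON per call, sorted so that call order does not matter.
    return sorted(
        json.dumps({"name": c["name"], "arguments": c["arguments"]}, sort_keys=True)
        for c in calls
    )


def exact_match(prediction, reference):
    """True if the predicted JSON string matches the reference list of calls."""
    try:
        pred = json.loads(prediction)
    except json.JSONDecodeError:
        return False  # prose or malformed JSON is scored as a miss
    if not isinstance(pred, list):
        return False
    try:
        return _normalize(pred) == _normalize(reference)
    except (KeyError, TypeError):
        return False  # items missing 'name'/'arguments' or not objects


def accuracy(predictions, references):
    hits = sum(exact_match(p, r) for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical predictions and references, for illustration only.
preds = ['[{"name": "get_weather", "arguments": {"location": "Harare"}}]', "not json"]
refs = [
    [{"name": "get_weather", "arguments": {"location": "Harare"}}],
    [{"name": "get_time", "arguments": {}}],
]
score = accuracy(preds, refs)
```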

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch, json

model_id = "nakue/qwen2.5-0.5b-funccall"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }
]

system_msg = (
    "You are a function-calling assistant. Given a user query and a list of "
    "available tools, respond with ONLY a JSON array of the function call(s) "
    "needed to fulfill the query. Each item must have 'name' and 'arguments' "
    "keys. Do not include any explanation, markdown formatting, or text other "
    f"than the raw JSON array.\n\nAvailable tools:\n{json.dumps(tools, indent=2)}"
)

messages = [
    {"role": "system", "content": system_msg},
    {"role": "user", "content": "What's the weather like in Harare right now?"},
]

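# return_dict=True also returns the attention mask, which generate() uses.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=False, pad_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)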
print(response)
# Expected: [{"name": "get_weather", "arguments": {"location": "Harare"}}]
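
Before acting on the output, it may be worth checking it against the tool schemas that were passed in. The sketch below is one way to do that with the standard library only; `validate_calls` is a hypothetical helper, and it checks only tool names, required arguments, and unexpected arguments, not argument types.

```python
import json


def validate_calls(raw_output, tools):
    """Return a list of problems; an empty list means the output looks usable."""
    try:
        calls = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(calls, list):
        return ["top-level value is not a JSON array"]

    schemas = {t["name"]: t.get("parameters", {}) for t in tools}
    problems = []
    for i, call in enumerate(calls):
        if (
            not isinstance(call, dict)
            or not isinstance(call.get("name"), str)
            or not isinstance(call.get("arguments"), dict)
        ):
            problems.append(f"call {i}: needs a string 'name' and an object 'arguments'")
            continue
        schema = schemas.get(call["name"])
        if schema is None:
            problems.append(f"call {i}: unknown tool {call['name']!r}")
            continue
        for key in schema.get("required", []):
            if key not in call["arguments"]:
                problems.append(f"call {i}: missing required argument {key!r}")
        for key in call["arguments"]:
            if key not in schema.get("properties", {}):
                problems.append(f"call {i}: unexpected argument {key!r}")
    return problems


# Same schema shape as the usage example; repeated so the block runs on its own.
tools = [
    {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    }
]
ok = validate_calls('[{"name": "get_weather", "arguments": {"location": "Harare"}}]', tools)
bad = validate_calls('[{"name": "get_weather", "arguments": {}}]', tools)
```

A caller could retry, or fall back to a larger model, whenever the returned list is non-empty.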

Limitations

Trained only on simple and multiple-style single-turn function calling. Not trained or tested on multi-turn conversations, parallel calls, or queries with no matching tool (irrelevance detection).
Output is sensitive to tool-schema formatting; large or unusual schemas outside the training distribution may degrade reliability.
Evaluation against published baselines is pending — see above.

Training details

Fine-tuned with Unsloth's LoRA implementation (r=16, lora_alpha=16, targeting the attention and MLP projection layers) for 2 epochs with a cosine LR schedule. The xlam-function-calling-60k data was split so that held-out examples are never used for training: 500 examples reserved for test, 300 for validation, and the remainder used for training.
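
A split with these sizes could be produced along the following lines. This is only a sketch: the seed, the shuffling strategy, and the `split_indices` helper are assumptions, and the row count of exactly 60,000 is inferred from the dataset name rather than stated above.

```python
import random


def split_indices(n_examples, n_test=500, n_val=300, seed=0):
    """Shuffle row indices once and carve out disjoint test/val/train sets."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


# Assumption: the dataset has exactly 60,000 rows.
train_idx, val_idx, test_idx = split_indices(60_000)
```

Splitting on indices before any formatting or tokenization keeps the test rows out of every later training step.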

Citation

If you use this model, please also cite the underlying dataset:

@misc{xlam,
  title={xLAM: A Family of Large Action Models to Empower AI Agent Systems},
  author={Salesforce AI Research},
  year={2024}
}
