ToolGrad 4B

ToolGrad 4B is a fine-tuned version of google/gemma-3-4b-it optimized for function calling and tool-use tasks. It is trained on the dataset generated using the method described in our paper ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients" (ACL 2026 Finding). The codebase is available at our GitHub Repository.

Model Details

  • Developed by: Zhongyi Zhou
  • Model Type: Causal Language Model
  • Base Model: google/gemma-3-4b-it
  • License: gemma-terms-of-use

Intended Use

Single-turn tool-use tasks.

Evaluation Results

Evaluated on the Berkeley Function Calling Leaderboard (BFCL) v1 & v2:

Overall & Hallucination Scores

Model Non-live Live Halluc.
Overall Overall Rel. Irrel.
Gemma-3 4B 61.12% 60.84% 53.94% 100.00%
ToolGrad 4B 72.46% ↑ 65.58% ↑ 93.75% 59.27%

Detailed Category Scores

Model Non-live Live
Simple Multi Par MultiPar Simple Multi Par MultiPar
Gemma-3 4B 64.50% 88.00% 56.00% 36.00% 70.93% 59.35% 25.00% 41.67%
ToolGrad 4B 65.33% ↑ 86.50% ↓ 73.00% ↑ 65.00% ↑ 71.32% ↑ 64.86% ↑ 43.75% ↑ 50.00% ↑

How to Get Started

You can load this model using the transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "zhongyi-zhou/toolgrad-4b"

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

Citation

If you find this work helpful, please cite our paper:

@misc{zhou2026toolgradefficienttoolusedataset,
      title={ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients"}, 
      author={Zhongyi Zhou and Kohei Uehara and Haoyu Zhang and Jingtao Zhou and Lin Gu and Ruofei Du and Zheng Xu and Tatsuya Harada},
      year={2026},
      eprint={2508.04086},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.04086}, 
}
Downloads last month
29
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for zhongyi-zhou/toolgrad-4b

Finetuned
(712)
this model
Quantizations
1 model

Collection including zhongyi-zhou/toolgrad-4b

Paper for zhongyi-zhou/toolgrad-4b