ToolGrad 12B

ToolGrad 12B is a fine-tuned version of google/gemma-3-12b-it optimized for function calling and tool-use tasks. It was trained on the dataset generated with the method described in our paper, ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients" (ACL 2026 Findings). The codebase is available in our GitHub repository.

Model Details

  • Developed by: Zhongyi Zhou
  • Model Type: Causal Language Model
  • Base Model: google/gemma-3-12b-it
  • License: gemma-terms-of-use

Intended Use

This model is intended for single-turn tool-use tasks.
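
In a single-turn setting, the model receives one user request together with a set of tool definitions and is expected to answer with the call(s) to make. The sketch below shows one way to check such a call before executing it. It rests on assumptions that this card does not state: the tool `get_weather` is hypothetical, and the call is assumed to be a JSON object with `name` and `arguments` keys, which may differ from the format ToolGrad 12B actually emits.

```python
import json

# Hypothetical tool definition in the JSON-schema style used by many
# function-calling APIs. It is not taken from the ToolGrad dataset.
GET_WEATHER = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# Minimal mapping from JSON-schema type names to Python types.
PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def validate_call(raw, tools):
    """Return a list of problems in a JSON-encoded tool call (empty if valid)."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(call, dict):
        return ["call must be a JSON object"]

    # Assumed call shape: {"name": ..., "arguments": {...}}.
    tool = next((t for t in tools if t["name"] == call.get("name")), None)
    if tool is None:
        return [f"unknown tool: {call.get('name')!r}"]
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    schema = tool["parameters"]
    problems = [
        f"missing required argument: {name}"
        for name in schema.get("required", [])
        if name not in args
    ]
    for name, value in args.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = PY_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            problems.append(f"wrong type for argument: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"value not allowed for argument: {name}")
    return problems


if __name__ == "__main__":
    sample = '{"name": "get_weather", "arguments": {"city": "Tokyo"}}'
    print(validate_call(sample, [GET_WEATHER]))
```

Checking the call against the schema before dispatching it keeps malformed or hallucinated calls from reaching real tools, whatever output format the model uses.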

Evaluation Results

Evaluated on the Berkeley Function Calling Leaderboard (BFCL) v1 & v2:

Overall & Hallucination Scores

| Model | Non-live Overall | Live Overall | Halluc. Rel. | Halluc. Irrel. |
|---|---|---|---|---|
| Gemma-3 12B | 79.44% | 74.24% | 70.29% | 93.75% |
| ToolGrad 12B | 87.81% ↑ | 78.46% ↑ | 93.75% | 59.27% |
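
To make the size of each change easier to read, the following sketch computes the percentage-point difference between the two models from the scores in the table above. The dictionary keys are illustrative names chosen here; "Rel." and "Irrel." are read as BFCL's relevance and irrelevance detection categories.

```python
# Scores copied from the "Overall & Hallucination Scores" table (percent).
# The key names are illustrative and do not come from BFCL itself.
SCORES = {
    "Gemma-3 12B": {
        "non_live_overall": 79.44,
        "live_overall": 74.24,
        "relevance": 70.29,
        "irrelevance": 93.75,
    },
    "ToolGrad 12B": {
        "non_live_overall": 87.81,
        "live_overall": 78.46,
        "relevance": 93.75,
        "irrelevance": 59.27,
    },
}


def deltas(base, tuned):
    """Percentage-point change from base to tuned, rounded to 2 decimals."""
    return {metric: round(tuned[metric] - base[metric], 2) for metric in base}


DELTAS = deltas(SCORES["Gemma-3 12B"], SCORES["ToolGrad 12B"])

if __name__ == "__main__":
    for metric, delta in DELTAS.items():
        print(f"{metric:>18}: {delta:+.2f} pp")
```

By these numbers, ToolGrad 12B gains 8.37 and 4.22 points on the non-live and live overall scores and 23.46 points on relevance, while losing 34.48 points on irrelevance, so it is more likely than the base model to issue a call when no provided tool fits the request.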

Detailed Category Scores

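| Model | Non-live Simple | Non-live Multiple | Non-live Parallel | Non-live Parallel Multiple | Live Simple | Live Multiple | Live Parallel | Live Parallel Multiple |
|---|---|---|---|---|---|---|---|---|
| Gemma-3 12B | 76.25% | 94.00% | 91.00% | 56.50% | 85.66% | 71.89% | 87.50% | 45.83% |
| ToolGrad 12B | 75.25% ↓ | 94.00% ↑ | 93.50% ↑ | 88.50% ↑ | 85.66% ↑ | 77.11% ↑ | 75.00% ↓ | 62.50% ↑ |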

How to Get Started

You can load this model using the transformers library:

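import torch
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "zhongyi-zhou/toolgrad-12b"

# The processor handles tokenization and chat-template formatting.
processor = AutoProcessor.from_pretrained(model_id)

# Load the weights in bfloat16; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)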

Citation

If you find this work helpful, please cite our paper:

@misc{zhou2026toolgradefficienttoolusedataset,
      title={ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients"}, 
      author={Zhongyi Zhou and Kohei Uehara and Haoyu Zhang and Jingtao Zhou and Lin Gu and Ruofei Du and Zheng Xu and Tatsuya Harada},
      year={2026},
      eprint={2508.04086},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.04086}, 
}