FunctionGemma 270M CoreML Stateful INT8

CoreML ML Program export of google/functiongemma-270m-it for on-device function calling.

  • INT8 linear weights, FP32 compute
  • 256-token K/V cache stored as CoreML MLState
  • Fixed specializations for 128-token padded prefill and 1-token decoding
  • Requires iOS 18 / macOS 15 or newer

The model uses FunctionGemma's compact function-call syntax. Parse and validate the generated call against an application-side allowlist before executing any tool.
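As a minimal sketch of the allowlist check described above, the following assumes the generated call has already been parsed into a tool name and an argument dictionary (parsing the compact syntax itself is not shown). The `ALLOWED_TOOLS` table, the `get_weather` tool, and the `validate_call` helper are hypothetical names used for illustration; they are not part of this repository.

```python
from typing import Any, Dict

# Hypothetical application-side allowlist:
# tool name -> {argument name: expected Python type}
ALLOWED_TOOLS: Dict[str, Dict[str, type]] = {
    "get_weather": {"location": str},
}


def validate_call(name: str, args: Dict[str, Any]) -> None:
    """Raise ValueError unless the parsed call matches the allowlist exactly."""
    schema = ALLOWED_TOOLS.get(name)
    if schema is None:
        raise ValueError(f"tool not allowed: {name!r}")

    # Reject both missing and unexpected arguments.
    unexpected = sorted(set(args) - set(schema))
    missing = sorted(set(schema) - set(args))
    if unexpected or missing:
        raise ValueError(f"bad arguments: unexpected={unexpected}, missing={missing}")

    # Check each argument's type before the tool is ever invoked.
    for key, expected in schema.items():
        if not isinstance(args[key], expected):
            raise ValueError(f"argument {key!r} must be {expected.__name__}")
```

Calling `validate_call("get_weather", {"location": "Paris"})` returns without error, while an unknown tool name or a mistyped argument raises `ValueError` so the application can refuse to execute the call.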

Contents

  • FunctionGemmaStatefulDecoder.mlpackage: CoreML model package
  • config.json: runtime input and cache metadata
  • tokenizer.json, tokenizer_config.json, chat_template.jinja: tokenizer assets

Validation

The export has been verified end to end: a weather tool call was generated, and a second pass produced a user-facing response once the tool result was supplied.

Runtime

Use the FunctionGemmaCoreMLStateful runtime in this repository's export directory. It performs a padded 128-token prefill followed by one-token decode calls, preserving MLState between invocations.
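The control flow described above can be sketched roughly as follows. This is not the FunctionGemmaCoreMLStateful runtime; the `prefill` and `decode` callables are hypothetical stand-ins for the two model specializations, and the sketch assumes right padding, a pad token id of 0, and that only real prompt tokens count against the 256-slot cache. Check config.json and the tokenizer assets for the actual values.

```python
from typing import Callable, List, Tuple

PREFILL_LEN = 128  # from the model card: padded prefill length
CACHE_LEN = 256    # from the model card: K/V cache capacity
PAD_ID = 0         # assumption: pad token id; verify against the tokenizer assets


def pad_prompt(token_ids: List[int]) -> Tuple[List[int], List[int]]:
    """Right-pad token ids to PREFILL_LEN; return (padded ids, attention mask)."""
    n = len(token_ids)
    if n == 0 or n > PREFILL_LEN:
        raise ValueError(f"prompt must have 1..{PREFILL_LEN} tokens, got {n}")
    pad = PREFILL_LEN - n
    return token_ids + [PAD_ID] * pad, [1] * n + [0] * pad


def generate(
    token_ids: List[int],
    prefill: Callable[[List[int], List[int]], int],  # hypothetical: returns first new token
    decode: Callable[[int, int], int],               # hypothetical: (token, position) -> next token
    eos_id: int,
    max_new_tokens: int = 64,
) -> List[int]:
    """One padded prefill, then one-token decode steps until EOS or the budget runs out."""
    ids, mask = pad_prompt(token_ids)
    next_id = prefill(ids, mask)

    position = len(token_ids)  # next cache slot to be written
    budget = min(max_new_tokens, CACHE_LEN - position)

    out: List[int] = []
    while len(out) < budget and next_id != eos_id:
        out.append(next_id)
        # The real runtime would reuse the same MLState across these calls.
        next_id = decode(next_id, position)
        position += 1
    return out
```

Capping the budget at `CACHE_LEN - position` keeps every decode step inside the fixed-size cache; a prompt longer than 128 tokens is rejected here rather than truncated silently.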
