FunctionGemma 270M — LiteRT-LM (Android)

A LiteRT-LM export of google/functiongemma-270m-it, packaged for on-device function calling on Android via Google's LiteRT-LM runtime. Weights use dynamic-range int8 quantization while activations stay fp32 (`dynamic_wi8_afp32`), a scheme the LiteRT CPU backend can accelerate through XNNPACK.

The bundle ships a single .litertlm artifact containing the transformer graph, tokenizer, and chat template, plus a small config.json that documents the workflow contract this build supports. Single-turn and parallel function calls are supported; multi-step chaining and long multi-turn slot-filling are out of scope for this version.

Model


Parameters	270M
Architecture	Gemma 3 (18 layers, 4 query heads, 1 KV head, head_dim 256, hidden 640)
Quantization	`dynamic_wi8_afp32` (int8 weights, fp32 activations)
Format	LiteRT-LM `.litertlm`
Context length	32,768 (model) / 1,024 (export)
Prefill chunk sizes	128, 512, 1024
File size	~283 MB
Backend	LiteRT CPU + XNNPACK
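
As a rough sanity check on the figures in the table, the sketch below estimates the weight footprint and an upper bound for the KV cache at the export context length. The fp32 KV-cache storage is an assumption (the table only states fp32 activations), and the estimate ignores any per-layer cache limits the runtime may apply, so treat the numbers as back-of-the-envelope only.

```python
# Back-of-the-envelope sizes derived from the table above.
# Inputs are table values unless explicitly marked as assumptions.
PARAMS = 270e6           # "270M" from the table (rounded)
BYTES_PER_WEIGHT = 1     # int8 weights
LAYERS = 18
KV_HEADS = 1
HEAD_DIM = 256
EXPORT_CONTEXT = 1024    # export context length
BYTES_PER_KV_VALUE = 4   # ASSUMPTION: KV cache stored as fp32

# Weight footprint in decimal megabytes.
weight_mb = PARAMS * BYTES_PER_WEIGHT / 1e6

# K and V each hold LAYERS * KV_HEADS * HEAD_DIM values per token.
kv_bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_KV_VALUE
kv_cache_mib = kv_bytes_per_token * EXPORT_CONTEXT / 2**20

print(f"weights: ~{weight_mb:.0f} MB")
print(f"KV cache per token: {kv_bytes_per_token} bytes")
print(f"KV cache at {EXPORT_CONTEXT} tokens: {kv_cache_mib:.0f} MiB")
```

Under these assumptions the weights come to about 270 MB, which is in line with the ~283 MB file once the tokenizer, template, and container overhead are included, and the KV cache stays at or below 36 MiB for a full 1,024-token context.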

Files

File	Size	Description
`model.litertlm`	283 MB	LiteRT-LM bundle (weights + tokenizer + chat template). Pass the path to `litert_lm.Engine`.
`config.json`	~2 KB	Workflow contract — stop tokens, supported workflows, function-call output format.

Workflow contract

The export documents stop-token profiles and supported workflows in config.json. Key fields:

outputFormat: <start_function_call>call:name{arg:<escape>value<escape>}<end_function_call>
responseFormat: <start_function_response>response:name{arg:<escape>value<escape>}<end_function_response>
supportedWorkflows: single_turn_function_call, parallel_function_call
unsupportedWorkflows: multi_step_chaining, long_multi_turn_slot_filling
stopTokenProfiles.toolCall: <end_function_call>, <start_function_response>, <end_of_turn>
stopTokenProfiles.finalResponse: <end_of_turn>, <end_function_call>
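
To make the outputFormat field concrete, here is a minimal parsing sketch in plain Python. It is not the SDK's parser. The function name `parse_function_calls` is hypothetical, and the handling of comma-separated arguments and of bare (unescaped) values is an assumption that goes beyond the single-argument string shown in config.json.

```python
import json
import re

# One call block: <start_function_call>call:NAME{ARGS}<end_function_call>
CALL_RE = re.compile(
    r"<start_function_call>call:(?P<name>\w+)\{(?P<args>.*?)\}<end_function_call>",
    re.DOTALL,
)

# One argument: key:<escape>string<escape>, or (ASSUMPTION) key:bare_value
ARG_RE = re.compile(
    r"(?P<key>\w+):(?:<escape>(?P<esc>.*?)<escape>|(?P<bare>[^,]*))",
    re.DOTALL,
)


def _coerce(raw: str):
    """Interpret a bare value as JSON (numbers, booleans); else keep the text."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def parse_function_calls(text: str) -> list[dict]:
    """Return every call found in `text`, in order (covers parallel calls)."""
    calls = []
    for call in CALL_RE.finditer(text):
        arguments = {}
        for arg in ARG_RE.finditer(call.group("args")):
            if arg.group("esc") is not None:
                arguments[arg.group("key")] = arg.group("esc")
            else:
                arguments[arg.group("key")] = _coerce(arg.group("bare"))
        calls.append({"name": call.group("name"), "arguments": arguments})
    return calls


raw = (
    "<start_function_call>call:get_current_weather"
    "{location:<escape>Tokyo<escape>}<end_function_call>"
)
print(parse_function_calls(raw))
# [{'name': 'get_current_weather', 'arguments': {'location': 'Tokyo'}}]
```

Because the function returns a list, a parallel_function_call response with several call blocks yields one entry per block, while ordinary text with no call block yields an empty list.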

Verification

The bundled validation (smoke_litert.py in the speech-android SDK) runs a two-pass tool call end-to-end and confirms the parsed call:

prompt:        "What is the current weather in Tokyo?"
raw call:      <start_function_call>call:get_current_weather{location:<escape>Tokyo<escape>}<end_function_call>
parsed call:   {"name": "get_current_weather", "arguments": {"location": "Tokyo"}}
tool result:   {"location": "Tokyo", "temperature": 15, "unit": "celsius", "condition": "sunny"}
final reply:   "The current weather in Tokyo is sunny with a temperature of 15.0 degrees Celsius."
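
For illustration, the tool result in the trace above could be rendered into the responseFormat string roughly as follows. In the Python usage the runtime's chat template presumably does this for you, so the sketch only shows the shape of the string. The helper name `format_function_response` is hypothetical, and both the comma separator and the choice to leave non-string values unescaped are assumptions not confirmed by the config.json excerpt.

```python
import json


def format_function_response(name: str, result: dict) -> str:
    """Render a tool result in the documented responseFormat shape.

    ASSUMPTIONS: fields are comma-separated, strings are wrapped in
    <escape>...<escape>, and other values are written as bare JSON literals.
    """
    parts = []
    for key, value in result.items():
        if isinstance(value, str):
            parts.append(f"{key}:<escape>{value}<escape>")
        else:
            parts.append(f"{key}:{json.dumps(value)}")
    body = ",".join(parts)
    return (
        "<start_function_response>response:" + name
        + "{" + body + "}<end_function_response>"
    )


tool_result = {
    "location": "Tokyo",
    "temperature": 15,
    "unit": "celsius",
    "condition": "sunny",
}
print(format_function_response("get_current_weather", tool_result))
```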

A four-test pytest suite (test_litert.py in the speech-android SDK) also passes: model load, weather call, 300-second timer call, and two-pass with final response.

Usage

Python (via `litert_lm`)

import litert_lm

def get_current_weather(location: str, unit: str = "celsius") -> dict:
    """Gets the current weather in a given location."""
    return {"location": location, "temperature": 15, "unit": unit, "condition": "sunny"}

TOOLS = {"get_current_weather": get_current_weather}

engine = litert_lm.Engine(
    model_path="model.litertlm",
    backend=litert_lm.Backend.CPU(),
    max_num_tokens=256,
)

with engine.create_conversation(
    tools=list(TOOLS.values()),
    automatic_tool_calling=False,
) as conv:
    first  = conv.send_message("What is the current weather in Tokyo?")
    # parse `<start_function_call>...<end_function_call>` from `first`,
    # call the matching tool from TOOLS, send the result back:
    final  = conv.send_message({"role": "tool", "content": [{"type": "tool_response", ...}]})

The full driver — parsing, tool dispatch, and the two-pass loop — is published as part of the speech-android SDK.
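
Until you pull in the SDK driver, the tool-dispatch step can be approximated in plain Python. The sketch below is not the SDK's implementation: `dispatch_tool_call` and its `{"result": ...}` / `{"error": ...}` return shape are hypothetical, and it expects a parsed call in the `{"name": ..., "arguments": {...}}` form shown under Verification.

```python
from typing import Any, Callable


def get_current_weather(location: str, unit: str = "celsius") -> dict:
    """Gets the current weather in a given location (stub from the usage example)."""
    return {"location": location, "temperature": 15, "unit": unit, "condition": "sunny"}


TOOLS = {"get_current_weather": get_current_weather}


def dispatch_tool_call(call: dict, tools: dict[str, Callable[..., Any]]) -> dict:
    """Look up the named tool and invoke it with the parsed arguments.

    Returns {"result": ...} on success or {"error": ...} when the model names
    an unknown tool or supplies arguments the function does not accept.
    """
    name = call.get("name")
    fn = tools.get(name)
    if fn is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return {"result": fn(**call.get("arguments", {}))}
    except TypeError as exc:
        # Raised for missing or unexpected keyword arguments; note that it also
        # catches a TypeError raised inside the tool itself.
        return {"error": f"bad arguments for {name}: {exc}"}


parsed_call = {"name": "get_current_weather", "arguments": {"location": "Tokyo"}}
print(dispatch_tool_call(parsed_call, TOOLS))
```

Returning an error dictionary instead of raising keeps the two-pass loop alive when a small model hallucinates a tool name or argument, so the caller can decide whether to retry or report the failure.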

Android (Kotlin / Java)

Bundle model.litertlm in app/src/main/assets/ and load it with the LiteRT-LM Android runtime; see the speech-android SDK for the ready-to-use wrapper.

Source

Upstream model: google/functiongemma-270m-it — Gemma 3 270M instruction-tuned for structured function calls.
