Gemma 4 E2B Mobile Actions, 200-Sample Fine-Tune

Main model card: google/gemma-4-E2B-it

This repository provides an experimental Gemma 4 E2B checkpoint fine-tuned on google/mobile-actions for mobile action function calling.

The model was trained locally on an Apple M1 Pro with LoRA, then merged back into the base model for direct Transformers inference.

What It Does

The model converts natural-language mobile assistant requests into a single tool call in Gemma's tool-call syntax, for example:

<|tool_call>call:show_map{query:<|"|>Patisserie Valerie at 208 Kensington High Street, London, W8 7RG<|"|>}<tool|>
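
To act on such an output, an application has to recover the function name and arguments from the string. The following is a minimal sketch of a parser, with its grammar inferred only from the single example above. It assumes arguments are flat `key:value` pairs separated by commas, string values are wrapped in `<|"|>`, and any other value is a bare token that is returned as a string. `parse_tool_call` is a hypothetical helper, not part of Transformers or of this repository.

```python
import re

# Opening marker plus function name, as seen in the example output above.
CALL_START = re.compile(r"<\|tool_call>call:(\w+)\{")
# One argument: key, then either a <|"|>-delimited string or a bare token,
# followed by "," (more arguments) or "}" (end of the call).
ARG_RE = re.compile(
    r'\s*(\w+):(?:<\|"\|>(.*?)<\|"\|>|([^,}]*))\s*(,|\})', re.DOTALL
)


def parse_tool_call(text):
    """Return (function_name, args) for the first tool call in text, or None."""
    start = CALL_START.search(text)
    if start is None:
        return None
    name, pos, args = start.group(1), start.end(), {}
    if text.startswith("}", pos):  # call without arguments
        return name, args
    while True:
        m = ARG_RE.match(text, pos)
        if m is None:
            return None  # malformed or unterminated argument list
        key, quoted, bare, sep = m.groups()
        args[key] = quoted if quoted is not None else bare.strip()
        pos = m.end()
        if sep == "}":
            return name, args


example = (
    '<|tool_call>call:show_map{query:<|"|>Patisserie Valerie at 208 '
    'Kensington High Street, London, W8 7RG<|"|>}<tool|>'
)
print(parse_tool_call(example))
```

The parser stops at the first closing brace outside a string, so it does not depend on the exact closing marker and ignores any text that follows the first call.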

Supported functions come from the google/mobile-actions dataset, including:

  • create_calendar_event
  • create_contact
  • show_map
  • open_wifi_settings
  • send_email
  • turn_on_flashlight
  • turn_off_flashlight

Performance

Evaluation was run on 200 held-out examples from google/mobile-actions (metadata == "eval").

| Metric | Score |
| --- | --- |
| Format valid rate | 94.0% |
| Function name accuracy | 94.0% |
| Required arguments present | 92.0% |
| Exact match | 85.0% |
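
The card does not spell out how each metric is defined. The sketch below shows one plausible reading, in which each metric implies the previous one: an output must parse to count as format-valid, must name the right function to count toward name accuracy, and so on. This nesting is an assumption, although it is consistent with the scores never increasing from left to right in the per-function results. `score_example` and `aggregate` are hypothetical helpers.

```python
def score_example(pred, ref_name, ref_args, required):
    """Score one prediction.

    pred is (name, args) for a parsed tool call, or None if the output
    could not be parsed. required lists the schema's required argument names.
    """
    format_valid = pred is not None
    name_ok = format_valid and pred[0] == ref_name
    required_ok = name_ok and all(key in pred[1] for key in required)
    exact = name_ok and pred[1] == ref_args
    return {
        "format": format_valid,
        "name": name_ok,
        "required": required_ok,
        "exact": exact,
    }


def aggregate(rows):
    """Turn per-example booleans into percentages with one decimal."""
    n = len(rows)
    return {key: round(100 * sum(r[key] for r in rows) / n, 1) for key in rows[0]}


# Toy data, not taken from the evaluation set.
ref = ("show_map", {"query": "a"}, ["query"])
rows = [
    score_example(("show_map", {"query": "a"}), *ref),   # fully correct
    score_example(("show_map", {"query": "b"}), *ref),   # wrong argument value
    score_example(("send_email", {}), *ref),             # wrong function
    score_example(None, *ref),                           # unparseable output
]
print(aggregate(rows))
```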

Per-Function Results

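| Function | N | Format valid | Name accuracy | Required args | Exact match |
| --- | --- | --- | --- | --- | --- |
| create_contact | 43 | 88.4% | 88.4% | 88.4% | 86.0% |
| create_calendar_event | 42 | 90.5% | 90.5% | 81.0% | 78.6% |
| show_map | 38 | 97.4% | 97.4% | 97.4% | 68.4% |
| open_wifi_settings | 20 | 90.0% | 90.0% | 90.0% | 90.0% |
| send_email | 20 | 100.0% | 100.0% | 100.0% | 95.0% |
| turn_on_flashlight | 20 | 100.0% | 100.0% | 100.0% | 100.0% |
| turn_off_flashlight | 17 | 100.0% | 100.0% | 100.0% | 100.0% |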
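
The per-function rows can be cross-checked against the overall scores by converting each percentage back to a hit count and summing. The short sketch below does this with the numbers from the table. The variable names are illustrative only.

```python
# (N, format %, name %, required %, exact %) copied from the table above.
per_function = {
    "create_contact":        (43, 88.4, 88.4, 88.4, 86.0),
    "create_calendar_event": (42, 90.5, 90.5, 81.0, 78.6),
    "show_map":              (38, 97.4, 97.4, 97.4, 68.4),
    "open_wifi_settings":    (20, 90.0, 90.0, 90.0, 90.0),
    "send_email":            (20, 100.0, 100.0, 100.0, 95.0),
    "turn_on_flashlight":    (20, 100.0, 100.0, 100.0, 100.0),
    "turn_off_flashlight":   (17, 100.0, 100.0, 100.0, 100.0),
}
METRICS = ("format", "name", "required", "exact")


def overall(table):
    """Recover hit counts per function and return (total N, overall %)."""
    total = sum(row[0] for row in table.values())
    result = {}
    for i, metric in enumerate(METRICS, start=1):
        hits = sum(round(row[0] * row[i] / 100) for row in table.values())
        result[metric] = round(100 * hits / total, 1)
    return total, result


print(overall(per_function))
# (200, {'format': 94.0, 'name': 94.0, 'required': 92.0, 'exact': 85.0})
```

The weighted totals reproduce the overall scores. They also show that show_map contributes 12 of the 30 exact-match misses, more than any other function.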

Training

| Setting | Value |
| --- | --- |
| Base model | google/gemma-4-E2B-it |
| Dataset | google/mobile-actions |
| Training samples | 200 |
| Epochs | 1 |
| LoRA rank / alpha | 16 / 32 |
| Batch / accumulation | 1 / 16 |
| Max sequence length | 1024 |
| Precision | bfloat16 |
| Hardware | Apple M1 Pro, MPS |
| Training time | 35m47s including merge |
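
A few derived quantities follow directly from these settings. The arithmetic below is a rough sketch. How the trainer treats the final partial accumulation window is not stated in this card, so the code reports only the split between full windows and leftover samples.

```python
samples, batch_size, accumulation = 200, 1, 16

# Gradients from 16 single-example batches are accumulated per optimizer step.
effective_batch = batch_size * accumulation

# One epoch covers 12 full accumulation windows plus 8 leftover samples.
full_windows, leftover = divmod(samples, effective_batch)

# 35m47s wall-clock time includes the merge, so this is an upper bound
# on the per-sample training cost.
total_seconds = 35 * 60 + 47
seconds_per_sample = round(total_seconds / samples, 1)

print(effective_batch, full_windows, leftover, seconds_per_sample)
# 16 12 8 10.7
```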

The adapter was trained with TRL SFTTrainer and PEFT LoRA. The final weights in this repository are the merged model weights.
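
The actual training and merge scripts are not part of this card. As a hedged sketch, the LoRA settings from the table map onto PEFT roughly as shown below, and the merge can be done with `merge_and_unload`. The `target_modules="all-linear"` choice, the adapter path, and the output directory are assumptions for illustration, not the settings used for this checkpoint.

```python
# Rank and alpha come from the training table; with standard LoRA scaling
# the adapter update is multiplied by lora_alpha / r = 2.0.
LORA_KWARGS = dict(r=16, lora_alpha=32, task_type="CAUSAL_LM")


def build_lora_config(target_modules="all-linear"):
    """Build a PEFT LoraConfig; target_modules is an assumed value."""
    from peft import LoraConfig  # imported lazily so the sketch loads without PEFT

    return LoraConfig(target_modules=target_modules, **LORA_KWARGS)


def merge_adapter(base_id, adapter_dir, out_dir):
    """Fold a trained LoRA adapter into the base weights and save the result."""
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(base_id, dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(out_dir)
    return merged


# Hypothetical paths:
# merge_adapter("google/gemma-4-E2B-it", "outputs/adapter", "outputs/merged")
```

After merging, the saved directory loads like any other Transformers checkpoint and no longer needs PEFT at inference time.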

Usage

import torch
from transformers import AutoModelForCausalLM, AutoProcessor, pipeline

model_id = "ClarkBear/gemma4-e2b-mobile-actions-200"

processor = AutoProcessor.from_pretrained(model_id)
tokenizer = processor.tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)

tools = [{
    "type": "function",
    "function": {
        "name": "show_map",
        "description": "Shows a location on the map.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You are a mobile assistant that calls tools."},
    {"role": "user", "content": "Show me Patisserie Valerie on Kensington High Street."},
]

prompt = processor.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)

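# Reuse the already-loaded model and tokenizer in a text-generation pipeline.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
# Greedy decoding; return_full_text=False prints only the completion, not the prompt.
result = generator(prompt, max_new_tokens=160, do_sample=False, return_full_text=False)
print(result[0]["generated_text"])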

Examples

This repository includes runnable examples:

python examples/run_transformers.py \
  --model-id ClarkBear/gemma4-e2b-mobile-actions-200 \
  --prompt "Turn on the flashlight"

Example prompts are available in examples/prompts.jsonl.

Limitations

This is an experimental small-data fine-tune.

  • It was trained on only 200 examples.
  • It may emit extra tool calls after the first valid tool call.
  • show_map exact match is sensitive to address string formatting.
  • Calendar events remain harder than simple device actions because they require title and datetime extraction.
  • This checkpoint has not been converted to the LiteRT-LM format (.litertlm).
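
Two of these limitations can be partly worked around in post-processing. The sketch below keeps only the first tool call in a generation and compares address strings after light normalization. Both helpers are hypothetical. The normalization is a lenient alternative to strict string equality and is not claimed to be how the exact-match scores above were computed.

```python
import re

CALL_START = "<|tool_call>"
STRING_DELIM = '<|"|>'


def keep_first_tool_call(text):
    """Return the first tool call up to its closing brace, or None.

    Braces inside <|"|>-delimited strings are ignored.
    """
    start = text.find(CALL_START)
    if start == -1:
        return None
    i, in_string, depth = start + len(CALL_START), False, 0
    while i < len(text):
        if text.startswith(STRING_DELIM, i):
            in_string = not in_string
            i += len(STRING_DELIM)
            continue
        if not in_string:
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    return text[start:i + 1]
        i += 1
    return None  # the call was never closed


def normalize_address(value):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    value = re.sub(r"[^\w\s]", " ", value.lower())
    return " ".join(value.split())


a = "208 Kensington High Street, London, W8 7RG"
b = "208 kensington high street  London W8 7RG"
print(normalize_address(a) == normalize_address(b))
```

Truncating at the first call is safe only if every request in the target application maps to exactly one action, which matches the single-call example shown in this card.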

Fine-Tune and Compile Notes

This repository contains a merged Hugging Face Transformers checkpoint. To deploy on-device with LiteRT-LM, an additional conversion and packaging step is required.

The local training scripts and evaluation code are maintained in the companion project used to create this checkpoint.
