Gemma 4 E2B Mobile Actions, 200-Sample Fine-Tune

Main model card: google/gemma-4-E2B-it

This repository provides an experimental Gemma 4 E2B checkpoint fine-tuned on google/mobile-actions for mobile action function calling.

The model was trained locally on an Apple M1 Pro with LoRA, then merged back into the base model for direct Transformers inference.

What It Does

The model converts natural language mobile assistant requests into tool calls, for example:

<|tool_call>call:show_map{query:<|"|>Patisserie Valerie at 208 Kensington High Street, London, W8 7RG<|"|>}<tool|>
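The example above suggests a simple shape: an opening `<|tool_call>` marker, `call:` plus the function name, a brace-delimited list of `key:value` pairs with string values wrapped in `<|"|>`, and a closing `<tool|>` marker. A minimal parsing sketch follows, inferred from this single example only. `parse_tool_call` is a hypothetical helper, not part of this repository, and non-string argument values are not handled.

```python
import re

# Pattern inferred from the single example in this card; not an official grammar.
CALL_RE = re.compile(r"<\|tool_call>call:(\w+)\{(.*?)\}<tool\|>", re.DOTALL)
# String arguments appear as key:<|"|>value<|"|>
ARG_RE = re.compile(r'(\w+):<\|"\|>(.*?)<\|"\|>', re.DOTALL)


def parse_tool_call(text):
    """Return (function_name, arguments) for the first tool call in text, or None."""
    match = CALL_RE.search(text)
    if match is None:
        return None
    name, body = match.group(1), match.group(2)
    args = {key: value for key, value in ARG_RE.findall(body)}
    return name, args


example = (
    '<|tool_call>call:show_map{query:<|"|>Patisserie Valerie at 208 Kensington '
    'High Street, London, W8 7RG<|"|>}<tool|>'
)
print(parse_tool_call(example))
```

Returning `None` for unparseable text lets a caller treat malformed output as a format failure instead of raising.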

Supported functions come from the google/mobile-actions dataset, including:

create_calendar_event
create_contact
show_map
open_wifi_settings
send_email
turn_on_flashlight
turn_off_flashlight

Performance

Evaluation was run on 200 held-out examples from google/mobile-actions (metadata == "eval").

| Metric | Score |
| --- | --- |
| Format valid rate | 94.0% |
| Function name accuracy | 94.0% |
| Required arguments present | 92.0% |
| Exact match | 85.0% |
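This card does not spell out how the four metrics are defined. One plausible reading, sketched below, is that an output counts as format-valid if it parses as a tool call, name-correct if the function name matches the reference, required-present if every required argument appears, and an exact match if the arguments equal the reference. The `score` function, its input layout, and the toy data (including the `send_email` argument names) are assumptions for illustration, not the evaluation code used for the table.

```python
def score(predictions, references, required_args):
    """Compute four rates (in percent) over paired predictions and references.

    predictions:   list of (name, args) tuples, or None when the output did not parse
    references:    list of (name, args) tuples holding the expected call
    required_args: dict mapping function name -> list of required argument names
    """
    counts = {"format_valid": 0, "name_correct": 0, "required_present": 0, "exact_match": 0}
    for pred, ref in zip(predictions, references):
        if pred is None:
            continue  # unparseable output fails every metric
        counts["format_valid"] += 1
        pred_name, pred_args = pred
        ref_name, ref_args = ref
        if pred_name != ref_name:
            continue  # wrong function: argument checks are skipped
        counts["name_correct"] += 1
        if all(key in pred_args for key in required_args.get(ref_name, [])):
            counts["required_present"] += 1
        if pred_args == ref_args:
            counts["exact_match"] += 1
    total = len(references)
    return {metric: 100.0 * count / total for metric, count in counts.items()}


# Toy data, invented for this sketch.
required = {"show_map": ["query"], "send_email": ["to", "subject"], "turn_on_flashlight": []}
references = [
    ("show_map", {"query": "Cafe"}),
    ("turn_on_flashlight", {}),
    ("send_email", {"to": "a@b.c", "subject": "Hi"}),
    ("show_map", {"query": "Museum"}),
]
predictions = [
    ("show_map", {"query": "Cafe"}),
    None,
    ("send_email", {"to": "a@b.c"}),
    ("show_map", {"query": "museum"}),
]
result = score(predictions, references, required)
print(result)
```

Under this reading each metric is at most as high as the one before it, which matches the ordering in the table.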

Per-Function Results

| Function | N | Format | Name | Required | Exact |
| --- | --- | --- | --- | --- | --- |
| `create_contact` | 43 | 88.4% | 88.4% | 88.4% | 86.0% |
| `create_calendar_event` | 42 | 90.5% | 90.5% | 81.0% | 78.6% |
| `show_map` | 38 | 97.4% | 97.4% | 97.4% | 68.4% |
| `open_wifi_settings` | 20 | 90.0% | 90.0% | 90.0% | 90.0% |
| `send_email` | 20 | 100.0% | 100.0% | 100.0% | 95.0% |
| `turn_on_flashlight` | 20 | 100.0% | 100.0% | 100.0% | 100.0% |
| `turn_off_flashlight` | 17 | 100.0% | 100.0% | 100.0% | 100.0% |
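As a consistency check, the per-function rows can be aggregated back into the overall scores. The sketch below recovers success counts from the rounded percentages and sums them; the variable names are illustrative, and the numbers are copied from the table above.

```python
# (function, N, format %, name %, required %, exact %) copied from the table above
rows = [
    ("create_contact", 43, 88.4, 88.4, 88.4, 86.0),
    ("create_calendar_event", 42, 90.5, 90.5, 81.0, 78.6),
    ("show_map", 38, 97.4, 97.4, 97.4, 68.4),
    ("open_wifi_settings", 20, 90.0, 90.0, 90.0, 90.0),
    ("send_email", 20, 100.0, 100.0, 100.0, 95.0),
    ("turn_on_flashlight", 20, 100.0, 100.0, 100.0, 100.0),
    ("turn_off_flashlight", 17, 100.0, 100.0, 100.0, 100.0),
]
metrics = ["format", "name", "required", "exact"]
total = sum(row[1] for row in rows)

overall = {}
for column, metric in enumerate(metrics, start=2):
    # Recover per-function success counts from the rounded percentages.
    successes = sum(round(row[1] * row[column] / 100) for row in rows)
    overall[metric] = round(100 * successes / total, 1)

print(total, overall)
# 200 {'format': 94.0, 'name': 94.0, 'required': 92.0, 'exact': 85.0}
```

The counts sum to 200 examples, and the aggregated rates agree with the overall table. Most of the gap between the required-arguments rate and the exact-match rate comes from `show_map`, where 37 of 38 calls carry the required argument but only 26 match exactly.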

Training

| Setting | Value |
| --- | --- |
| Base model | `google/gemma-4-E2B-it` |
| Dataset | `google/mobile-actions` |
| Training samples | 200 |
| Epochs | 1 |
| LoRA rank / alpha | 16 / 32 |
| Batch / accumulation | 1 / 16 |
| Max sequence length | 1024 |
| Precision | bfloat16 |
| Hardware | Apple M1 Pro, MPS |
| Training time | 35m47s including merge |
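A few derived quantities follow from these settings. The sketch below works them out, assuming the batch and accumulation values multiply into an effective batch size in the usual way and that the LoRA update is scaled by alpha divided by rank (the standard LoRA formulation). The step count is an upper bound, since trainers differ in how they handle a final partial accumulation group.

```python
import math

train_samples = 200
epochs = 1
per_device_batch = 1
grad_accum = 16
lora_rank, lora_alpha = 16, 32

# Samples contributing to each optimizer step.
effective_batch = per_device_batch * grad_accum
# Upper bound on optimizer steps; the last group holds only 8 samples.
steps = math.ceil(train_samples * epochs / effective_batch)
# Factor applied to the low-rank update in standard LoRA.
lora_scale = lora_alpha / lora_rank

print(effective_batch, steps, lora_scale)
# 16 13 2.0
```

With roughly a dozen optimizer steps in total, this run is a very light adaptation of the base model.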

The adapter was trained with TRL SFTTrainer and PEFT LoRA. The final weights in this repository are the merged model weights.
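Conceptually, merging a LoRA adapter folds the low-rank update into each adapted weight matrix: W' = W + (alpha / rank) * B A. The toy example below illustrates that formula with plain Python lists. It is a sketch of the idea, not the code used for this checkpoint; in PEFT the equivalent step is typically done with `merge_and_unload()`, and `matmul` and `merge_lora` here are hypothetical helpers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy shapes: a 2x3 weight with a rank-1 adapter (B is 2x1, A is 1x3).
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
lora_a = [[1.0, 2.0, 3.0]]
lora_b = [[0.5], [-0.5]]
merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)
print(merged)
# [[2.0, 2.0, 3.0], [-1.0, -1.0, -3.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the checkpoint can be loaded with plain Transformers.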

Usage

import torch
from transformers import AutoModelForCausalLM, AutoProcessor, pipeline

model_id = "ClarkBear/gemma4-e2b-mobile-actions-200"

processor = AutoProcessor.from_pretrained(model_id)
tokenizer = processor.tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)

tools = [{
    "type": "function",
    "function": {
        "name": "show_map",
        "description": "Shows a location on the map.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You are a mobile assistant that calls tools."},
    {"role": "user", "content": "Show me Patisserie Valerie on Kensington High Street."},
]

prompt = processor.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)

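# Greedy decoding; return_full_text=False prints only the generated text, not the prompt.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator(prompt, max_new_tokens=160, do_sample=False, return_full_text=False)[0]["generated_text"])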

Examples

This repository includes runnable examples:

python examples/run_transformers.py \
  --model-id ClarkBear/gemma4-e2b-mobile-actions-200 \
  --prompt "Turn on the flashlight"

Example prompts are available in examples/prompts.jsonl.

Limitations

This is an experimental small-data fine-tune.

It was trained on only 200 examples.
It may emit extra tool calls after the first valid tool call.
show_map exact match is sensitive to address string formatting.
Calendar events remain harder than simple device actions because they require title and datetime extraction.
This checkpoint has not been converted to the LiteRT-LM format (.litertlm).
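Two of these limitations can be softened in post-processing. The sketch below keeps only the first complete tool call from a generation and compares address strings after light normalization. Both helpers are hypothetical, the markers are taken from the example output in this card, and the normalization rules (lowercasing, dropping punctuation, collapsing whitespace) are one arbitrary choice among many.

```python
import re

# Markers taken from the example output in this card.
OPEN = "<|tool_call>"
CLOSE = "<tool|>"


def first_tool_call(text):
    """Return the text of the first complete tool call, or None if there is none."""
    start = text.find(OPEN)
    if start == -1:
        return None
    end = text.find(CLOSE, start)
    if end == -1:
        return None
    return text[start:end + len(CLOSE)]


def normalize_address(value):
    """Lowercase, drop punctuation, and collapse whitespace for a lenient comparison."""
    value = re.sub(r"[^\w\s]", " ", value.lower())
    return " ".join(value.split())


call_one = "<|tool_call>call:turn_on_flashlight{}<tool|>"
call_two = "<|tool_call>call:open_wifi_settings{}<tool|>"
print(first_tool_call(call_one + call_two))
print(normalize_address("208 Kensington High Street, London, W8 7RG"))
```

A lenient comparison like this would report a higher score than the strict exact match in the results table, so the two should not be mixed when comparing runs.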

Fine-Tune and Compile Notes

This repository contains a merged Hugging Face Transformers checkpoint. To deploy on-device with LiteRT-LM, an additional conversion and packaging step is required.

The local training scripts and evaluation code are maintained in the companion project used to create this checkpoint.

Safetensors: model size 5B params, tensor type BF16

Model tree for ClarkBear/gemma4-e2b-mobile-actions-200

Base model: google/gemma-4-E2B
Finetuned: google/gemma-4-E2B-it