Gemma 4 E2B Mobile Actions, 200-Sample Fine-Tune
Main model card: google/gemma-4-E2B-it
This repository provides an experimental Gemma 4 E2B checkpoint fine-tuned on google/mobile-actions for mobile action function calling.
The model was trained locally on an Apple M1 Pro with LoRA, then merged back into the base model for direct Transformers inference.
What It Does
The model converts natural language mobile assistant requests into tool calls, for example:
<|tool_call>call:show_map{query:<|"|>Patisserie Valerie at 208 Kensington High Street, London, W8 7RG<|"|>}<tool|>
Supported functions come from the google/mobile-actions dataset, including:
- create_calendar_event
- create_contact
- show_map
- open_wifi_settings
- send_email
- turn_on_flashlight
- turn_off_flashlight
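To consume the model's output programmatically, the tool-call string has to be turned into a function name and arguments. The sketch below is one way to do that. Its grammar is an assumption inferred from the single example above (string arguments wrapped in `<|"|>` markers, a call closed by `<tool|>`), and `parse_tool_calls` is a hypothetical helper, not part of this repository.

```python
import re

# Assumed grammar, inferred from the one example in this card:
#   <|tool_call>call:NAME{key:<|"|>value<|"|>,...}<tool|>
CALL_RE = re.compile(r"<\|tool_call>call:(\w+)\{(.*?)\}<tool\|>", re.DOTALL)
ARG_RE = re.compile(r'(\w+):<\|"\|>(.*?)<\|"\|>', re.DOTALL)


def parse_tool_calls(text):
    """Return a list of (function_name, arguments) pairs found in text."""
    calls = []
    for name, body in CALL_RE.findall(text):
        # Only string-valued arguments are recognised in this sketch.
        args = {key: value for key, value in ARG_RE.findall(body)}
        calls.append((name, args))
    return calls


example = (
    '<|tool_call>call:show_map{query:<|"|>Patisserie Valerie at 208 '
    'Kensington High Street, London, W8 7RG<|"|>}<tool|>'
)
calls = parse_tool_calls(example)
print(calls)
```

Non-string argument values (numbers, booleans, nested objects) are not handled here and would need extra patterns once their serialized form is known.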
Performance
Evaluation was run on 200 held-out examples from google/mobile-actions (metadata == "eval").
| Metric | Score |
|---|---|
| Format valid rate | 94.0% |
| Function name accuracy | 94.0% |
| Required arguments present | 92.0% |
| Exact match | 85.0% |
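The evaluation code is not included in this card, so the exact metric definitions are not confirmed. The following sketch shows one plausible reading of the four metrics for a single example. `score_prediction` and its input layout are hypothetical.

```python
def score_prediction(pred, gold, required):
    """Score one example under assumed metric definitions.

    pred: {"name": str, "args": dict}, or None if the output could not be parsed.
    gold: {"name": str, "args": dict} reference call.
    required: names of the arguments the gold function requires.
    """
    format_valid = pred is not None
    name_ok = format_valid and pred["name"] == gold["name"]
    # Required arguments only count when the right function was called.
    required_ok = name_ok and all(key in pred["args"] for key in required)
    # Exact match: same function and identical argument values.
    exact = name_ok and pred["args"] == gold["args"]
    return {
        "format_valid": format_valid,
        "name": name_ok,
        "required": required_ok,
        "exact": exact,
    }


gold = {
    "name": "show_map",
    "args": {"query": "Patisserie Valerie at 208 Kensington High Street, London, W8 7RG"},
}
pred = {
    "name": "show_map",
    "args": {"query": "Patisserie Valerie, Kensington High Street"},
}
scores = score_prediction(pred, gold, required=["query"])
print(scores)
```

Under these definitions a prediction can pass the first three checks and still miss exact match when an argument string differs from the reference.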
Per-Function Results
| Function | N | Format | Name | Required | Exact |
|---|---|---|---|---|---|
| create_contact | 43 | 88.4% | 88.4% | 88.4% | 86.0% |
| create_calendar_event | 42 | 90.5% | 90.5% | 81.0% | 78.6% |
| show_map | 38 | 97.4% | 97.4% | 97.4% | 68.4% |
| open_wifi_settings | 20 | 90.0% | 90.0% | 90.0% | 90.0% |
| send_email | 20 | 100.0% | 100.0% | 100.0% | 95.0% |
| turn_on_flashlight | 20 | 100.0% | 100.0% | 100.0% | 100.0% |
| turn_off_flashlight | 17 | 100.0% | 100.0% | 100.0% | 100.0% |
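As a consistency check, the per-function rows can be aggregated back into the overall scores. This sketch recovers hit counts by rounding each percentage times N, an assumption that holds only because the reported percentages are rounded to one decimal place.

```python
# (function, N, format %, name %, required %, exact %) copied from the per-function results.
rows = [
    ("create_contact", 43, 88.4, 88.4, 88.4, 86.0),
    ("create_calendar_event", 42, 90.5, 90.5, 81.0, 78.6),
    ("show_map", 38, 97.4, 97.4, 97.4, 68.4),
    ("open_wifi_settings", 20, 90.0, 90.0, 90.0, 90.0),
    ("send_email", 20, 100.0, 100.0, 100.0, 95.0),
    ("turn_on_flashlight", 20, 100.0, 100.0, 100.0, 100.0),
    ("turn_off_flashlight", 17, 100.0, 100.0, 100.0, 100.0),
]
metrics = ["format", "name", "required", "exact"]

total = sum(row[1] for row in rows)
overall = {}
for i, metric in enumerate(metrics):
    # Recover per-function hit counts from the rounded percentages.
    hits = sum(round(row[1] * row[2 + i] / 100) for row in rows)
    overall[metric] = round(100 * hits / total, 1)

print(total, overall)
# 200 {'format': 94.0, 'name': 94.0, 'required': 92.0, 'exact': 85.0}
```

The totals match the overall table, and show_map accounts for the largest gap between required-argument accuracy and exact match.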
Training
| Setting | Value |
|---|---|
| Base model | google/gemma-4-E2B-it |
| Dataset | google/mobile-actions |
| Training samples | 200 |
| Epochs | 1 |
| LoRA rank / alpha | 16 / 32 |
| Batch / accumulation | 1 / 16 |
| Max sequence length | 1024 |
| Precision | bfloat16 |
| Hardware | Apple M1 Pro, MPS |
| Training time | 35m47s including merge |
The adapter was trained with TRL SFTTrainer and PEFT LoRA. The final weights
in this repository are the merged model weights.
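The training settings imply a few derived quantities that help put the run in context. This is a rough sketch. It assumes a single device, that a trailing partial accumulation window still triggers an optimizer update, and the standard LoRA scaling of alpha divided by rank. None of these is stated in this card.

```python
import math

# Values from the training table.
train_samples = 200
per_device_batch = 1
grad_accum = 16
epochs = 1
lora_rank, lora_alpha = 16, 32

# Sequences contributing to each optimizer update.
effective_batch = per_device_batch * grad_accum

# Assumption: the final, partially filled accumulation window still steps.
optimizer_steps = math.ceil(train_samples / effective_batch) * epochs

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = lora_alpha / lora_rank

print(effective_batch, optimizer_steps, lora_scale)
# 16 13 2.0
```

With roughly a dozen updates in total, the run is closer to a light adaptation of the base model than a full fine-tune.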
Usage
import torch
from transformers import AutoModelForCausalLM, AutoProcessor, pipeline

model_id = "ClarkBear/gemma4-e2b-mobile-actions-200"

# The processor wraps the tokenizer and provides the chat template.
processor = AutoProcessor.from_pretrained(model_id)
tokenizer = processor.tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)

# Tool schema passed to the chat template.
tools = [{
    "type": "function",
    "function": {
        "name": "show_map",
        "description": "Shows a location on the map.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [
    {"role": "system", "content": "You are a mobile assistant that calls tools."},
    {"role": "user", "content": "Show me Patisserie Valerie on Kensington High Street."},
]

# Render the prompt as text; the pipeline tokenizes it.
prompt = processor.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True,
)

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
# Greedy decoding; generated_text includes the prompt followed by the completion.
print(generator(prompt, max_new_tokens=160, do_sample=False)[0]["generated_text"])
Examples
This repository includes runnable examples:
python examples/run_transformers.py \
--model-id ClarkBear/gemma4-e2b-mobile-actions-200 \
--prompt "Turn on the flashlight"
Example prompts are available in examples/prompts.jsonl.
Limitations
This is an experimental small-data fine-tune.
- It was trained on only 200 examples.
- It may emit extra tool calls after the first valid tool call.
- show_map exact match is sensitive to address string formatting.
- Calendar events remain harder than simple device actions because they require title and datetime extraction.
- This checkpoint is not compiled to LiteRT-LM or .litertlm.
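Two of these limitations can be softened in post-processing. The sketch below keeps only the first complete tool call and offers a lenient address comparison. Both helpers are hypothetical, and the delimiters are assumed from the example output in this card.

```python
import re

# Assumed delimiters, taken from the example output in this card.
FIRST_CALL_RE = re.compile(r"<\|tool_call>.*?<tool\|>", re.DOTALL)


def first_tool_call(generated):
    """Return the first complete tool call in the generated text, or None."""
    match = FIRST_CALL_RE.search(generated)
    return match.group(0) if match else None


def normalize_address(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


call_a = '<|tool_call>call:show_map{query:<|"|>London<|"|>}<tool|>'
call_b = "<|tool_call>call:turn_on_flashlight{}<tool|>"
kept = first_tool_call(call_a + call_b)
print(kept)

same = normalize_address("208 Kensington High Street,  London, W8 7RG") == normalize_address(
    "208 kensington high street london w8 7rg"
)
print(same)
```

Normalized comparison is more forgiving than the exact-match metric reported above, so scores computed with it are not comparable to the table.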
Fine-Tune and Compile Notes
This repository contains a merged Hugging Face Transformers checkpoint. To deploy on-device with LiteRT-LM, an additional conversion and packaging step is required.
The local training scripts and evaluation code are maintained in the companion project used to create this checkpoint.