Instructions for using DarylFranxx/functiongemma-270m-mika with libraries, inference providers, notebooks, and local apps.
- Libraries
- MLX
How to use DarylFranxx/functiongemma-270m-mika with MLX:
```python
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("DarylFranxx/functiongemma-270m-mika")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use DarylFranxx/functiongemma-270m-mika with Pi:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "DarylFranxx/functiongemma-270m-mika"
```
Configure the model in Pi
```shell
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```

Add to `~/.pi/agent/models.json`:

```json
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "DarylFranxx/functiongemma-270m-mika" }
      ]
    }
  }
}
```

Run Pi
```shell
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use DarylFranxx/functiongemma-270m-mika with Hermes Agent:
Start the MLX server
```shell
# Install MLX LM:
uv tool install mlx-lm

# Start a local OpenAI-compatible server:
mlx_lm.server --model "DarylFranxx/functiongemma-270m-mika"
```
Configure Hermes
```shell
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default DarylFranxx/functiongemma-270m-mika
```
Run Hermes
```shell
hermes
```
- MLX LM
How to use DarylFranxx/functiongemma-270m-mika with MLX LM:
Generate or start a chat session
```shell
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "DarylFranxx/functiongemma-270m-mika"
```
Run an OpenAI-compatible server
```shell
# Install MLX LM
uv tool install mlx-lm

# Start the server (listens on port 8080 by default)
mlx_lm.server --model "DarylFranxx/functiongemma-270m-mika"

# Call the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DarylFranxx/functiongemma-270m-mika",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
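The same endpoint can be called from Python. The sketch below uses only the standard library and assumes the `mlx_lm.server` instance above is running locally on its default port, 8080; the helper names are illustrative, not part of any library API.

```python
import json
import urllib.request

def build_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:8080/v1") -> str:
    """POST a single user message and return the assistant's reply text."""
    body = json.dumps(
        build_payload("DarylFranxx/functiongemma-270m-mika", prompt)
    ).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]

# With the server running:
# reply = chat("Hello")
```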
Model card metadata:

```yaml
language:
- en
- zh
license: apache-2.0
base_model: lmstudio-community/functiongemma-270m-it-MLX-bf16
tags:
- text-generation
- function-calling
- gemma3
- mlx
- bfloat16
pipeline_tag: text-generation
```
# functiongemma-270m-mika

A modified release based on lmstudio-community/functiongemma-270m-it-MLX-bf16.
## Architecture

| Parameter | Value |
|---|---|
| Architecture | Gemma3ForCausalLM |
| Parameter count | ~270M |
| Hidden size | 640 |
| Layers | 18 (mix of sliding-window and full attention) |
| Vocabulary size | 262,144 |
| Max context | 32,768 tokens |
| Precision | bfloat16 |
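The ~270M figure can be sanity-checked from the table. The head layout (4 query heads, 1 KV head, head dimension 256) and MLP intermediate size (2048) below are assumptions based on the Gemma 3 270M family and are not stated in this card; only the hidden size, layer count, and vocabulary size come from the table.

```python
# Back-of-the-envelope parameter count for a Gemma-style decoder.
hidden, layers, vocab = 640, 18, 262_144
head_dim, q_heads, kv_heads = 256, 4, 1   # assumed head layout
intermediate = 2048                       # assumed MLP width

embedding = vocab * hidden                  # tied input/output embedding
attn = (hidden * q_heads * head_dim         # query projection
        + 2 * hidden * kv_heads * head_dim  # key and value projections
        + q_heads * head_dim * hidden)      # output projection
mlp = 3 * hidden * intermediate             # gate, up, and down projections

total = embedding + layers * (attn + mlp)
print(f"{total / 1e6:.0f}M parameters")     # ~268M, consistent with ~270M
```

Note that the embedding table alone accounts for roughly 168M of the total, which is why such a small model carries a 262K-token vocabulary so visibly in its size.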
## Modifications

- ✅ Tuned the generation defaults (temperature/top_p/top_k)
- ✅ Added `generation_config.json`
- ✅ Improved the chat template to support function calling
- ✅ Raised max_length to 8192
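The first two bullets refer to the sampling defaults shipped in `generation_config.json`. A file of this general shape is what is meant; the sampling values shown here are illustrative placeholders, not the ones actually shipped with this model (only the 8192 max_length is stated above):

```json
{
  "temperature": 1.0,
  "top_p": 0.95,
  "top_k": 64,
  "max_length": 8192
}
```

Runtimes that honor this file (transformers, mlx-lm) will pick these defaults up automatically unless overridden at call time.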
## Quick Start

### MLX (recommended on Apple Silicon Macs)
```python
from mlx_lm import load, generate

model, tokenizer = load("DarylFranxx/functiongemma-270m-mika")

# Plain chat
response = generate(model, tokenizer,
                    prompt="你好,请介绍一下自己",
                    max_tokens=256)
print(response)
```
### Function Calling example
```python
from mlx_lm import load, generate

model, tokenizer = load("DarylFranxx/functiongemma-270m-mika")

tools = [
    {
        "name": "get_weather",
        "description": "Get weather information for a given city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
                "date": {"type": "string", "description": "Date, in YYYY-MM-DD format"}
            },
            "required": ["city"]
        }
    }
]

messages = [
    {"role": "user", "content": "北京今天天气怎么样?"}
]

# Format with the tokenizer's chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    tokenize=False,
    add_generation_prompt=True
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
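The raw `response` string still has to be parsed into a tool call before you can dispatch it. The exact output format depends on the chat template, so the helper below is only a best-effort sketch: it assumes the model emits a JSON object with `name` and `arguments` fields somewhere in its reply, which may not match this model's actual template.

```python
import json
import re

def extract_tool_call(text: str):
    """Best-effort: pull the first JSON object out of a model reply.

    Returns (name, arguments) on success, or None if nothing
    parseable resembling a tool call is found.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if not match:
        return None
    try:
        call = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if isinstance(call, dict) and "name" in call:
        return call["name"], call.get("arguments", {})
    return None

# Hypothetical reply shape; the real template may differ:
# extract_tool_call('{"name": "get_weather", "arguments": {"city": "北京"}}')
```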
### Transformers
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "DarylFranxx/functiongemma-270m-mika"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Notes

- Requires `transformers >= 4.57.3` for Gemma3ForCausalLM support
- The MLX format runs only on Apple Silicon (M1/M2/M3/M4)
- Use of this model must comply with the original model's Apache 2.0 license