mdma-gemma4-26b-dsl-unsloth-v1
A fine-tuned derivative of
unsloth/gemma-4-26B-A4B-it
(an Unsloth mirror of Google's gemma-4-26b-a4b-it, identical weights)
that turns a compact MDMA-IL DSL intent into a valid MDMA
(Markdown Document with Mounted Applications) YAML document.
You send a one-line-per-component DSL description as the user message; the model
emits the corresponding MDMA component(s) as YAML inside a ```mdma fence.
The MDMA format, the MDMA-IL DSL, and the system prompt are designed and maintained by Mobile Reality at github.com/MobileReality/mdma.
This repository contains the merged 16-bit weights. Training data and the training/serving configuration are not included.
License & attribution
This is a modified derivative of Google's Gemma 4 (26B-A4B), distributed under the Apache License 2.0. The base model is © Google.
Modifications: domain-specific supervised fine-tuning (LoRA) for MDMA-IL DSL → MDMA YAML generation, merged into the base weights.
"Gemma" is a trademark of Google. This is an independent derivative and is not affiliated with, sponsored by, or endorsed by Google.
The contract
The model was trained on a fixed contract. To reproduce its behavior you must:
- Send the system prompt below, verbatim (a different system prompt is out-of-distribution and degrades quality).
- Send MDMA-IL DSL (not natural language) as the user message.
- Disable the thinking channel: enable_thinking: false.
Natural-language input is not the contract — it is out of scope by design and will not reliably produce MDMA.
Input: MDMA-IL DSL
One component per line:
<type>#<id>[<field>, <field>, ...](<prop>, <prop>, ...)
field = <name>[*][^]:<typecode>[{opt1|opt2|...}]
* = required ^ = sensitive (PII)
typecode: t text · n number · e email · d date · s select
c checkbox · ta textarea · f file
{a|b} = options for a select field
props = text="..." | action=<id> | variant=<name>
@ctx: <free text> # optional
@lang: pl # optional, default en
Component types (8): form · button · tasklist · table · callout · approval-gate · webhook · chart.
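To make the grammar above concrete, here is a minimal sketch of a parser for a single DSL component line. It is not part of the MDMA tooling: the function names `parse_field` and `parse_component` and the returned dict layout are hypothetical, and the regular expressions assume no `]` appears inside quoted prop values and no comma appears inside select options.

```python
import re

# Typecodes as listed in the grammar above.
TYPECODES = {
    "t": "text", "n": "number", "e": "email", "d": "date",
    "s": "select", "c": "checkbox", "ta": "textarea", "f": "file",
}

# <type>#<id>[fields](props); both bracketed groups are optional.
COMPONENT_RE = re.compile(
    r"^(?P<type>[a-z-]+)#(?P<id>[\w-]+)"
    r"(?:\[(?P<fields>.*)\])?"
    r"(?:\((?P<props>.*)\))?$"
)
# <name>[*][^]:<typecode>[{opt1|opt2|...}]
FIELD_RE = re.compile(
    r"^(?P<name>[\w-]+)(?P<req>\*)?(?P<sens>\^)?:(?P<code>[a-z]+)"
    r"(?:\{(?P<opts>[^}]*)\})?$"
)
# key=value or key="quoted value"
PROP_RE = re.compile(r'(\w+)=(?:"([^"]*)"|([^,\s]+))')


def parse_field(spec: str) -> dict:
    """Parse one field spec such as 'email*^:e' (hypothetical helper)."""
    m = FIELD_RE.match(spec.strip())
    if m is None or m["code"] not in TYPECODES:
        raise ValueError(f"bad field spec: {spec!r}")
    return {
        "name": m["name"],
        "required": m["req"] is not None,
        "sensitive": m["sens"] is not None,
        "type": TYPECODES[m["code"]],
        "options": m["opts"].split("|") if m["opts"] else [],
    }


def parse_component(line: str) -> dict:
    """Parse one '<type>#<id>[...](...)' line (hypothetical helper)."""
    m = COMPONENT_RE.match(line.strip())
    if m is None:
        raise ValueError(f"bad component line: {line!r}")
    fields = [parse_field(f) for f in m["fields"].split(",")] if m["fields"] else []
    props = {key: quoted or bare
             for key, quoted, bare in PROP_RE.findall(m["props"] or "")}
    return {"type": m["type"], "id": m["id"], "fields": fields, "props": props}


parsed = parse_component(
    "form#signup[email*^:e, role*:s{admin|user}](action=create-account)"
)
print(parsed["type"], parsed["id"], parsed["props"])
```

A parser like this is mainly useful for rejecting malformed DSL on the client before a request is sent; the model itself consumes the raw DSL text.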
Constraints the model expects:
- one component per line;
- at most one interactive component per intent (a form submits via its own action=, so do not also add a submit button);
- data the DSL omits (chart series, table rows) is filled in by the model.
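The constraints above can be checked on the client before an intent is sent. The following is a rough sketch, not part of the MDMA tooling: `lint_intent` is a hypothetical name, the set of interactive types is taken from the system prompt in this document, and lines starting with `@` are assumed to be `@ctx` / `@lang` directives.

```python
import re

# Interactive types as enumerated in this document's system prompt;
# the remaining three types are display-only.
INTERACTIVE = {"form", "button", "tasklist", "approval-gate", "webhook"}
KNOWN_TYPES = INTERACTIVE | {"table", "callout", "chart"}

# A component head looks like <type>#<id>.
HEAD_RE = re.compile(
    r"(?<![\w-])(?:" + "|".join(sorted(KNOWN_TYPES)) + r")#[\w-]+"
)


def lint_intent(intent: str) -> list:
    """Return a list of human-readable problems; empty means the intent looks fine."""
    problems = []
    interactive_count = 0
    for number, raw in enumerate(intent.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("@"):
            continue  # blank line or @ctx / @lang directive
        ctype = line.split("#", 1)[0]
        if ctype not in KNOWN_TYPES:
            problems.append(f"line {number}: unknown component type {ctype!r}")
            continue
        # Ignore quoted prop values so text="form#x" is not counted as a head.
        unquoted = re.sub(r'"[^"]*"', "", line)
        if len(HEAD_RE.findall(unquoted)) > 1:
            problems.append(f"line {number}: more than one component on a line")
        if ctype in INTERACTIVE:
            interactive_count += 1
    if interactive_count > 1:
        problems.append(
            f"intent has {interactive_count} interactive components; at most one is allowed"
        )
    return problems


print(lint_intent('form#a[x:t](action=go)\nbutton#b(text="Go", action=go)'))
```

This only covers the structural constraints; it does not validate field specs or props.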
Examples:
form#signup[email*^:e, role*:s{admin|user}](action=create-account)
button#submit(text="Submit", action=do-submit, variant=primary)
callout#notice(variant=warning)
chart#sales(variant=bar)
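When the DSL is produced by application code rather than written by hand, the lines can be assembled from structured data. The sketch below reproduces the first two examples above; `dsl_field` and `dsl_component` are hypothetical helper names, and quoting only the `text` prop is an assumption drawn from the `text="..."` form in the grammar.

```python
# Long field type names mapped to the DSL typecodes from the grammar.
CODES = {
    "text": "t", "number": "n", "email": "e", "date": "d",
    "select": "s", "checkbox": "c", "textarea": "ta", "file": "f",
}


def dsl_field(name, type_, required=False, sensitive=False, options=()):
    """Build one field spec, e.g. 'email*^:e' (hypothetical helper)."""
    spec = name
    spec += "*" if required else ""
    spec += "^" if sensitive else ""
    spec += ":" + CODES[type_]
    if options:
        spec += "{" + "|".join(options) + "}"
    return spec


def dsl_component(type_, id_, fields=(), **props):
    """Build one '<type>#<id>[...](...)' line (hypothetical helper)."""
    line = f"{type_}#{id_}"
    if fields:
        line += "[" + ", ".join(fields) + "]"
    if props:
        parts = [
            f'{key}="{value}"' if key == "text" else f"{key}={value}"
            for key, value in props.items()
        ]
        line += "(" + ", ".join(parts) + ")"
    return line


signup = dsl_component(
    "form", "signup",
    fields=[
        dsl_field("email", "email", required=True, sensitive=True),
        dsl_field("role", "select", required=True, options=["admin", "user"]),
    ],
    action="create-account",
)
print(signup)
```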
System prompt (send verbatim)
The MDMA format and this system prompt are designed and maintained at github.com/MobileReality/mdma — that repository is the canonical source. Send the prompt below byte-for-byte:
You generate MDMA (Markdown Document with Mounted Applications) documents. Output ONLY valid MDMA YAML inside ```mdma code fences — no other prose and no outer markdown fence.
Each ```mdma block defines exactly ONE component as top-level YAML keys (type, id, ...). Never wrap a single component in a "components:" array.
Your entire response must contain AT MOST ONE interactive component (form, button, tasklist, approval-gate, or webhook). A form is submitted by its own "onSubmit" — NEVER add a separate submit button or an approval-gate beside it. Non-interactive components (callout, table, chart) may accompany it. Define an action's target component before anything that references it (no backward references).
Every component requires "id" and "type". "type" is one of: form, button, tasklist, table, callout, approval-gate, webhook, chart.
Component rules:
- form: requires "onSubmit: <action-id>" (a string). "fields" is a list; each field needs "name", "type", "label". Field "type" is one of: text, number, email, date, select, checkbox, textarea, file. A "select" field requires "options" (list of {label, value}). Mark every PII field (email, phone, name, address, SSN, date-of-birth, etc.) with "sensitive: true".
- button: requires "text" and "onAction: <action-id>".
- tasklist: "items" is a list of {id, text}.
- table: "columns" is a list of {key, header}; "data" is an array of row objects.
- callout: requires "content" (string); "variant" is one of info, warning, error, success.
- approval-gate: requires "title".
- webhook: requires "url" and "trigger: <action-id>".
- chart: use "label" for the title (never "title"); "data: |" is a CSV multiline string whose first line is comma-separated headers and following lines are comma-separated values; "variant" is one of line, bar, area, pie.
Never use a bare "action" key. Forms use "onSubmit", buttons use "onAction", webhooks use "trigger".
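The component rules in the system prompt above can also be checked on the client after the model's YAML has been parsed. The following is a minimal sketch over an already-parsed component (a plain dict); the function name `check_component` is hypothetical, the YAML parsing step is left out, and the sample dicts are hand-written illustrations, not model output.

```python
TYPES = {"form", "button", "tasklist", "table", "callout",
         "approval-gate", "webhook", "chart"}
# Keys the system prompt marks as required, per component type.
REQUIRED_KEYS = {
    "form": ("onSubmit",),
    "button": ("text", "onAction"),
    "callout": ("content",),
    "approval-gate": ("title",),
    "webhook": ("url", "trigger"),
}
FIELD_TYPES = {"text", "number", "email", "date", "select",
               "checkbox", "textarea", "file"}
VARIANTS = {
    "callout": {"info", "warning", "error", "success"},
    "chart": {"line", "bar", "area", "pie"},
}


def check_component(component: dict) -> list:
    """Return rule violations for one parsed component; empty means none found."""
    errors = [f'missing "{k}"' for k in ("id", "type") if k not in component]
    ctype = component.get("type")
    if ctype not in TYPES:
        return errors + [f"unknown type: {ctype!r}"]
    errors += [f'{ctype}: missing "{k}"'
               for k in REQUIRED_KEYS.get(ctype, ()) if k not in component]
    if "action" in component:
        errors.append('bare "action" key is not allowed')
    if ctype == "chart" and "title" in component:
        errors.append('chart: use "label", not "title"')
    variant = component.get("variant")
    if ctype in VARIANTS and variant is not None and variant not in VARIANTS[ctype]:
        errors.append(f"{ctype}: unknown variant {variant!r}")
    if ctype == "form":
        for field in component.get("fields", []):
            name = field.get("name", "?")
            errors += [f'field {name}: missing "{k}"'
                       for k in ("name", "type", "label") if k not in field]
            if "type" in field and field["type"] not in FIELD_TYPES:
                errors.append(f"field {name}: unknown type {field['type']!r}")
            if field.get("type") == "select" and not field.get("options"):
                errors.append(f'field {name}: select requires "options"')
    return errors


# Hand-written sample that violates two button rules.
print(check_component({"type": "button", "id": "go", "text": "Go", "action": "go"}))
```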
Usage
OpenAI-compatible serving (e.g. vLLM). Send the system prompt verbatim, the DSL
as the user message, and enable_thinking: false:
from openai import OpenAI

# Client for the local OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="-")

SYSTEM = "<the system prompt above, verbatim>"

resp = client.chat.completions.create(
    model="mdma-gemma4-26b-dsl-unsloth-v1",
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user",
         "content": "form#contact[full-name*:t, email*^:e, message*:ta](action=contact-submitted)"},
    ],
    temperature=0,  # greedy decoding for reproducible output
    max_tokens=1024,
    # Disable the thinking channel, as the contract requires.
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(resp.choices[0].message.content)
The response content is the MDMA document (YAML inside a ```mdma fence).
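Since the response wraps each component in an mdma fence, the YAML has to be pulled out of the fences before it can be parsed. A minimal sketch using only the standard library follows; `extract_mdma_blocks` is a hypothetical helper, and the sample reply is hand-written for illustration, not actual model output.

```python
import re

# Build the fence marker programmatically to keep this example easy to embed.
FENCE = "`" * 3

# Match an opening mdma fence, capture the body lazily, stop at the closing fence.
MDMA_BLOCK_RE = re.compile(
    re.escape(FENCE) + r"mdma[ \t]*\r?\n(.*?)\r?\n" + re.escape(FENCE),
    re.DOTALL,
)


def extract_mdma_blocks(content: str) -> list:
    """Return the YAML body of every mdma-fenced block, in order."""
    return MDMA_BLOCK_RE.findall(content)


# Hand-written sample reply containing a single callout component.
reply = "\n".join([
    FENCE + "mdma",
    "type: callout",
    "id: notice",
    "variant: warning",
    "content: Check your input.",
    FENCE,
])

for block in extract_mdma_blocks(reply):
    print(block)
```

Each extracted string can then be handed to a YAML parser of your choice; by the system prompt's rules, every block holds exactly one component.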
Two ways to use it
The model supports two modes:
- DSL → MDMA (deterministic). With the system prompt above and an MDMA-IL DSL user message, it emits MDMA YAML. Use greedy decoding (temperature: 0) for reproducible, deterministic output; this is the contract the holdout evals gate.
- Agentic / conversational (temperature: 1). Served with tool calling (vLLM --enable-auto-tool-choice --tool-call-parser gemma4 --reasoning-parser gemma4) and the agent system prompt, the model holds a natural-language conversation and calls a generate_mdma tool to render UI on demand. This path is sampled (temperature: 1) and is non-deterministic. The agent prompt and tooling live at github.com/MobileReality/mdma.
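The two modes differ in their sampling settings, which can be kept in one place on the client. This is only a sketch: `request_options` and the mode names `"dsl"` and `"agent"` are hypothetical, and the agent system prompt and the `generate_mdma` tool definition are deliberately left out because they live in the upstream repository.

```python
def request_options(mode: str) -> dict:
    """Return keyword arguments for chat.completions.create for the given mode."""
    if mode == "dsl":
        # Deterministic contract: greedy decoding, thinking channel disabled.
        return {
            "temperature": 0,
            "extra_body": {"chat_template_kwargs": {"enable_thinking": False}},
        }
    if mode == "agent":
        # Conversational path: sampled decoding. The tools and the agent
        # system prompt must be supplied separately by the caller.
        return {"temperature": 1}
    raise ValueError(f"unknown mode: {mode!r}")


print(request_options("dsl"))
```

The returned dict is meant to be unpacked into the client call, for example `client.chat.completions.create(model=..., messages=..., **request_options("dsl"))`.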
Intended use & limitations
Specialized for the two modes above. Note that the strict DSL contract (verbatim v3 prompt + DSL input) is distinct from the conversational mode (agent prompt + natural language + tools); each expects its own system prompt. Outside these uses, the model behaves as the base Gemma 4 and inherits its capabilities and limitations.