Instructions for using Bunnana/data-morph-gemma-2b with libraries and local apps.

Libraries

How to use Bunnana/data-morph-gemma-2b with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("Bunnana/data-morph-gemma-2b")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi
How to use Bunnana/data-morph-gemma-2b with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Bunnana/data-morph-gemma-2b"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Bunnana/data-morph-gemma-2b"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Bunnana/data-morph-gemma-2b with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "Bunnana/data-morph-gemma-2b"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Bunnana/data-morph-gemma-2b

Run Hermes

hermes

MLX LM

How to use Bunnana/data-morph-gemma-2b with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "Bunnana/data-morph-gemma-2b"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "Bunnana/data-morph-gemma-2b"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "Bunnana/data-morph-gemma-2b",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
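
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the server is listening on port 8080, the port used in the Pi and Hermes configuration above. The helper names `build_chat_request` and `chat` are illustrative and not part of mlx-lm.

```python
import json
import urllib.request

# Assumption: mlx_lm.server is reachable on localhost port 8080.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(content, model="Bunnana/data-morph-gemma-2b"):
    """Build the POST request for a single user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(content, timeout=120):
    """Send one user message and return the assistant's reply text."""
    request = build_chat_request(content)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    # OpenAI-style responses carry the text under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` prints the model's reply.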


data-morph-gemma-2b

A 2.0 GB local file-format–conversion model: a Gemma‑4 E2B student distilled from Claude Opus to convert between CSV, JSON, and TXT. Fine‑tuned with LoRA, then shrunk by stripping the unused vision/audio towers, pruning the vocabulary (262 k → 16 k), and quantizing to 8‑bit — 5.12 B → 2.05 B params, 9.6 GB → 2.0 GB.

This is not a general chat model. It is trained for one job: given a small metadata envelope describing a file, write a Python script that converts it. It is meant to be driven by the data-morph package, which runs the full pipeline around it.

How it works

Conversion is a five‑stage pipeline; the model never sees the full source file, only a compact metadata envelope (schema, samples, warnings):

[file] → 1. extract envelope → 3. THIS MODEL writes a Python script
       → 4. sandbox runs the script → 5. validate output → [converted file]
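
To make the idea of an envelope concrete, here is a minimal sketch of stage 1 for a CSV source. The function name `extract_envelope` and the field names are assumptions chosen to mirror "schema, samples, warnings"; the real envelope format is defined by the data-morph package and may differ.

```python
import csv
import io


def extract_envelope(csv_text, max_samples=3):
    """Summarise CSV text as a small envelope: schema, samples, warnings."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return {"format": "csv", "schema": [], "samples": [],
                "row_count": 0, "warnings": ["file is empty"]}
    header, data = rows[0], rows[1:]
    warnings = []
    if len(set(header)) != len(header):
        warnings.append("duplicate column names")
    ragged = sum(1 for row in data if len(row) != len(header))
    if ragged:
        warnings.append(f"{ragged} row(s) do not match the header width")
    return {
        "format": "csv",
        "schema": header,                 # column names only (hypothetical)
        "samples": data[:max_samples],    # a few rows, never the whole file
        "row_count": len(data),
        "warnings": warnings,
    }
```

The envelope stays small however large the file is, because only the header and the first few rows are kept.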

The model emits an <analysis>…</analysis> block followed by a <script>…</script> block. Narrowing the target from "transform a whole file" to "read metadata, write a script" is what makes a 2 B model viable, and lets the pipeline scale to arbitrary file sizes while leaving a readable, debuggable artefact (the script).
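
A caller that uses the model outside the data-morph package has to split that reply itself. The following is a minimal sketch, assuming the two blocks appear once each and in that order; `split_reply` is an illustrative name, not a function from the package.

```python
import re

# Non-greedy matches so each block ends at its first closing tag.
_BLOCKS = re.compile(
    r"<analysis>(.*?)</analysis>\s*<script>(.*?)</script>", re.DOTALL
)


def split_reply(reply):
    """Return (analysis, script) from a model reply, or raise ValueError."""
    match = _BLOCKS.search(reply)
    if match is None:
        raise ValueError("reply lacks <analysis> followed by <script>")
    analysis, script = (part.strip() for part in match.groups())
    return analysis, script
```

A generated script that itself contains the literal text `</script>` would be cut short by this pattern, so a production parser would need a stricter rule.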

Intended use

In scope: CSV↔JSON conversion, JSON flattening, nested‑JSON construction, TXT log → CSV parsing, and schema migration — the five patterns it was distilled on.
Out of scope: open‑ended chat, formats other than CSV/JSON/TXT, and adversarial or far‑out‑of‑distribution inputs (a small model can be misled; the surrounding pipeline validates output and retries, but does not guarantee success).
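
As an illustration of one in-scope pattern, JSON flattening, here is a small hand-written sketch. It shows the kind of transformation meant, not a script produced by this model; the dotted-key convention and the name `flatten` are assumptions.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty dicts; empty dicts are kept as values.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat
```

Each flattened record can then be written as one CSV row, with the dotted keys as column names.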

Usage

Use via the pip package (recommended)

pip install "data-morph-gemma[mlx]"   # Apple Silicon + MLX

from datamorph import convert_file

result = convert_file("contacts.csv", "contacts.json")
print(result.accepted, result.scores, result.output_path)

convert_file runs the full pipeline (envelope → script → sandbox → validate) with a retry‑on‑error loop, so you get a validated output file, not just raw text. This model downloads automatically on first use (cached under ~/.cache/huggingface); set GEMMA_MLX_MODEL only if you want to point at a local copy instead.
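
The retry-on-error loop can be pictured roughly as follows. This is a sketch under stated assumptions, not the package's implementation: the three callables are hypothetical stand-ins for the model call, the sandbox, and the validator, and counting retries in addition to the first attempt is an assumption.

```python
def convert_with_retry(generate_script, run_script, validate, max_retries=3):
    """Generate, run, and validate a script, feeding errors back on failure.

    All three callables are hypothetical stand-ins:
      generate_script(feedback) -> script source (feedback is None at first)
      run_script(script)        -> output, or raises on failure
      validate(output)          -> list of problems, empty if accepted
    """
    feedback = None
    for attempt in range(1 + max_retries):
        script = generate_script(feedback)
        try:
            output = run_script(script)
        except Exception as exc:  # a crash becomes feedback for the next try
            feedback = f"script failed: {exc}"
            continue
        problems = validate(output)
        if not problems:
            return {"accepted": True, "attempts": attempt + 1, "output": output}
        feedback = "; ".join(problems)
    return {"accepted": False, "attempts": 1 + max_retries, "feedback": feedback}
```

Passing the error text back into the next generation is what lets a small model correct a failed script instead of repeating it.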

Use directly with `mlx_lm`

from mlx_lm import load, generate
model, tok = load("Bunnana/data-morph-gemma-2b")
# Prompt = the script-generation instructions + the metadata envelope + the task.
# See the data-morph repo (skills/script_generation_teacher.md) for the exact contract;
# the model replies with <analysis>...</analysis><script>...</script>.

This is a text‑only build — load it with mlx_lm, not mlx_vlm.

Training

Teacher: Claude Opus + an Agent Skill, generating 800 programmatically‑verified training pairs (every pair passed format/schema/loadability/content checks before use).
Student: mlx-community/gemma-4-e2b-it-bf16, fine‑tuned with LoRA (mlx_vlm.lora, SFT, train‑on‑completions); the iter‑400 checkpoint was selected on held‑out eval.
Compression (W7): fuse the LoRA adapter → strip the vision + audio towers → prune the 262 k vocabulary to 16 k (the corpus uses ~4.5 k tokens; a tokenizer round‑trip gate guards the cut) → quantize to 8‑bit (group size 64).
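
The vocabulary-pruning step and its round-trip gate can be sketched on token ids alone. This is an assumption-laden illustration: the actual gate works on the tokenizer, and the function names, the `always_keep` parameter, and the id-level check are all hypothetical.

```python
def build_vocab_map(corpus_ids, always_keep=()):
    """Map each kept old token id to a compact new id (0..n-1)."""
    used = {tok for seq in corpus_ids for tok in seq}
    kept = sorted(set(always_keep) | used)
    return {old: new for new, old in enumerate(kept)}


def remap(seq, old_to_new):
    """Translate a sequence to the pruned vocabulary; KeyError if a token was cut."""
    return [old_to_new[tok] for tok in seq]


def round_trip_ok(corpus_ids, old_to_new):
    """Gate: every sequence must survive old -> new -> old unchanged."""
    new_to_old = {new: old for old, new in old_to_new.items()}
    try:
        return all(
            [new_to_old[t] for t in remap(seq, old_to_new)] == list(seq)
            for seq in corpus_ids
        )
    except KeyError:
        # A sequence used a token that the pruned vocabulary no longer has.
        return False
```

The embedding rows are then gathered in the same order as the kept ids, which is where the parameter saving comes from.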

Evaluation

Measured through the full pipeline on a 70‑case held‑out test set (content‑disjoint from training), scored on four metrics — Format Validity, Schema Compliance, Loadability, Content Accuracy.

Setting	Accepted (all 4 pass)	Score	vs. teacher
one‑shot	56 / 70	0.811	—
production (retry ≤ 3)	67 / 70	0.957	~96 %

The student clears the project's ≥ 80 %‑of‑teacher target on every metric.
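
The "Accepted" column counts cases where all four metrics pass. A minimal sketch of that rule follows, assuming each metric is reduced to a boolean per case; the real pipeline may score metrics numerically, and the key names are illustrative.

```python
METRICS = (
    "format_validity",
    "schema_compliance",
    "loadability",
    "content_accuracy",
)


def is_accepted(checks):
    """A case is accepted only if all four metrics pass."""
    return all(checks[name] for name in METRICS)


def acceptance_rate(cases):
    """Fraction of cases that are accepted."""
    return sum(is_accepted(case) for case in cases) / len(cases)
```

Applied to the counts in the table, this gives 56 / 70 = 0.800 for one-shot and 67 / 70 ≈ 0.957 for production. The Score column is reported separately and is not derived here.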

Model details

Architecture: gemma4_text (text‑only), 2.05 B parameters
Quantization: 8‑bit affine, group size 64
Vocabulary: 16,384 (pruned from 262 k)
Context: inherits the base model's context length
Framework: MLX (Apple Silicon)
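
For readers unfamiliar with the quantization line above, here is a plain-Python sketch of 8-bit affine quantization with group size 64: each group of 64 weights gets its own scale and bias, and each weight is stored as an integer code. It shows only the arithmetic; MLX itself works on packed tensors, and the function names here are illustrative.

```python
def quantize_group(weights, bits=8):
    """Affine-quantize one group: w ~= scale * q + bias, q in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    bias = min(weights)
    scale = (max(weights) - bias) / levels or 1.0  # avoid 0 for constant groups
    codes = [round((w - bias) / scale) for w in weights]
    return codes, scale, bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights from integer codes."""
    return [scale * q + bias for q in codes]


def quantize(weights, group_size=64, bits=8):
    """Quantize a flat weight list in independent groups of `group_size`."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]
```

The reconstruction error of any weight is at most half of its group's scale, which is why smaller groups give a tighter fit at the cost of storing more scale and bias values.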

Limitations & ethics

A small model: reliable on the five trained conversion patterns; messy but in‑pattern inputs are handled well, far‑out‑of‑distribution ones may fail.
Hallucination / data‑loss risk is mitigated — not eliminated — by the pipeline's automated format/schema validation and retries.
Teacher bias from Claude Opus can propagate to the student.
Converted files may contain personal data; run locally and do not upload user inputs.

License

This model is a derivative of Google's Gemma and is distributed under the Gemma Terms of Use. By using it you agree to those terms, which propagate to derivatives. Base model: mlx-community/gemma-4-e2b-it-bf16.
