Claro 4B

Claro is a fine-tuned Gemma 3 4B Instruct that rewrites complex English at CEFR A2 (elementary) level while preserving the source's facts. It was trained on Apple Silicon with MLX (LoRA), using SFT followed by RL (GSPO) against a decomposed, mostly deterministic reward.

Format note: this is an MLX model (converted base: mlx-community/gemma-3-4b-it-bf16). It loads with mlx_lm on Apple Silicon. It is not a transformers/PyTorch checkpoint. The repo also ships the LoRA adapter under adapter/ for applying on top of the base yourself.
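For readers who want to apply the adapter to the base themselves, the following is a minimal sketch rather than a documented recipe. It assumes the adapter/ folder holds files in the layout mlx_lm expects for its adapter_path argument, and the helper name load_base_with_adapter is hypothetical. The imports are placed inside the function so the file can be opened on machines without MLX installed.

```python
import os


def load_base_with_adapter(
    repo_id="miguelconner4/claro",
    base_id="mlx-community/gemma-3-4b-it-bf16",
    adapter_subdir="adapter",
):
    """Download only the adapter files and load them on top of the MLX base.

    Assumption: the files under `adapter_subdir` are in the format that
    mlx_lm reads from `adapter_path`. This is not confirmed by the model card.
    """
    # Lazy imports: these packages are only needed when the function is called.
    from huggingface_hub import snapshot_download
    from mlx_lm import load

    # Fetch just the adapter directory instead of the full merged weights.
    local_repo = snapshot_download(repo_id, allow_patterns=[f"{adapter_subdir}/*"])
    adapter_path = os.path.join(local_repo, adapter_subdir)

    # Returns (model, tokenizer), the same pair as load() on the merged repo.
    return load(base_id, adapter_path=adapter_path)
```

Calling model, tok = load_base_with_adapter() should then slot into the usage example in the next section in place of the plain load call.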

Usage (MLX)

The model expects its chat template with the training system prompt — a raw prompt string will make it ramble. Replicate the training invocation:

from mlx_lm import load, generate

model, tok = load("miguelconner4/claro")

SYSTEM = ("Rewrite the user's text in CEFR A2 (Elementary English): short simple "
          "sentences, basic vocabulary, no idioms. Keep all important facts. "
          "Output only the rewritten text.")

complex_text = "The edifice, constructed circa 1750, was subsequently designated a historic landmark."
prompt = tok.apply_chat_template(
    [{"role": "system", "content": SYSTEM},
     {"role": "user", "content": complex_text}],
    tokenize=False, add_generation_prompt=True,
)
print(generate(model, tok, prompt=prompt, max_tokens=512, verbose=False))
# -> "The building was built around 1750. People decided it was important history."

How it was trained

  1. SFT on ~1,500 (complex → A2) paragraph pairs distilled from a frontier model over random Wikipedia paragraphs, filtered by LLM judges.

  2. RL (200 iters, group size 8, GSPO sequence-level importance sampling, KL β=0.1) from the SFT checkpoint, against a cardinal multiplicative reward:

    reward = level_band × vocab × fidelity × format_gates (each ∈ [0,1]).

    • level_band — deterministic A2 difficulty: readability (Flesch), mean sentence length, passive and subordination density, with bands calibrated to the 10th–90th percentiles of real A2 reference texts.
    • vocab — penalty for off-A2-list words, with gloss-aware exemption (defining a hard term in-line is not penalized).
    • fidelity — LLM judge, decomposed into fact-level recall + hallucination counts (not a holistic score).
    • format_gates — hard pass/fail for markdown / degenerate loops.
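The reward composition and the GSPO ingredients named above can be sketched in plain Python. This is an illustration under stated assumptions, not the training code: the linear falloff in band_score, the function names, and the eps constant are all hypothetical. The sequence ratio follows the usual GSPO definition, the length-normalized likelihood ratio of the whole response.

```python
import math


def band_score(value, low, high, falloff):
    """1.0 inside [low, high], decaying linearly to 0.0 over `falloff` outside.

    Assumption: the card only says bands are calibrated to the 10th-90th
    percentiles of A2 texts; the linear decay here is illustrative.
    """
    if low <= value <= high:
        return 1.0
    distance = (low - value) if value < low else (value - high)
    return max(0.0, 1.0 - distance / falloff)


def claro_reward(level_band, vocab, fidelity, format_gates):
    """Multiplicative reward: every factor must lie in [0, 1]."""
    factors = (level_band, vocab, fidelity, format_gates)
    if any(not 0.0 <= f <= 1.0 for f in factors):
        raise ValueError("each reward factor must be in [0, 1]")
    return level_band * vocab * fidelity * format_gates


def group_advantages(rewards, eps=1e-6):
    """Standardize rewards within one sampled group (group size 8 in training)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    return [(r - mean) / (math.sqrt(var) + eps) for r in rewards]


def sequence_ratio(new_logps, old_logps):
    """Sequence-level importance ratio: exp(mean per-token log-prob difference)."""
    diffs = [n - o for n, o in zip(new_logps, old_logps)]
    return math.exp(sum(diffs) / len(diffs))
```

Because the factors multiply, a failed format gate (0.0) zeroes the reward no matter how good the other terms are, and a weak score on any one axis cannot be bought back by the others.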

Evaluation (30 held-out Wikipedia paragraphs)

On 30 held-out Wikipedia paragraphs, Claro's rewrites land at ~70% CEFR A2 (most of the rest A1; only ~7% drift up to the harder B1), per a DeepSeek mode-of-3 CEFR classifier. That is reliably simpler than the model it was fine-tuned from, with fewer too-hard outputs. Faithfulness is preserved: source-fact recall stays ~0.98, and a strict hand-audit (counting only real contradictions and fabricated facts, not paraphrase or omission) found ~3–4 genuine errors across the 30 paragraphs. This is indistinguishable from the baseline and consistent across three independent judge families (Haiku, GPT-4o, Gemini). So the GSPO step delivered a real gain in simplicity at no measurable cost to accuracy. The few remaining errors are subtle: dropped qualifiers or reversed relations (e.g. "younger"→"older sister").
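A mode-of-3 vote like the one used for the CEFR labels can be sketched as below. The judge calls themselves are omitted, and the tie-break for three different labels (take the middle level) is an assumption, since the card does not say how such cases were resolved. Function names are hypothetical.

```python
from collections import Counter

CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]


def mode_of_three(labels):
    """Return the majority CEFR label from exactly three classifier runs."""
    if len(labels) != 3:
        raise ValueError("expected exactly three labels")
    label, count = Counter(labels).most_common(1)[0]
    if count >= 2:
        return label
    # All three runs disagree. Assumption: fall back to the middle level.
    return sorted(labels, key=CEFR_ORDER.index)[1]


def level_shares(final_labels):
    """Fraction of outputs at each CEFR level that actually occurs."""
    counts = Counter(final_labels)
    total = len(final_labels)
    return {lvl: counts[lvl] / total for lvl in CEFR_ORDER if counts[lvl]}
```

Applying mode_of_three to each paragraph's three labels and passing the results to level_shares yields the kind of per-level breakdown reported above.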

Limitations

  • MLX-only (Apple Silicon). No PyTorch/transformers weights provided.
  • Evaluated at n=30; CEFR classification is genuinely noisy at the A2/B1 boundary (judges agree with a strict reference only ~50% of the time there). Treat the numbers as ±10pp.
  • ~1 in 10 outputs carries a genuine fidelity slip (≈3–4 per 30 in our audit). The dominant mode is subtle attribute/relation errors (e.g. "younger"↔"older sister", a dropped qualifier), not wholesale fabrication.
  • English-only; tuned on encyclopedic prose. Out-of-domain text (dialogue, code, poetry) is untested.
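To get a feel for how much sampling noise n=30 carries on its own, separate from the classifier noise noted above, one option is a Wilson score interval. This was not part of the original evaluation; the count of 21 A2 outputs in the usage note is an assumed reading of "~70% of 30".

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - margin, center + margin
```

For example, wilson_interval(21, 30) gives roughly 0.52 to 0.83, so small differences between models at this sample size should be read with caution.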

License

Derivative of Google's Gemma 3; use is governed by the Gemma Terms of Use and the Gemma Prohibited Use Policy, which carry over to this model.
