FabGemma

FabGemma-12B

FabGemma-12B is a reasoning-first fine-tune of Google's Gemma 4 12B Instruct. It was fine-tuned to add agentic coding, autonomous task planning, and systematic debugging workflows on top of the base model's instruction-following capabilities.

Through supervised fine-tuning (SFT) on complex agentic traces, the model learns one key habit: it reasons and plans before it acts.


Core Highlights

  • Reasoning Behavior: Trained on complex, multi-step debugging and tool-use reasoning traces.
  • Base Architecture: google/gemma-4-12B-it (Dense Transformer).
  • Long Context: Inherits Gemma 4's native 256K token context window.
  • Efficiency First: Trained using LoRA, modifying just 2.15% (~262M parameters) of the total network. The adapters are merged directly into the final weights.

The Recipe: Dataset & Structure

FabGemma-12B was trained on 15.2 million tokens distilled directly from high-tier coding agent sessions.

  • Primary Source: Glint-Research/Fable-5-traces (4,665 total examples)
  • Targeting: Loss is selectively computed only on assistant completion tokens.
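
As a rough illustration of computing loss only on assistant tokens, the sketch below builds a label sequence in which every non-assistant position is replaced with -100, the default ignore index of PyTorch's cross-entropy loss. The function name, the token ids, and the mask are hypothetical; the actual training pipeline may have built its masks differently.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default

def mask_non_assistant(input_ids, assistant_mask):
    """Return labels that keep assistant tokens and ignore everything else."""
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [tok if is_assistant else IGNORE_INDEX
            for tok, is_assistant in zip(input_ids, assistant_mask)]

# Hypothetical token ids: three prompt tokens followed by three assistant tokens.
input_ids = [101, 7, 8, 42, 43, 2]
assistant_mask = [0, 0, 0, 1, 1, 1]
labels = mask_non_assistant(input_ids, assistant_mask)
print(labels)  # [-100, -100, -100, 42, 43, 2]
```

With labels built this way, prompt and tool-result tokens still provide context but contribute nothing to the gradient.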

Dataset Characteristics

| Attribute | Metrics & Distribution |
|---|---|
| Total Examples | 4,665 (with 100 held out for evaluation) |
| Average Sequence Length | ~3.3K tokens |
| P99 Sequence Length | ~9.2K tokens |
| Maximum Sequence Length | ~24.9K tokens |
| Behavioral Mix | 81% tool-use interactions / 19% direct text responses |

Generative Framework

The model structures its output in explicit steps. It typically wraps its reasoning in XML-style <think> tags before emitting a tool call or a direct answer:

<think>
[Step-by-step problem dissection, edge-case identification, and tool strategy]
</think>

ASSISTANT (tool call) <Tool> input={...}
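
A downstream harness will usually want to separate the reasoning from the action. The following is a minimal sketch of one way to do that with a regular expression, assuming the output contains at most one well-formed <think> block. The helper name `split_reasoning` and the sample string are hypothetical, and real outputs may deviate from this format.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Split generated text into (reasoning, remainder); reasoning is None if no block is found."""
    match = THINK_RE.search(text)
    if match is None:
        return None, text.strip()
    reasoning = match.group(1).strip()
    remainder = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, remainder

sample = "<think>\nCheck token expiry logic first.\n</think>\n\nASSISTANT (tool call) <Tool> input={...}"
reasoning, action = split_reasoning(sample)
print(reasoning)
print(action)
```

Returning `None` for the reasoning lets the caller distinguish a direct answer from an answer that had an empty reasoning block.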

Training Blueprint

Fine-tuning used Unsloth, TRL, Transformers, and PEFT with the following configuration:

LoRA Configuration

  • Rank (r): 64
  • Alpha (α): 128
  • Dropout: 0
  • Target Modules: q, k, v, o, gate, up, down
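
To make the rank and alpha values concrete, here is a small sketch of how a LoRA adapter is folded into a base weight, assuming the standard scaling of alpha / r (not the rank-stabilized variant). With r = 64 and alpha = 128 the scale is 2. The toy matrices and the helper names are hypothetical and use plain Python lists rather than real model tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_a, lora_b, r, alpha):
    """Return weight + (alpha / r) * (lora_b @ lora_a)."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # (d_out x r) @ (r x d_in) -> (d_out x d_in)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

print(128 / 64)  # scaling factor for the configuration above: 2.0

# Toy 2x2 weight with a rank-1 adapter; alpha=2, r=1 keeps the same alpha/r ratio of 2.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (r, d_in)
lora_b = [[0.5], [0.0]]    # shape (d_out, r)
merged = merge_lora(weight, lora_a, lora_b, r=1, alpha=2)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the published checkpoint loads like an ordinary model.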

Optimization Settings

  • Epochs: 2
  • Learning Rate: 1e-4 (cosine scheduler, 3% warmup)
  • Effective Batch Size: 16
  • Training Sequence Cap: 16,384 tokens
  • Precision & Optimizer: bf16 with AdamW (weight decay: 0.01)
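
For a sense of scale, the sketch below estimates the number of optimizer steps implied by these settings and evaluates a warmup-plus-cosine learning-rate curve. It assumes that all 4,565 training examples are used, that no sequence packing is applied, that warmup is linear and rounded up to a whole step, and that the rate decays to zero. None of these details are stated in the card, so the numbers are approximate.

```python
import math

def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then cosine decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

train_examples = 4665 - 100                        # 100 examples are held out
steps_per_epoch = math.ceil(train_examples / 16)   # effective batch size 16
total_steps = 2 * steps_per_epoch                  # 2 epochs
warmup_steps = math.ceil(0.03 * total_steps)       # 3% warmup, rounded up (assumption)

print(total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step, total_steps, warmup_steps, 1e-4):.2e}")
```

Under these assumptions the run is short (a few hundred optimizer steps), so the warmup phase covers only a handful of updates.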

Evaluation & Performance

Validation loss decreased from epoch 1 to epoch 2, with no signs of degradation or collapse.

  • Final Training Loss: ~0.096
  • Validation Loss (Epoch 1): 0.785
  • Validation Loss (Epoch 2): 0.756

Benchmark Comparison (100 Held-Out Coding Traces)

Compared against its own base model on 105,525 unseen response tokens, FabGemma-12B reached substantially lower loss and perplexity on agentic workflows:

| Performance Metric | Base Model (gemma-4-12B-it) | FabGemma-12B | Relative Change |
|---|---|---|---|
| Evaluation Loss | 1.580 | 0.737 | −53.4% |
| Perplexity | 4.856 | 2.089 | −57.0% |
| Mean Per-Example Loss | 1.747 | 0.760 | −56.5% |
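
The figures in this comparison are linked by simple arithmetic. Assuming perplexity is the exponential of the mean token-level cross-entropy loss (natural log), the sketch below recomputes the perplexities and the relative changes from the reported losses. The results agree with the reported values up to rounding of the inputs; the helper name `relative_change` is illustrative.

```python
import math

def relative_change(base, tuned):
    """Percentage change from base to tuned (negative means lower)."""
    return 100.0 * (tuned - base) / base

base_loss, tuned_loss = 1.580, 0.737
base_ppl, tuned_ppl = math.exp(base_loss), math.exp(tuned_loss)

print(f"perplexity: {base_ppl:.3f} -> {tuned_ppl:.3f}")
print(f"loss change: {relative_change(base_loss, tuned_loss):.1f}%")
print(f"per-example loss change: {relative_change(1.747, 0.760):.1f}%")
```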

Quickstart Implementation

You can load and run the merged checkpoint directly with Hugging Face transformers:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "naazimsnh02/FabGemma-12B"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [{"role": "user", "content":
    "USER: There's a failing test test_auth.py::test_expired_token. Investigate why and propose a fix."}]
# return_dict=True yields input_ids and attention_mask together, so generate() receives both.
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_dict=True, return_tensors="pt").to(model.device)

# Sampling settings from the original example; the repetition penalty helps avoid loops.
out = model.generate(**inputs, max_new_tokens=512, do_sample=True,
                     temperature=0.7, top_p=0.9, repetition_penalty=1.05)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))

Important Limitations

Before using this model in a production pipeline, keep these limitations in mind:

  • Specialized Focus: Performance is heavily optimized for code architecture, script execution planning, and debugging. General trivia or encyclopedic factual knowledge may not match its engineering performance.
  • Modality Constraints: This is a strictly text-to-text model. Vision and audio capabilities have not been adapted.
  • Language & Formatting: Fine-tuning data was primarily English. Output syntax remains highly dependent on user prompt structure.
  • Inherited Elements: Safety baselines, core biases, and underlying assumptions are inherited directly from the original google/gemma-4-12B-it foundation. Always vet code outputs before execution.

Provenance, Credits, & Licensing

  • Base Weights: Google Gemma Team (Gemma License)
  • Dataset Credits: Glint-Research/Fable-5-traces (AGPL-3.0)
  • Compliance Reminder: Because the training dataset is distilled from the session logs of other AI assistants, downstream users must verify that their use complies with all relevant provider terms regarding derivative model training.

Disclaimer: This model checkpoint is experimental and provided "as-is" for research, local testing, and collaborative evaluation. There are no operational warranties attached to its outputs.
