gemma4-2b-markdown-review

Fine-tuned Gemma 4 E2B (2B) for automated Markdown document review: catches missing alt text, broken links, stale filenames, and formatting issues.

Model details

Base model: unsloth/gemma-4-E2B-it
Parameters: 2B
Fine-tuning: QLoRA (4-bit), rank 16
Training data: 168 curated examples
Eval data: 42 held-out examples
Epochs: 3
Final val loss: 0.546
Precision: bfloat16
VRAM (training): ~6-8 GB (A100)

Usage

transformers

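import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "izambasiron/gemma4-2b-markdown-review",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("izambasiron/gemma4-2b-markdown-review")

messages = [
    {"role": "system", "content": "You are a Markdown document reviewer..."},
    {"role": "user", "content": "# My Doc\n\n![img](./old.png)"}
]
# add_generation_prompt=True appends the assistant turn marker so the model replies;
# return_dict=True returns input_ids plus attention_mask for generate().
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))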
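
To review a file on disk rather than an inline string, the message list can be built by a small helper. This is a minimal sketch, not part of the model card: `build_messages` and `messages_for_file` are hypothetical names, and `SYSTEM_PROMPT` is a placeholder because the card elides the full system prompt.

```python
from __future__ import annotations

from pathlib import Path

# Placeholder: the card truncates the system prompt ("..."), so supply the real one.
SYSTEM_PROMPT = "You are a Markdown document reviewer..."


def build_messages(markdown_text: str, system_prompt: str = SYSTEM_PROMPT) -> list[dict]:
    """Return a chat-format message list for one Markdown document."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": markdown_text},
    ]


def messages_for_file(path: str) -> list[dict]:
    """Read a Markdown file as UTF-8 and wrap it in the chat format."""
    return build_messages(Path(path).read_text(encoding="utf-8"))
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of the hard-coded `messages` above.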

Ollama / llama.cpp

Download gemma-4-e2b-it.Q4_K_M.gguf and use it with the included Modelfile:

ollama create gemma4-markdown-review -f Modelfile
ollama run gemma4-markdown-review

Or directly with llama.cpp:

llama-cli -m gemma-4-e2b-it.Q4_K_M.gguf -p "<system prompt>" -f input.md
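
Once the model has been registered with `ollama create`, it can also be queried from a script through Ollama's local HTTP chat endpoint. The sketch below assumes Ollama is running on its default port (11434) and that the model was created under the name `gemma4-markdown-review`. The function names are hypothetical. The system prompt is passed explicitly because this card does not show what the Modelfile contains.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumption: default local Ollama address
MODEL_NAME = "gemma4-markdown-review"  # the name passed to `ollama create`


def build_chat_payload(markdown_text, system_prompt, model=MODEL_NAME):
    """Build the JSON body for a single non-streaming chat request."""
    return {
        "model": model,
        "stream": False,  # ask for one complete JSON response instead of chunks
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": markdown_text},
        ],
    }


def review_markdown(markdown_text, system_prompt, url=OLLAMA_URL):
    """Send the document to the local Ollama server and return the review text."""
    payload = build_chat_payload(markdown_text, system_prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

Calling `review_markdown(open("input.md", encoding="utf-8").read(), "<system prompt>")` returns the model's review as a string. If the Modelfile already defines a system prompt, the system message here may be redundant.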

Limitations

  • Trained on 168 examples; may overfit to specific review patterns
  • English only
  • Not suitable for general chat; this is a task-specific fine-tune
  • Requires ~3 GB RAM for Q4_K_M GGUF; ~10 GB for fp16 transformers
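
The two memory figures above are consistent with a simple weights-only estimate: stored weights times bits per weight. The sketch below is a back-of-the-envelope check, not a measurement. It assumes roughly 5 billion stored weights, a count chosen for illustration (the Model details list 2B parameters), and an average of about 4.85 bits per weight for Q4_K_M, which is also an approximation. Activations, KV cache, and runtime overhead are ignored.

```python
def estimate_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 5e9       # assumption: ~5B stored weights, not stated in the model details
FP16_BITS = 16       # fp16 and bf16 both use 2 bytes per weight
Q4_K_M_BITS = 4.85   # assumption: rough average bits per weight for Q4_K_M

fp16_gb = estimate_weight_gb(N_PARAMS, FP16_BITS)
q4_gb = estimate_weight_gb(N_PARAMS, Q4_K_M_BITS)
print(f"fp16: ~{fp16_gb:.1f} GB, Q4_K_M: ~{q4_gb:.1f} GB")
```

Under these assumptions the estimate lands close to the ~10 GB and ~3 GB figures quoted above. Actual usage will be somewhat higher once context and runtime buffers are included.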

Training

Fine-tuned with Unsloth using the notebook at colab_finetune_gemma4_clean.ipynb.
