🩸 Sangue e Grafi — Gemma 4 E2B SFT Adapter (v7)

Supervised Fine-Tuned LoRA adapter for Italian inheritance-law reasoning over kinship knowledge graphs.

Model Description

This is a LoRA (PEFT) adapter trained via Supervised Fine-Tuning (SFT) on top of google/gemma-4-E2B-it. It is part of the Sangue e Grafi project, a Hugging Face Build Small Hackathon 2026 entry. The project demonstrates that a small 4B-parameter model, fine-tuned with SFT + GRPO and equipped with a knowledge-graph agent, outperforms a frontier model (Gemini 2.5 Flash, run without the knowledge graph) on adversarial Italian inheritance-law scenarios.

The adapter teaches the model to:

  1. Parse complex kinship narratives in Italian.
  2. Emit structured tool calls (lookup_relationship, check_degree, etc.) grounded in an OWL kinship ontology.
  3. Reason step-by-step through multi-hop inheritance questions.
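
To make the tool-call idea concrete, here is a minimal sketch of what the two tools named above might compute over a toy family graph. Everything in it is a hypothetical illustration rather than the project's actual API or ontology: the function signatures, the `PARENTS` dictionary, the relationship labels, and the people. The degree computation assumes the usual civil-law convention of counting generations up to the common ancestor and back down, without counting the ancestor itself.

```python
from typing import Optional

# Hypothetical toy family graph: child -> list of known parents.
PARENTS: dict[str, list[str]] = {
    "Giulia": ["Marco", "Anna"],
    "Luca": ["Marco", "Anna"],
    "Sofia": ["Luca"],
    "Marco": ["Piero"],
    "Elena": ["Piero"],
}


def ancestors(person: str) -> dict[str, int]:
    """Map the person (distance 0) and each known ancestor to a generation distance."""
    dist = {person: 0}
    frontier = [person]
    while frontier:
        current = frontier.pop()
        for parent in PARENTS.get(current, []):
            # Keep the shortest distance if an ancestor is reachable by two paths.
            if parent not in dist or dist[current] + 1 < dist[parent]:
                dist[parent] = dist[current] + 1
                frontier.append(parent)
    return dist


def check_degree(a: str, b: str) -> Optional[int]:
    """Degree of kinship between a and b, or None if no common ancestor is known."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = set(up_a) & set(up_b)
    if not common:
        return None
    # Generations from a up to the common ancestor plus generations down to b.
    return min(up_a[c] + up_b[c] for c in common)


def lookup_relationship(a: str, b: str) -> str:
    """Coarse relationship label derived from the graph (illustrative labels only)."""
    if b in PARENTS.get(a, []):
        return "child_of"
    if a in PARENTS.get(b, []):
        return "parent_of"
    if set(PARENTS.get(a, [])) & set(PARENTS.get(b, [])):
        return "sibling_of"
    return "other" if check_degree(a, b) is not None else "unrelated"
```

With this toy graph, `check_degree("Giulia", "Luca")` gives 2 (siblings) and `check_degree("Sofia", "Elena")` gives 4, while `check_degree("Anna", "Elena")` returns `None` because the graph records no common ancestor.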

Training Details

| Parameter | Value |
|---|---|
| Method | SFT (Supervised Fine-Tuning) |
| Base model | google/gemma-4-E2B-it (4B params) |
| Training data | 500 adversarial kinship scenarios with teacher traces |
| Teacher | Gemini 2.5 Flash (generated the gold reasoning traces) |
| LoRA rank | See adapter config |
| Format | SafeTensors LoRA adapter |

How the Data Was Generated

Each training example is a complete agent trace: a kinship scenario (family graph + narrative in Italian), a legal question, and the step-by-step tool-call reasoning produced by Gemini 2.5 Flash acting as a teacher over the ontology-grounded knowledge graph.
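
As a rough illustration of the shape such a trace could take, the sketch below models one training example as a small dataclass and serializes it to a single JSON line. The class names, field names, triple format, and the sample scenario are all assumptions made for this example; the published traces dataset may be organized differently.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ToolCall:
    name: str        # tool name, e.g. "check_degree"
    arguments: dict  # JSON-serializable arguments passed to the tool
    result: str      # what the knowledge graph returned


@dataclass
class AgentTrace:
    narrative_it: str              # kinship narrative, in Italian
    family_graph: list[list[str]]  # [subject, relation, object] triples
    question: str                  # the legal question
    steps: list[ToolCall] = field(default_factory=list)
    answer: str = ""

    def to_jsonl(self) -> str:
        # ensure_ascii=False keeps accented Italian characters readable.
        return json.dumps(asdict(self), ensure_ascii=False)


# Invented sample record, for illustration only.
example = AgentTrace(
    narrative_it="Marco è il padre di Giulia e di Luca.",
    family_graph=[
        ["Marco", "parent_of", "Giulia"],
        ["Marco", "parent_of", "Luca"],
    ],
    question="Qual è il grado di parentela tra Giulia e Luca?",
    steps=[ToolCall("check_degree", {"a": "Giulia", "b": "Luca"}, "2")],
    answer="Secondo grado",
)
line = example.to_jsonl()
```

Storing one trace per line keeps the scenario, the question, and the ordered tool calls together, which is convenient when each record becomes one SFT training sample.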

Benchmark Results 📊

| Benchmark | KG Agent (Gemma 4B SFT+GRPO) | Gemini 2.5 Flash (no KG) |
|---|---|---|
| Easy (10 seeds) | 10/10 (100%) | 3/10 (30%) |
| Hard dev-set (10 seeds) | 5/10 (50%) | |

Cross-Architecture Comparison

| Model | Hard Dev-Set Accuracy |
|---|---|
| Gemma 4B (SFT+GRPO) | 5/10 (50%) |
| Nemotron 4B (SFT+GRPO) | 4/10 (40%) |

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the SFT LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("google/gemma-4-E2B-it")
model = PeftModel.from_pretrained(base, "cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7")
# The tokenizer is loaded from the base model repository.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E2B-it")

Note: This is the SFT-only checkpoint. For the full pipeline (SFT → GRPO), merge this adapter first, then apply the GRPO adapter.
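
The merge-then-apply order described in the note could look roughly like the sketch below. It assumes the GRPO adapter is published under the same `cyberandy/` namespace as this one (only its short name appears in this card) and that it was trained on top of the merged SFT weights. The helper name `load_full_pipeline` is made up for this example.

```python
BASE_ID = "google/gemma-4-E2B-it"
SFT_ADAPTER_ID = "cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7"
# Assumption: the GRPO adapter lives under the same namespace as the SFT adapter.
GRPO_ADAPTER_ID = "cyberandy/sangue-e-grafi-gemma4-e2b-grpo-run-f-v7"


def load_full_pipeline(base_id=BASE_ID, sft_id=SFT_ADAPTER_ID, grpo_id=GRPO_ADAPTER_ID):
    """Return (model, tokenizer) with SFT merged into the base and GRPO applied on top."""
    # Imports are deferred so that defining this function does not load
    # the libraries or download any weights.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Step 1: attach the SFT adapter and fold its weights into the base model.
    merged = PeftModel.from_pretrained(base, sft_id).merge_and_unload()
    # Step 2: attach the GRPO adapter on top of the merged model.
    model = PeftModel.from_pretrained(merged, grpo_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

Calling `model, tokenizer = load_full_pipeline()` downloads the base weights and both adapters, so it needs network access and enough memory for the base model.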

Intended Uses & Limitations

Intended uses:

  • Research on knowledge-graph-grounded reasoning with small LMs
  • Benchmarking ontology-aware tool-use agents
  • Italian legal-reasoning demonstrations

Limitations:

  • Trained only on Italian kinship / inheritance-law scenarios
  • Requires the ontology-grounded KG agent framework to achieve reported results
  • Not a general-purpose Italian legal advisor

Project Links

| Resource | Link |
|---|---|
| 🚀 Live Demo | HF Space |
| 📦 GitHub | cyberandy/sangue-e-grafi |
| 📄 Paper | RLM-on-KG (arXiv:2604.17056) |
| 🎯 GRPO Adapter | sangue-e-grafi-gemma4-e2b-grpo-run-f-v7 |
| 📊 Agent Traces Dataset | sangue-e-grafi-agent-traces |
| 🔢 GGUF (quantized) | sangue-e-grafi-gemma4-e2b-gguf |

Citation

@misc{sangue-e-grafi-2026,
  title   = {Sangue e Grafi: Small Models Beat Frontier LLMs on Adversarial Kinship Reasoning with Knowledge Graph Agents},
  author  = {Andrea Volpini},
  year    = {2026},
  url     = {https://github.com/cyberandy/sangue-e-grafi},
  note    = {Hugging Face Build Small Hackathon 2026}
}