🩸 Sangue e Grafi — Gemma 4 E2B SFT Adapter (v7)

Supervised Fine-Tuned LoRA adapter for Italian inheritance-law reasoning over kinship knowledge graphs.

Model Description

This is a LoRA (PEFT) adapter trained via Supervised Fine-Tuning (SFT) on top of google/gemma-4-E2B-it. It is part of the Sangue e Grafi project — a Hugging Face Build Small Hackathon 2026 entry demonstrating that a small 4B-parameter model, fine-tuned with SFT + GRPO and equipped with a knowledge-graph agent, outperforms frontier models (Gemini 2.5 Flash) on adversarial Italian inheritance-law scenarios.

The adapter teaches the model to:

Parse complex kinship narratives in Italian.
Emit structured tool calls (lookup_relationship, check_degree, etc.) grounded in an OWL kinship ontology.
Reason step-by-step through multi-hop inheritance questions.

Training Details

Parameter	Value
Method	SFT (Supervised Fine-Tuning)
Base model	`google/gemma-4-E2B-it` (4B params)
Training data	500 adversarial kinship scenarios with teacher traces
Teacher	Gemini 2.5 Flash — generated gold reasoning traces
LoRA rank	See adapter config
Format	SafeTensors LoRA adapter

How the Data Was Generated

Each training example is a complete agent trace: a kinship scenario (family graph + narrative in Italian), a legal question, and the step-by-step tool-call reasoning produced by Gemini 2.5 Flash acting as a teacher over the ontology-grounded knowledge graph.

Benchmark Results 📊

Benchmark	KG Agent (Gemma 4B SFT+GRPO)	Gemini 2.5 Flash (no KG)
Easy (10 seeds)	10/10 (100%)	3/10 (30%)
Hard dev-set (10 seeds)	5/10 (50%)	—

Cross-Architecture Comparison

Model	Hard Dev-Set Accuracy
Gemma 4B (SFT+GRPO)	5/10 (50%)
Nemotron 4B (SFT+GRPO)	4/10 (40%)

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("google/gemma-4-E2B-it")
model = PeftModel.from_pretrained(base, "cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-E2B-it")

Note: This is the SFT-only checkpoint. For the full pipeline (SFT → GRPO), merge this adapter first, then apply the GRPO adapter.

Intended Uses & Limitations

Intended uses:

Research on knowledge-graph-grounded reasoning with small LMs
Benchmarking ontology-aware tool-use agents
Italian legal-reasoning demonstrations

Limitations:

Trained only on Italian kinship / inheritance-law scenarios
Requires the ontology-grounded KG agent framework to achieve reported results
Not a general-purpose Italian legal advisor

Project Links

Resource	Link
🚀 Live Demo	HF Space
📦 GitHub	cyberandy/sangue-e-grafi
📄 Paper	RLM-on-KG (arXiv:2604.17056)
🎯 GRPO Adapter	sangue-e-grafi-gemma4-e2b-grpo-run-f-v7
📊 Agent Traces Dataset	sangue-e-grafi-agent-traces
🔢 GGUF (quantized)	sangue-e-grafi-gemma4-e2b-gguf

Citation

@misc{sangue-e-grafi-2026,
  title   = {Sangue e Grafi: Small Models Beat Frontier LLMs on Adversarial Kinship Reasoning with Knowledge Graph Agents},
  author  = {Andrea Volpini},
  year    = {2026},
  url     = {https://github.com/cyberandy/sangue-e-grafi},
  note    = {Hugging Face Build Small Hackathon 2026}
}

Downloads last month: 84

Model tree for cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7

Base model

google/gemma-4-E2B

Finetuned

google/gemma-4-E2B-it

Adapter

(103)

this model

Dataset used to train cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7

Spaces using cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7 2

Paper for cyberandy/sangue-e-grafi-gemma4-e2b-sft-adversarial-v7

RLM-on-KG: Heuristics First, LLMs When Needed: Adaptive Retrieval Control over Mention Graphs for Scattered Evidence

Paper • 2604.17056 • Published Apr 18

Evaluation results

Easy Benchmark Accuracy (Agent)
self-reported

100.000
Hard Dev-Set Accuracy (Agent)
self-reported

50.000