qwen-2.5-1.5b-de-pii-redactor
A small, fast, GDPR-compliant (DSGVO-konform) PII redactor for German business documents: emails, support tickets, CRM notes, contracts, incident reports. It emits structured JSON rather than raw spans, with a redacted text, a list of detected entities, a risk level, and a human-review flag. The entity catalog is built around German-specific identifiers (Steuer-ID, USt-IdNr, IBAN, Sozialversicherungsnummer, Handelsregisternummer, Kfz-Kennzeichen, Krankenversichertennummer) in a business context; it is deliberately not clinical and not span-only.
Why another PII model?
Generic multilingual PII detectors (GLiNER-PII, OpenMed-PII) are strong but either (a) span-only, (b) English-centric with German as an afterthought, or (c) clinical-only. This adapter fills the gap for German business documents with three specific design choices:
- German-specific identifier coverage. USt-IdNr, Steuer-ID, Sozialversicherungsnummer, Krankenversichertennummer, Handelsregisternummer, Kfz-Kennzeichen, IBAN: the IDs a German GDPR (DSGVO) audit actually asks about.
- Structured output contract. A Pydantic-validated `RedactionResult` with redacted text, entity list, `risk_level` and `needs_human_review`: the shape downstream pipelines want to consume. No post-hoc span reconstruction.
- Small enough to self-host. Qwen2.5-1.5B + LoRA adapter fits on a consumer 24 GB card in 4-bit; on an A40 the bf16 path is fast. This is the model you run on-prem when your compliance team says "no customer PII leaves the building".
Entity catalog
| Type | What |
|---|---|
| `PERSON` | First and last name of a natural person |
| `ADDRESS` | Postal address (street, postal code, city) |
| `EMAIL` | Email address |
| `PHONE` | Phone number (mobile/landline, international/national) |
| `DOB` | Date of birth |
| `IBAN` | IBAN (DE + international) |
| `BIC` | Bank Identifier Code |
| `TAX_ID` | Steuer-Identifikationsnummer, the tax identification number (11 digits, §139b AO) |
| `VAT_ID` | USt-IdNr, the VAT identification number (DE + 9 digits) |
| `SSN_DE` | Sozialversicherungsnummer, the social insurance number (12 characters) |
| `HEALTH_INSURANCE` | Krankenversichertennummer, the health insurance number (10 characters) |
| `ID_CARD` | Personalausweis (national ID card) or passport number |
| `LICENSE_PLATE` | German vehicle license plate (Kfz-Kennzeichen) |
| `COMMERCIAL_REGISTER` | Handelsregisternummer, the commercial register number (HRA/HRB) |
| `IP_ADDRESS` | IPv4 / IPv6 |
| `CUSTOMER_ID` | Internal customer, order, or ticket ID |
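For the identifiers whose format the catalog pins down, a deterministic check can cross-validate what the model returns. The sketch below is not part of the adapter. It assumes only the lengths stated in the table plus the standard ISO 13616 mod-97 rule for IBANs, and the function names `iban_checksum_ok` and `format_plausible` are hypothetical.

```python
import re

# Patterns derived only from the lengths stated in the entity catalog.
TAX_ID_RE = re.compile(r"\d{11}")   # Steuer-ID: 11 digits
VAT_ID_RE = re.compile(r"DE\d{9}")  # USt-IdNr: "DE" + 9 digits


def iban_checksum_ok(iban: str) -> bool:
    """Check an IBAN with the ISO 13616 mod-97 rule (whitespace is ignored)."""
    compact = re.sub(r"\s+", "", iban).upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", compact):
        return False
    # Move country code and check digits to the end, then map A-Z to 10-35.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def format_plausible(entity_type: str, value: str) -> bool:
    """Return False only when a value contradicts a known format."""
    compact = re.sub(r"\s+", "", value)
    if entity_type == "IBAN":
        return iban_checksum_ok(compact)
    if entity_type == "TAX_ID":
        return TAX_ID_RE.fullmatch(compact) is not None
    if entity_type == "VAT_ID":
        return VAT_ID_RE.fullmatch(compact.upper()) is not None
    return True  # no deterministic rule for the other types in this sketch
```

A failed check is better treated as a reason to flag the document for review than as proof of a model error: the training corpus uses fictional identifiers, which need not carry valid check digits.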
Output schema
from typing import Literal

from pydantic import BaseModel

EntityType = Literal[
    "PERSON", "ADDRESS", "EMAIL", "PHONE", "DOB",
    "IBAN", "BIC", "TAX_ID", "VAT_ID", "SSN_DE",
    "HEALTH_INSURANCE", "ID_CARD", "LICENSE_PLATE",
    "COMMERCIAL_REGISTER", "IP_ADDRESS", "CUSTOMER_ID",
]


class PiiEntity(BaseModel):
    type: EntityType
    value: str        # original text
    replacement: str  # e.g. "[PERSON_1]"


class RedactionResult(BaseModel):
    redacted_text: str
    entities: list[PiiEntity]
    risk_level: Literal["low", "medium", "high"]
    needs_human_review: bool
Training data
- N: 75 train + 16 eval synthetic German business documents
- Generated by: Claude Opus, sampled across 6 document types (support email, CRM note, HR email, contract snippet, ops incident, legal intake) × 16 entity-combination templates
- Validated by: `RedactionResult.model_validate()` plus span-level sanity checks (every `entity.value` must occur in the input, every `entity.replacement` must occur in `redacted_text`); failures dropped
- Open-source by design: all data is synthetic with fictional identifiers so the full corpus + training harness can be published without exposing real PII
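The two span-level checks described above can be sketched in a few lines. This is an illustration, not the project's actual filtering code: `passes_span_checks` is a hypothetical name, and it works on plain dicts that have the same fields as `RedactionResult`.

```python
def passes_span_checks(input_text: str, result: dict) -> bool:
    """Return True if every entity is grounded in both input and output."""
    redacted = result["redacted_text"]
    for entity in result["entities"]:
        if entity["value"] not in input_text:
            return False  # value was not copied verbatim from the source
        if entity["replacement"] not in redacted:
            return False  # placeholder never made it into the redacted text
    return True


# Keep only generated samples that survive both checks.
samples = [
    ("Kundennummer: K-884421.", {
        "redacted_text": "Kundennummer: [CUSTOMER_ID_1].",
        "entities": [{"type": "CUSTOMER_ID", "value": "K-884421",
                      "replacement": "[CUSTOMER_ID_1]"}],
    }),
    ("Kundennummer: K-884421.", {
        "redacted_text": "Kundennummer: [CUSTOMER_ID_1].",
        "entities": [{"type": "CUSTOMER_ID", "value": "K-000000",
                      "replacement": "[CUSTOMER_ID_1]"}],
    }),
]
kept = [(text, res) for text, res in samples if passes_span_checks(text, res)]
```

Here the second sample is dropped because its entity value does not occur in the input text.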
Training setup
- Base: `Qwen/Qwen2.5-1.5B-Instruct` (Apache 2.0, no gating)
- Method: LoRA (PEFT) via TRL `SFTTrainer`, conversational chat format
- LoRA config: r=32, alpha=64, dropout=0.05, target_modules = attention + MLP projections (q/k/v/o, gate/up/down)
- Optimiser: AdamW (torch), cosine schedule, warmup ratio 0.03, learning rate 4e-4
- Batch: 4 per device × 4 gradient-accumulation steps (effective batch size 16), bf16, gradient checkpointing
- Epochs: 8
- Max seq len: 2048
- Hardware: single NVIDIA A40 (48 GB) on RunPod
- Wall time: 7 minutes
- GPU cost: about USD 0.05 for the training run
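As a sanity check on the LoRA configuration, the number of trainable parameters follows from r=32 and the seven targeted projections. The arithmetic below is a sketch: the layer dimensions are assumptions taken from the published Qwen2.5-1.5B config (hidden size 1536, MLP size 8960, 28 layers, 2 key/value heads of dimension 128), and fp32 storage of the adapter is also an assumption.

```python
# Assumed Qwen2.5-1.5B dimensions (not stated in this model card).
HIDDEN = 1536
INTERMEDIATE = 8960
NUM_LAYERS = 28
KV_DIM = 2 * 128  # 2 key/value heads x head dimension 128
RANK = 32         # r=32 from the LoRA config above

# (in_features, out_features) of each targeted projection in one layer.
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank) to each wrapped layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in PROJECTIONS.values())
total_params = per_layer * NUM_LAYERS
size_mib_fp32 = total_params * 4 / 2**20  # 4 bytes per fp32 weight

print(f"{total_params:,} trainable parameters, {size_mib_fp32:.1f} MiB in fp32")
```

Under these assumptions the adapter has roughly 37 million trainable parameters, which lands close to the ~140 MB adapter footprint listed under deployment notes.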
Evaluation
Computed on a held-out eval split via scripts/eval.py:
| Metric | Base Qwen2.5-1.5B | + LoRA adapter |
|---|---|---|
| Schema-valid JSON output | 81.2% | 100.0% |
| Entity micro-F1 (type + value) | 0.38 | 0.92 |
| Risk-level exact match | 50.0% | 93.8% |
| Needs-review exact match | 68.8% | 68.8% |
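At this small data scale the fine-tune locks down the schema layer (schema-valid output rises from 81.2% to 100.0%) and German-specific identifier recognition (entity micro-F1 rises from 0.38 to 0.92). The needs-review flag is the one metric that does not move (68.8% before and after). With 16 eval documents, each document accounts for 6.25 percentage points, so treat these figures as indicative. The free-form redacted text quality starts at usable and is expected to improve quickly on real domain data: on a client engagement the same recipe runs against 2,000-10,000 of your actual documents, which is meant to close the long tail of ambiguous names and rare identifier formats.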
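The contents of `scripts/eval.py` are not shown here, so the following is only one plausible reading of "entity micro-F1 (type + value)": an entity counts as correct when both its type and its value match a gold entity in the same document, and counts are pooled over all documents before computing F1. The function name `entity_micro_f1` is hypothetical.

```python
from collections import Counter


def entity_micro_f1(gold_docs: list[list[dict]],
                    pred_docs: list[list[dict]]) -> float:
    """Micro-averaged F1 over (type, value) pairs, pooled across documents."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_counts = Counter((e["type"], e["value"]) for e in gold)
        pred_counts = Counter((e["type"], e["value"]) for e in pred)
        # Multiset intersection: a duplicate prediction matches at most
        # as many times as the pair occurs in the gold annotation.
        matched = sum((gold_counts & pred_counts).values())
        tp += matched
        fp += sum(pred_counts.values()) - matched
        fn += sum(gold_counts.values()) - matched
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


gold = [[{"type": "PERSON", "value": "Julia Weber"},
         {"type": "CUSTOMER_ID", "value": "K-884421"}]]
pred = [[{"type": "PERSON", "value": "Julia Weber"},
         {"type": "CUSTOMER_ID", "value": "884421"}]]
score = entity_micro_f1(gold, pred)  # one hit, one miss, one spurious entity
```

Because the match is exact on the value string, a prediction that truncates an identifier counts as both a false positive and a false negative.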
How to use
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = "Qwen/Qwen2.5-1.5B-Instruct"
adapter = "renezander030/qwen-2.5-1.5b-de-pii-redactor"
tok = AutoTokenizer.from_pretrained(base)
mdl = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
mdl = PeftModel.from_pretrained(mdl, adapter).merge_and_unload()
# Same system prompt the adapter was trained on.
# See schema.py (render_system) for the exact text.
SYSTEM = "<redactor system prompt>"
USER = """Sehr geehrte Frau Schmidt,
anbei die Überweisung Ihrer Erstattung auf
IBAN DE89 3704 0044 0532 0130 00, Betrag 249,90 EUR.
Kundennummer: K-884421.
Bei Rückfragen erreichen Sie mich unter +49 30 12345678
oder j.weber@example.de.
Beste Grüße, Julia Weber"""
messages = [
    {"role": "system", "content": SYSTEM},
    {"role": "user", "content": USER},
]
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tok(prompt, return_tensors="pt").to(mdl.device)
out = mdl.generate(**inputs, max_new_tokens=1200, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[1]:],
                 skip_special_tokens=True))
Expected output shape:
{
  "redacted_text": "Sehr geehrte Frau [PERSON_1], ... IBAN [IBAN_1] ... Kundennummer: [CUSTOMER_ID_1] ... [PHONE_1] oder [EMAIL_1]. Beste Grüße, [PERSON_2]",
  "entities": [
    {"type": "PERSON", "value": "Schmidt", "replacement": "[PERSON_1]"},
    {"type": "IBAN", "value": "DE89 3704 0044 0532 0130 00", "replacement": "[IBAN_1]"},
    {"type": "CUSTOMER_ID", "value": "K-884421", "replacement": "[CUSTOMER_ID_1]"},
    {"type": "PHONE", "value": "+49 30 12345678", "replacement": "[PHONE_1]"},
    {"type": "EMAIL", "value": "j.weber@example.de", "replacement": "[EMAIL_1]"},
    {"type": "PERSON", "value": "Julia Weber", "replacement": "[PERSON_2]"}
  ],
  "risk_level": "high",
  "needs_human_review": false
}
Deployment notes
- Footprint: ~140 MB safetensors adapter + ~11 MB tokenizer; ships as a one-folder plugin on top of any Qwen2.5-1.5B host.
- 4-bit (bitsandbytes nf4) brings the merged inference footprint to ~1 GB on a 24 GB consumer GPU; on A40 / A100 the bf16 path is faster.
- Batch throughput: swap the transformers `model.generate()` call for vLLM with the LoRA adapter loaded; expect 5-10× throughput for incoming ticket queues.
- Pydantic validation at the boundary makes downstream pipelines fail fast on schema drift. Pair with a simple re-ask on validation failure.
- On-prem: no data leaves your infrastructure. This is the point of running a small redactor instead of routing everything through a frontier API.
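The "validate at the boundary, re-ask on failure" pattern from the notes above might look like the sketch below. It is not the project's pipeline code: `parse_result` is a standard-library stand-in for Pydantic validation (with Pydantic v2 installed, `RedactionResult.model_validate_json(raw)` would be the stricter drop-in), and `generate` is a hypothetical callable that wraps whatever inference backend you use.

```python
import json
from typing import Callable

REQUIRED_KEYS = {"redacted_text", "entities", "risk_level", "needs_human_review"}


def parse_result(raw: str) -> dict:
    """Stand-in validator: parse JSON and check the top-level keys."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def redact_with_reask(generate: Callable[[str, int], str],
                      text: str, max_attempts: int = 2) -> dict:
    """Call generate(text, attempt) until the output validates."""
    last_error = None
    for attempt in range(max_attempts):
        raw = generate(text, attempt)
        try:
            return parse_result(raw)
        except ValueError as err:  # json.JSONDecodeError is a ValueError
            last_error = err
    raise RuntimeError(
        f"no schema-valid output after {max_attempts} attempts"
    ) from last_error
```

The attempt index is passed to `generate` on purpose: with greedy decoding (`do_sample=False`) an identical prompt yields identical output, so a re-ask only helps if the retry changes something, for example by appending the validation error to the prompt or enabling sampling.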
Limitations
- Small synthetic training set — schema layer and German-specific identifier recognition lock in at this size, but long-tail names, compound German surnames, and rare format variants need real data.
- Synthetic share is 100% in this open-source release; real business documents will expose failure modes around homonymy (Person vs. product name) and institutional identifiers that share formats with PII (e.g. contract numbers that look like customer IDs).
- Redaction is type + value + replacement, not character offsets. If your pipeline requires offsets, reconstruct them from `value` occurrences in the input text.
- The model sees 2048 tokens at a time. For long contracts, chunk with a small overlap and merge entity lists.
- Not a compliance certification. This adapter helps your pipeline redact consistently and fast; your DPO still owns the legal call.
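Two of the limitations above have straightforward workarounds, sketched here under stated assumptions. `find_offsets` rebuilds character offsets by searching for each entity value in the input, and `chunk_with_overlap` splits a token sequence into overlapping windows. Both names are hypothetical, and the window size and overlap are yours to choose.

```python
def find_offsets(text: str, entities: list[dict]) -> list[tuple]:
    """Return sorted (start, end, type, replacement) spans for every
    occurrence of each entity value in the input text."""
    spans = []
    for ent in entities:
        value = ent["value"]
        if not value:
            continue  # an empty value would match everywhere
        start = text.find(value)
        while start != -1:
            end = start + len(value)
            spans.append((start, end, ent["type"], ent["replacement"]))
            start = text.find(value, end)
    return sorted(spans)


def chunk_with_overlap(items: list, size: int, overlap: int) -> list[list]:
    """Split a sequence (e.g. token ids) into windows of at most `size`
    items, where each window repeats the last `overlap` items of the
    previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    last_start = max(len(items) - overlap, 1)
    return [items[i:i + size] for i in range(0, last_start, step)]
```

Plain substring search also matches a value that appears inside a longer word, so offsets for short values deserve a second look. When merging entity lists across chunks, note that replacement labels such as `[PERSON_1]` are assigned per chunk and would need renumbering, which this sketch does not attempt.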
Work with me
This adapter is a public reference of the recipe I deliver to freelance clients: small, fast, GDPR-clean, structured-output LLMs trained on the domain data you already have.
If you need one of these, I can build it:
- a PII redactor trained on your own German documents (support tickets, CRM, contracts, medical, legal) for higher recall on your actual terminology
- a private LLM deployment on your infrastructure, or a dedicated cloud GPU endpoint
- a structured-output agent pipeline (LangGraph, Pydantic-validated, human-in-the-loop routing)
- an evaluation harness that tells you when the model is actually good enough to ship to production
Two ways to engage:
- Upwork — contract-ready, vetted, pay-as-you-go: https://www.upwork.com/freelancers/reneza
- Direct — for longer engagements, retainers, or a quick conversation: https://renezander.com
License
Adapter weights: Apache 2.0 (matches the Qwen2.5 base). Training scripts in the companion repo: MIT.
Citation
@misc{zander2026qwen15bdepiiredactor,
  author       = {Zander, Rene},
  title        = {qwen-2.5-1.5b-de-pii-redactor: a LoRA adapter
                  for DSGVO-konform PII redaction of German
                  business documents with structured JSON output},
  year         = {2026},
  howpublished = {HuggingFace Hub},
  url          = {https://huggingface.co/renezander030/qwen-2.5-1.5b-de-pii-redactor},
}