PII-Qwen3.5-2B-LoRA-8bit

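LoRA adapter for Qwen/Qwen3.5-2B that flags prompts containing PII, secrets, and other sensitive entities, covering the categories detected by the LLM Guard secrets, sensitive, and anonymize scanners. The model is fine-tuned to emit a strict JSON object describing every violation found in the user prompt. Each key in violations is a violation type, and each value is a list of half-open [start, end] character spans into the prompt: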

{"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]], "IP_ADDRESS": [[40, 51]]}}
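Because each span is a half-open character range into the prompt, it maps directly onto Python slicing. The following is a minimal sketch, assuming a hypothetical helper named `extract_violations` (it is not part of the adapter or of LLM Guard), that recovers the flagged substrings from a response:

```python
import json


def extract_violations(prompt: str, result: dict) -> dict:
    """Map each violation TYPE to the substrings its spans cover.

    Hypothetical helper: spans are treated as half-open [start, end)
    character offsets into `prompt`.
    """
    return {
        vtype: [prompt[start:end] for start, end in spans]
        for vtype, spans in result.get("violations", {}).items()
    }


prompt = "Email me at admin@example.com"
result = json.loads('{"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]]}}')
found = extract_violations(prompt, result)
print(found)
```

For this prompt the single span [12, 29] covers exactly the email address, so the end offset equals the prompt length.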

Quick start

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch, json, re

BASE = "Qwen/Qwen3.5-2B"
ADAPTER = "Accuknoxtechnologies/PII-Qwen3.5-2B-LoRA-8bit"

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()  # inference mode: disables dropout

ALL_VIOLATION_TYPES = [
    "AWSKeyDetector", "AzureStorageKeyDetector", "BasicAuthDetector",
    "DiscordBotTokenDetector", "GCPApiKeyDetector", "GitHubTokenCustomDetector",
    "GitLabDetector", "HuggingFaceDetector", "JWTBase64Detector", "JwtTokenDetector",
    "OpenAIApiKeyDetector", "PrivateKeyDetector", "SECRET", "SlackDetector",
    "StripeDetector", "TwilioKeyDetector",
    "BTC_ADDRESS", "CREDIT_CARD", "CRYPTO", "DATE_RE", "EMAIL_ADDRESS",
    "HEX_COLOR", "IBAN_CODE", "IP_ADDRESS", "LOCATION", "PERSON",
    "PHONE_NUMBER", "PHONE_NUMBER_WITH_EXT", "PHONE_NUMBER_ZH",
    "PO_BOX_RE", "PRICE_RE", "TIME_RE", "URL_RE",
    "US_BANK_NUMBER", "US_SSN", "UUID",
]
SYSTEM_MSG = (
    'You are a content guard. Output exactly one JSON object and nothing else: '
    '{"is_valid": <true|false>, "violations": {<TYPE>: [[start, end], ...], ...}}. '
    'No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose. '
    'Just the JSON, terminated immediately after the closing brace. '
    'is_valid is true only when violations is an empty object {}. Each [start, end] is a half-open '
    'character span pointing into the user prompt.Strictly the character span must only identify the exact violation string. Multiple occurrences of the same TYPE produce '
    "multiple spans in that TYPE's list. Allowed TYPE keys: "
    + ", ".join(ALL_VIOLATION_TYPES)
    + ".\n\nExamples:\n\n"
    "Input: Hello there, how are you?\n"
    'Output: {"is_valid": true, "violations": {}}\n\n'
    "Input: Email me at admin@example.com\n"
    'Output: {"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]]}}\n\n'
    "Input: John lives at 192.168.1.1\n"
    'Output: {"is_valid": false, "violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}}'
)
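def guard(prompt: str) -> dict:
    """Run the guard on one prompt and return the parsed JSON verdict."""
    # Build the chat prompt; enable_thinking=False suppresses <think> output.
    chat = tokenizer.apply_chat_template(
        [{"role": "system", "content": SYSTEM_MSG},
         {"role": "user", "content": prompt}],
        tokenize=False, add_generation_prompt=True, enable_thinking=False)
    inputs = tokenizer(chat, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the output deterministic.
    out = model.generate(**inputs, max_new_tokens=768, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    text = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    # Take the outermost {...}; fail clearly if the model emitted no JSON.
    match = re.search(r'\{.*\}', text, re.DOTALL)
    if match is None:
        raise ValueError(f"No JSON object found in model output: {text!r}")
    return json.loads(match.group(0))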
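One common use of the returned spans is masking the flagged text before a prompt is forwarded elsewhere. The sketch below assumes a hypothetical `redact` helper (not shipped with the adapter) and a result dict shaped like the output of `guard`; it replaces spans from right to left so that earlier offsets stay valid while the string changes length.

```python
def redact(prompt: str, result: dict) -> str:
    """Replace every flagged span with a <TYPE> placeholder.

    Hypothetical helper; `result` is assumed to have the shape guard() returns.
    Overlapping spans are not handled in this sketch.
    """
    # Flatten to (start, end, type) tuples, sorted from the end of the prompt.
    spans = sorted(
        ((start, end, vtype)
         for vtype, pairs in result.get("violations", {}).items()
         for start, end in pairs),
        reverse=True,
    )
    for start, end, vtype in spans:
        prompt = prompt[:start] + f"<{vtype}>" + prompt[end:]
    return prompt


# Stand-in for guard("John lives at 192.168.1.1"), taken from the system-prompt example.
sample = {"is_valid": False, "violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}}
redacted = redact("John lives at 192.168.1.1", sample)
print(redacted)
```

With a loaded model, `sample` would instead be the value returned by `guard(prompt)`.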

Evaluation

Evaluated on 100 held-out prompts (PII Guard Held-out Test Set, self-reported): 50 contain at least one violation and 50 are clean.

Evaluation timestamp: 2026-05-12 17:58 UTC
JSON parse errors: 0/100 (0.0%)

Top-level metrics

| Metric | Value |
|---|---|
| `is_valid` accuracy | 1.0000 |
| Violation-type-set exact match | 0.8400 |
| Binary F1 (positive = invalid) | 1.0000 |
| Binary precision | 1.0000 |
| Binary recall | 1.0000 |
| Macro F1 across violation types | 0.6831 |
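The gap between a perfect `is_valid` accuracy and a lower violation-type-set exact match arises when the model correctly decides that a prompt is invalid but reports a different set of types. The card does not define the exact-match metric, so the sketch below rests on an assumption: it compares only the set of TYPE keys per prompt and ignores span offsets. The function name and the toy data are illustrative and are not taken from the evaluation set.

```python
def type_set_exact_match(actual: list, predicted: list) -> float:
    """Fraction of prompts whose predicted TYPE keys equal the actual TYPE keys.

    Assumed definition: span offsets are ignored, only key sets are compared.
    """
    hits = sum(
        set(a["violations"]) == set(p["violations"])
        for a, p in zip(actual, predicted)
    )
    return hits / len(actual)


# Toy data for illustration only.
actual = [
    {"is_valid": False, "violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}},
    {"is_valid": True, "violations": {}},
]
predicted = [
    {"is_valid": False, "violations": {"PERSON": [[0, 4]]}},  # misses IP_ADDRESS
    {"is_valid": True, "violations": {}},
]
score = type_set_exact_match(actual, predicted)
print(score)  # 0.5
```

In the toy data both `is_valid` decisions are correct, yet only one of the two type sets matches.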

Confusion matrix — binary `is_valid` decision

Positive class = the prompt contains a violation (is_valid=False).

| | predicted invalid | predicted valid |
|---|---|---|
| actual invalid | TP = 50 | FN = 0 |
| actual valid | FP = 0 | TN = 50 |
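The binary metrics in the top-level table follow directly from these four counts. As a quick check, using the standard definitions of accuracy, precision, recall, and F1:

```python
# Counts from the confusion matrix (positive class = prompt contains a violation).
tp, fn, fp, tn = 50, 0, 0, 50

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

All four values come out to 1.0000, matching the reported binary metrics.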

Per violation-type metrics

Only types that appear in either the actual or predicted labels are listed.

| Type | support | precision | recall | F1 |
|---|---|---|---|---|
| `EMAIL_ADDRESS` | 5 | 1.000 | 1.000 | 1.000 |
| `PERSON` | 4 | 1.000 | 1.000 | 1.000 |
| `PHONE_NUMBER` | 4 | 1.000 | 1.000 | 1.000 |
| `CREDIT_CARD` | 3 | 1.000 | 1.000 | 1.000 |
| `LOCATION` | 3 | 1.000 | 0.667 | 0.800 |
| `AWSKeyDetector` | 2 | 1.000 | 0.500 | 0.667 |
| `AzureStorageKeyDetector` | 2 | 0.000 | 0.000 | 0.000 |
| `GitHubTokenCustomDetector` | 2 | 1.000 | 1.000 | 1.000 |
| `JWTBase64Detector` | 2 | 1.000 | 0.500 | 0.667 |
| `JwtTokenDetector` | 2 | 0.000 | 0.000 | 0.000 |
| `OpenAIApiKeyDetector` | 2 | 0.000 | 0.000 | 0.000 |
| `PrivateKeyDetector` | 2 | 1.000 | 1.000 | 1.000 |
| `SlackDetector` | 2 | 0.000 | 0.000 | 0.000 |
| `StripeDetector` | 2 | 1.000 | 0.500 | 0.667 |
| `TwilioKeyDetector` | 2 | 0.000 | 0.000 | 0.000 |
| `DATE_RE` | 2 | 1.000 | 1.000 | 1.000 |
| `IP_ADDRESS` | 2 | 1.000 | 1.000 | 1.000 |
| `TIME_RE` | 2 | 1.000 | 1.000 | 1.000 |
| `URL_RE` | 2 | 1.000 | 1.000 | 1.000 |
| `US_SSN` | 2 | 1.000 | 1.000 | 1.000 |
| `BasicAuthDetector` | 1 | 0.000 | 0.000 | 0.000 |
| `DiscordBotTokenDetector` | 1 | 0.000 | 0.000 | 0.000 |
| `GCPApiKeyDetector` | 1 | 1.000 | 1.000 | 1.000 |
| `GitLabDetector` | 1 | 1.000 | 1.000 | 1.000 |
| `HuggingFaceDetector` | 1 | 1.000 | 1.000 | 1.000 |
| `SECRET` | 1 | 0.067 | 1.000 | 0.125 |
| `BTC_ADDRESS` | 1 | 0.500 | 1.000 | 0.667 |
| `CRYPTO` | 1 | 0.000 | 0.000 | 0.000 |
| `HEX_COLOR` | 1 | 1.000 | 1.000 | 1.000 |
| `IBAN_CODE` | 1 | 1.000 | 1.000 | 1.000 |
| `PHONE_NUMBER_WITH_EXT` | 1 | 1.000 | 1.000 | 1.000 |
| `PHONE_NUMBER_ZH` | 1 | 1.000 | 1.000 | 1.000 |
| `PO_BOX_RE` | 1 | 1.000 | 1.000 | 1.000 |
| `PRICE_RE` | 1 | 1.000 | 1.000 | 1.000 |
| `US_BANK_NUMBER` | 1 | 1.000 | 1.000 | 1.000 |
| `UUID` | 1 | 0.000 | 0.000 | 0.000 |
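The macro F1 of 0.6831 reported in the top-level metrics can be reproduced from this table as the unweighted mean of the 36 per-type F1 values. A short check, with the values grouped by score rather than listed one by one:

```python
# Per-type F1 values copied from the table above, grouped by value:
# 21 types at 1.000, LOCATION at 0.800, four types at 0.667,
# SECRET at 0.125, and nine types at 0.000.
f1_scores = [1.000] * 21 + [0.800] + [0.667] * 4 + [0.125] + [0.000] * 9

num_types = len(f1_scores)
macro_f1 = sum(f1_scores) / num_types
print(num_types, round(macro_f1, 4))
```

Macro averaging weights every type equally regardless of support, so the nine types with an F1 of 0.000 lower the average even though the binary `is_valid` decision is perfect on this set.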

Supported violation types

The model emits one or more of these TYPE keys in the violations map of its JSON output:

AWSKeyDetector, AzureStorageKeyDetector, BasicAuthDetector, DiscordBotTokenDetector, GCPApiKeyDetector, GitHubTokenCustomDetector, GitLabDetector, HuggingFaceDetector, JWTBase64Detector, JwtTokenDetector, OpenAIApiKeyDetector, PrivateKeyDetector, SECRET, SlackDetector, StripeDetector, TwilioKeyDetector, BTC_ADDRESS, CREDIT_CARD, CRYPTO, DATE_RE, EMAIL_ADDRESS, HEX_COLOR, IBAN_CODE, IP_ADDRESS, LOCATION, PERSON, PHONE_NUMBER, PHONE_NUMBER_WITH_EXT, PHONE_NUMBER_ZH, PO_BOX_RE, PRICE_RE, TIME_RE, URL_RE, US_BANK_NUMBER, US_SSN, UUID
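Since the model's output is generated text, a downstream consumer may want to verify it against the rules stated in the system prompt before trusting it. The following is a minimal sketch, assuming a hypothetical `validate_result` helper (not part of the adapter): it checks that every key is one of the types above, that `is_valid` is true exactly when `violations` is empty, and that each span lies inside the prompt.

```python
ALLOWED_TYPES = {
    "AWSKeyDetector", "AzureStorageKeyDetector", "BasicAuthDetector",
    "DiscordBotTokenDetector", "GCPApiKeyDetector", "GitHubTokenCustomDetector",
    "GitLabDetector", "HuggingFaceDetector", "JWTBase64Detector", "JwtTokenDetector",
    "OpenAIApiKeyDetector", "PrivateKeyDetector", "SECRET", "SlackDetector",
    "StripeDetector", "TwilioKeyDetector",
    "BTC_ADDRESS", "CREDIT_CARD", "CRYPTO", "DATE_RE", "EMAIL_ADDRESS",
    "HEX_COLOR", "IBAN_CODE", "IP_ADDRESS", "LOCATION", "PERSON",
    "PHONE_NUMBER", "PHONE_NUMBER_WITH_EXT", "PHONE_NUMBER_ZH",
    "PO_BOX_RE", "PRICE_RE", "TIME_RE", "URL_RE",
    "US_BANK_NUMBER", "US_SSN", "UUID",
}


def validate_result(prompt: str, result: dict) -> list:
    """Return a list of problems; an empty list means the result looks well-formed.

    Hypothetical helper; assumes each span is a [start, end] pair of integers.
    """
    violations = result.get("violations")
    if not isinstance(violations, dict):
        return ["violations must be an object"]
    problems = []
    if result.get("is_valid") != (len(violations) == 0):
        problems.append("is_valid must be true exactly when violations is empty")
    for vtype, spans in violations.items():
        if vtype not in ALLOWED_TYPES:
            problems.append(f"unknown type: {vtype}")
        for start, end in spans:
            if not 0 <= start < end <= len(prompt):
                problems.append(f"span out of range for {vtype}: [{start}, {end}]")
    return problems


ok = validate_result(
    "Email me at admin@example.com",
    {"is_valid": False, "violations": {"EMAIL_ADDRESS": [[12, 29]]}},
)
bad = validate_result(
    "Email me at admin@example.com",
    {"is_valid": True, "violations": {"EMAIL": [[12, 40]]}},
)
print(ok)
print(bad)
```

The second call reports three problems: an inconsistent `is_valid` flag, an unknown type key, and a span that runs past the end of the prompt.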

