PII-Qwen3.5-2B-LoRA-8bit

LoRA adapter for Qwen/Qwen3.5-2B that flags prompts containing PII, secrets, sensitive entities, and any other content that the LLM Guard Secrets, Sensitive, and Anonymize scanners detect. The model is fine-tuned to emit a single strict JSON object that lists every violation found in the user prompt, keyed by violation type, with each occurrence given as a half-open [start, end] character span:

{"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]], "IP_ADDRESS": [[40, 51]]}}
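To make the span convention concrete, here is a minimal sketch that slices the flagged substrings out of a prompt. The prompt string is a hypothetical input constructed so that its offsets line up with the JSON above, and `extract_violations` is an illustrative helper, not something shipped with the adapter.

```python
import json


def extract_violations(prompt: str, raw_output: str) -> dict:
    """Map each violation TYPE to the substrings covered by its [start, end) spans."""
    result = json.loads(raw_output)
    return {
        vtype: [prompt[start:end] for start, end in spans]
        for vtype, spans in result["violations"].items()
    }


# Hypothetical prompt, chosen so the offsets match the example output above.
prompt = "Email me at admin@example.com from host 192.168.1.1"
raw_output = (
    '{"is_valid": false, "violations": '
    '{"EMAIL_ADDRESS": [[12, 29]], "IP_ADDRESS": [[40, 51]]}}'
)

found = extract_violations(prompt, raw_output)
print(found)
```

Because the spans are half-open, `prompt[start:end]` returns the flagged text directly, with no off-by-one adjustment.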

Quick start

import json
import re

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE = "Qwen/Qwen3.5-2B"
ADAPTER = "Accuknoxtechnologies/PII-Qwen3.5-2B-LoRA-8bit"

# Load the tokenizer and the 8-bit quantized base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()

# Violation TYPE keys the model is allowed to emit.
ALL_VIOLATION_TYPES = [
    "AWSKeyDetector", "AzureStorageKeyDetector", "BasicAuthDetector",
    "DiscordBotTokenDetector", "GCPApiKeyDetector", "GitHubTokenCustomDetector",
    "GitLabDetector", "HuggingFaceDetector", "JWTBase64Detector", "JwtTokenDetector",
    "OpenAIApiKeyDetector", "PrivateKeyDetector", "SECRET", "SlackDetector",
    "StripeDetector", "TwilioKeyDetector",
    "BTC_ADDRESS", "CREDIT_CARD", "CRYPTO", "DATE_RE", "EMAIL_ADDRESS",
    "HEX_COLOR", "IBAN_CODE", "IP_ADDRESS", "LOCATION", "PERSON",
    "PHONE_NUMBER", "PHONE_NUMBER_WITH_EXT", "PHONE_NUMBER_ZH",
    "PO_BOX_RE", "PRICE_RE", "TIME_RE", "URL_RE",
    "US_BANK_NUMBER", "US_SSN", "UUID",
]
SYSTEM_MSG = (
    'You are a content guard. Output exactly one JSON object and nothing else: '
    '{"is_valid": <true|false>, "violations": {<TYPE>: [[start, end], ...], ...}}. '
    'No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose. '
    'Just the JSON, terminated immediately after the closing brace. '
    'is_valid is true only when violations is an empty object {}. Each [start, end] is a half-open '
    'character span pointing into the user prompt.Strictly the character span must only identify the exact violation string. Multiple occurrences of the same TYPE produce '
    "multiple spans in that TYPE's list. Allowed TYPE keys: "
    + ", ".join(ALL_VIOLATION_TYPES)
    + ".\n\nExamples:\n\n"
    "Input: Hello there, how are you?\n"
    'Output: {"is_valid": true, "violations": {}}\n\n'
    "Input: Email me at admin@example.com\n"
    'Output: {"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]]}}\n\n'
    "Input: John lives at 192.168.1.1\n"
    'Output: {"is_valid": false, "violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}}'
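)

def guard(prompt: str) -> dict:
    """Run the guard on one prompt and return the parsed JSON verdict."""
    # Render the chat prompt with thinking disabled so the reply is JSON only.
    chat = tokenizer.apply_chat_template(
        [{"role": "system", "content": SYSTEM_MSG},
         {"role": "user", "content": prompt}],
        tokenize=False, add_generation_prompt=True, enable_thinking=False)
    inputs = tokenizer(chat, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=768, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    text = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    # Take the outermost {...} block and fail clearly if the model emitted none.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object found in model output: {text!r}")
    return json.loads(match.group(0))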
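Since the verdict comes from a generative model, it may be worth sanity-checking the parsed object before acting on it. The sketch below is one possible check, not part of the adapter: `check_guard_output` is a hypothetical helper, it treats `is_valid` as true exactly when `violations` is empty (the reading suggested by the system prompt and its examples), and the demo uses a small subset of the allowed types where real code would pass `set(ALL_VIOLATION_TYPES)`.

```python
def check_guard_output(prompt: str, result: dict, allowed_types: set) -> list:
    """Return a list of problems found in a guard verdict; empty means consistent."""
    problems = []
    is_valid = result.get("is_valid")
    violations = result.get("violations")
    if not isinstance(is_valid, bool):
        problems.append("is_valid is missing or not a boolean")
    if not isinstance(violations, dict):
        problems.append("violations is missing or not an object")
        return problems
    # Assumption: is_valid should be true exactly when violations is empty.
    if isinstance(is_valid, bool) and is_valid != (len(violations) == 0):
        problems.append("is_valid disagrees with the violations map")
    for vtype, spans in violations.items():
        if vtype not in allowed_types:
            problems.append(f"unknown violation type: {vtype}")
        if not isinstance(spans, list):
            problems.append(f"spans for {vtype} are not a list")
            continue
        for span in spans:
            well_formed = (
                isinstance(span, list)
                and len(span) == 2
                and all(isinstance(i, int) for i in span)
            )
            if not well_formed:
                problems.append(f"malformed span for {vtype}: {span!r}")
                continue
            start, end = span
            # Half-open span must be non-empty and lie inside the prompt.
            if not (0 <= start < end <= len(prompt)):
                problems.append(f"span out of range for {vtype}: {span!r}")
    return problems


# Demo with a subset of the allowed types and hand-written verdicts.
demo_types = {"EMAIL_ADDRESS", "IP_ADDRESS", "PERSON"}
demo_prompt = "John lives at 192.168.1.1"
good = {"is_valid": False, "violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}}
bad = {"is_valid": True, "violations": {"PERSON": [[0, 40]], "PASSWORD": [[5, 9]]}}

good_problems = check_guard_output(demo_prompt, good, demo_types)
bad_problems = check_guard_output(demo_prompt, bad, demo_types)
print(good_problems)
print(bad_problems)
```

The hand-written `bad` verdict trips three checks: the `is_valid` flag contradicts the non-empty map, the `PERSON` span runs past the end of the prompt, and `PASSWORD` is not an allowed type.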

Evaluation

Evaluated on 100 held-out prompts (the PII Guard Held-out Test Set): 50 containing at least one violation and 50 clean.

  • Evaluation timestamp: 2026-05-12 17:58 UTC
  • JSON parse errors: 0/100 (0.0%)

Top-level metrics

| Metric | Value |
|---|---|
| is_valid accuracy | 1.0000 |
| Violation-type-set exact match | 0.8400 |
| Binary F1 (positive = invalid) | 1.0000 |
| Binary precision | 1.0000 |
| Binary recall | 1.0000 |
| Macro F1 across violation types | 0.6831 |
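The card does not spell out how violation-type-set exact match is computed. One plausible reading, assumed in the sketch below, is the fraction of prompts whose predicted set of TYPE keys equals the reference set, ignoring the spans themselves. The function name and the four toy records are illustrative only and are not taken from the evaluation set.

```python
def type_set_exact_match(gold: list, pred: list) -> float:
    """Fraction of examples whose predicted TYPE-key set equals the gold set."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    matches = sum(
        set(g["violations"]) == set(p["violations"])
        for g, p in zip(gold, pred)
    )
    return matches / len(gold)


# Toy data (not from the held-out set): three matches out of four.
gold = [
    {"violations": {}},
    {"violations": {"EMAIL_ADDRESS": [[12, 29]]}},
    {"violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}},
    {"violations": {"UUID": [[0, 36]]}},
]
pred = [
    {"violations": {}},
    {"violations": {"EMAIL_ADDRESS": [[12, 28]]}},  # span differs, type set matches
    {"violations": {"IP_ADDRESS": [[14, 25]], "PERSON": [[0, 4]]}},
    {"violations": {"SECRET": [[0, 36]]}},  # wrong type, so no match
]

score = type_set_exact_match(gold, pred)
print(score)
```

Under this reading, the reported 0.8400 would correspond to 84 of the 100 prompts receiving exactly the right set of types.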

Confusion matrix — binary is_valid decision

Positive class = the prompt contains a violation (is_valid=False).

| | predicted invalid | predicted valid |
|---|---|---|
| actual invalid | TP = 50 | FN = 0 |
| actual valid | FP = 0 | TN = 50 |
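The binary metrics in the top-level table follow directly from these four counts. A short sketch using the standard definitions, with the zero-denominator guards added here as a precaution:

```python
# Counts from the confusion matrix (positive class = prompt contains a violation).
tp, fn, fp, tn = 50, 0, 0, 50

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f1 = (
    2 * precision * recall / (precision + recall)
    if (precision + recall)
    else 0.0
)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

With no false positives and no false negatives, all four values come out as 1.0000, matching the table.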

Per violation-type metrics

Only types that appear in either the actual or predicted labels are listed.

| Type | Support | Precision | Recall | F1 |
|---|---|---|---|---|
| EMAIL_ADDRESS | 5 | 1.000 | 1.000 | 1.000 |
| PERSON | 4 | 1.000 | 1.000 | 1.000 |
| PHONE_NUMBER | 4 | 1.000 | 1.000 | 1.000 |
| CREDIT_CARD | 3 | 1.000 | 1.000 | 1.000 |
| LOCATION | 3 | 1.000 | 0.667 | 0.800 |
| AWSKeyDetector | 2 | 1.000 | 0.500 | 0.667 |
| AzureStorageKeyDetector | 2 | 0.000 | 0.000 | 0.000 |
| GitHubTokenCustomDetector | 2 | 1.000 | 1.000 | 1.000 |
| JWTBase64Detector | 2 | 1.000 | 0.500 | 0.667 |
| JwtTokenDetector | 2 | 0.000 | 0.000 | 0.000 |
| OpenAIApiKeyDetector | 2 | 0.000 | 0.000 | 0.000 |
| PrivateKeyDetector | 2 | 1.000 | 1.000 | 1.000 |
| SlackDetector | 2 | 0.000 | 0.000 | 0.000 |
| StripeDetector | 2 | 1.000 | 0.500 | 0.667 |
| TwilioKeyDetector | 2 | 0.000 | 0.000 | 0.000 |
| DATE_RE | 2 | 1.000 | 1.000 | 1.000 |
| IP_ADDRESS | 2 | 1.000 | 1.000 | 1.000 |
| TIME_RE | 2 | 1.000 | 1.000 | 1.000 |
| URL_RE | 2 | 1.000 | 1.000 | 1.000 |
| US_SSN | 2 | 1.000 | 1.000 | 1.000 |
| BasicAuthDetector | 1 | 0.000 | 0.000 | 0.000 |
| DiscordBotTokenDetector | 1 | 0.000 | 0.000 | 0.000 |
| GCPApiKeyDetector | 1 | 1.000 | 1.000 | 1.000 |
| GitLabDetector | 1 | 1.000 | 1.000 | 1.000 |
| HuggingFaceDetector | 1 | 1.000 | 1.000 | 1.000 |
| SECRET | 1 | 0.067 | 1.000 | 0.125 |
| BTC_ADDRESS | 1 | 0.500 | 1.000 | 0.667 |
| CRYPTO | 1 | 0.000 | 0.000 | 0.000 |
| HEX_COLOR | 1 | 1.000 | 1.000 | 1.000 |
| IBAN_CODE | 1 | 1.000 | 1.000 | 1.000 |
| PHONE_NUMBER_WITH_EXT | 1 | 1.000 | 1.000 | 1.000 |
| PHONE_NUMBER_ZH | 1 | 1.000 | 1.000 | 1.000 |
| PO_BOX_RE | 1 | 1.000 | 1.000 | 1.000 |
| PRICE_RE | 1 | 1.000 | 1.000 | 1.000 |
| US_BANK_NUMBER | 1 | 1.000 | 1.000 | 1.000 |
| UUID | 1 | 0.000 | 0.000 | 0.000 |
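The macro F1 in the top-level metrics can be reproduced from this table, assuming it is the unweighted mean of the per-type F1 scores (the usual definition, and one that matches the reported figure). The dictionary below simply copies the F1 column, so the values carry the table's three-decimal rounding.

```python
# Per-type F1 values copied from the table above (rounded to three decimals).
per_type_f1 = {
    "EMAIL_ADDRESS": 1.000, "PERSON": 1.000, "PHONE_NUMBER": 1.000,
    "CREDIT_CARD": 1.000, "LOCATION": 0.800, "AWSKeyDetector": 0.667,
    "AzureStorageKeyDetector": 0.000, "GitHubTokenCustomDetector": 1.000,
    "JWTBase64Detector": 0.667, "JwtTokenDetector": 0.000,
    "OpenAIApiKeyDetector": 0.000, "PrivateKeyDetector": 1.000,
    "SlackDetector": 0.000, "StripeDetector": 0.667,
    "TwilioKeyDetector": 0.000, "DATE_RE": 1.000, "IP_ADDRESS": 1.000,
    "TIME_RE": 1.000, "URL_RE": 1.000, "US_SSN": 1.000,
    "BasicAuthDetector": 0.000, "DiscordBotTokenDetector": 0.000,
    "GCPApiKeyDetector": 1.000, "GitLabDetector": 1.000,
    "HuggingFaceDetector": 1.000, "SECRET": 0.125, "BTC_ADDRESS": 0.667,
    "CRYPTO": 0.000, "HEX_COLOR": 1.000, "IBAN_CODE": 1.000,
    "PHONE_NUMBER_WITH_EXT": 1.000, "PHONE_NUMBER_ZH": 1.000,
    "PO_BOX_RE": 1.000, "PRICE_RE": 1.000, "US_BANK_NUMBER": 1.000,
    "UUID": 0.000,
}

# Macro F1: every type counts equally, regardless of its support.
macro_f1 = sum(per_type_f1.values()) / len(per_type_f1)

# Types the model never got right on this test set.
zero_f1_types = sorted(t for t, f1 in per_type_f1.items() if f1 == 0.0)

print(f"types={len(per_type_f1)} macro_f1={macro_f1:.4f}")
print(zero_f1_types)
```

Because each type weighs the same, the nine types with an F1 of 0.000, most of them secret detectors with a support of one or two prompts, pull the macro average well below the perfect binary scores.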

Supported violation types

The model emits one or more of these TYPE keys in the violations map of its JSON output:

AWSKeyDetector, AzureStorageKeyDetector, BasicAuthDetector, DiscordBotTokenDetector, GCPApiKeyDetector, GitHubTokenCustomDetector, GitLabDetector, HuggingFaceDetector, JWTBase64Detector, JwtTokenDetector, OpenAIApiKeyDetector, PrivateKeyDetector, SECRET, SlackDetector, StripeDetector, TwilioKeyDetector, BTC_ADDRESS, CREDIT_CARD, CRYPTO, DATE_RE, EMAIL_ADDRESS, HEX_COLOR, IBAN_CODE, IP_ADDRESS, LOCATION, PERSON, PHONE_NUMBER, PHONE_NUMBER_WITH_EXT, PHONE_NUMBER_ZH, PO_BOX_RE, PRICE_RE, TIME_RE, URL_RE, US_BANK_NUMBER, US_SSN, UUID
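One possible downstream use of these TYPE keys is redaction: replacing each flagged span with a placeholder named after its type. The sketch below is an illustration under stated assumptions rather than a feature of the adapter. The `redact` helper and the `[TYPE]` placeholder format are hypothetical, and overlapping spans are not handled.

```python
def redact(prompt: str, violations: dict) -> str:
    """Replace every flagged [start, end) span with a [TYPE] placeholder."""
    # Flatten to (start, end, type) and apply right to left, so that
    # replacing one span never shifts the offsets of spans still to come.
    flat = sorted(
        ((start, end, vtype)
         for vtype, spans in violations.items()
         for start, end in spans),
        reverse=True,
    )
    redacted = prompt
    for start, end, vtype in flat:
        redacted = redacted[:start] + f"[{vtype}]" + redacted[end:]
    return redacted


# Example taken from the system prompt's few-shot demonstrations.
prompt = "John lives at 192.168.1.1"
violations = {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}

redacted = redact(prompt, violations)
print(redacted)
```

Working from the last span to the first is the key design choice here: placeholders differ in length from the text they replace, so a left-to-right pass would invalidate every later offset.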