PII-Qwen3.5-2B-LoRA-8bit
LoRA adapter for Qwen/Qwen3.5-2B that flags prompts containing PII, secrets, sensitive entities, and any other content detected by the LLM Guard Secrets, Sensitive, and Anonymize scanners.
The model is fine-tuned to emit a strict JSON object describing every violation found in the user prompt:
{"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]], "IP_ADDRESS": [[40, 51]]}}
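To illustrate how the spans can be consumed, the sketch below slices each half-open `[start, end)` span out of the prompt and masks it with a placeholder. The helper names `extract_violations` and `redact`, the `<TYPE>` placeholder format, and the sample prompt are illustrative assumptions, not part of the adapter; the prompt was written so that its offsets line up with the JSON above. Overlapping spans are not handled.

```python
import json

def extract_violations(prompt: str, result: dict) -> dict:
    """Map each violation type to the substrings its spans point at."""
    return {
        vtype: [prompt[start:end] for start, end in spans]
        for vtype, spans in result["violations"].items()
    }

def redact(prompt: str, result: dict) -> str:
    """Replace every flagged span with a <TYPE> placeholder."""
    flat = [
        (start, end, vtype)
        for vtype, spans in result["violations"].items()
        for start, end in spans
    ]
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, vtype in sorted(flat, reverse=True):
        prompt = prompt[:start] + f"<{vtype}>" + prompt[end:]
    return prompt

# Hypothetical prompt matching the spans in the example output above.
prompt = "Email me at admin@example.com from host 192.168.1.1"
result = json.loads(
    '{"is_valid": false, "violations": '
    '{"EMAIL_ADDRESS": [[12, 29]], "IP_ADDRESS": [[40, 51]]}}'
)
print(extract_violations(prompt, result))
print(redact(prompt, result))
```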
Quick start
import json
import re
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE = "Qwen/Qwen3.5-2B"
ADAPTER = "Accuknoxtechnologies/PII-Qwen3.5-2B-LoRA-8bit"

tokenizer = AutoTokenizer.from_pretrained(BASE, trust_remote_code=True)
# Load the base model in 8-bit via bitsandbytes, then attach the LoRA adapter.
bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(BASE, quantization_config=bnb, device_map="auto", trust_remote_code=True)
model = PeftModel.from_pretrained(model, ADAPTER)
model.eval()
ALL_VIOLATION_TYPES = [
    "AWSKeyDetector", "AzureStorageKeyDetector", "BasicAuthDetector",
    "DiscordBotTokenDetector", "GCPApiKeyDetector", "GitHubTokenCustomDetector",
    "GitLabDetector", "HuggingFaceDetector", "JWTBase64Detector", "JwtTokenDetector",
    "OpenAIApiKeyDetector", "PrivateKeyDetector", "SECRET", "SlackDetector",
    "StripeDetector", "TwilioKeyDetector",
    "BTC_ADDRESS", "CREDIT_CARD", "CRYPTO", "DATE_RE", "EMAIL_ADDRESS",
    "HEX_COLOR", "IBAN_CODE", "IP_ADDRESS", "LOCATION", "PERSON",
    "PHONE_NUMBER", "PHONE_NUMBER_WITH_EXT", "PHONE_NUMBER_ZH",
    "PO_BOX_RE", "PRICE_RE", "TIME_RE", "URL_RE",
    "US_BANK_NUMBER", "US_SSN", "UUID",
]
SYSTEM_MSG = (
    'You are a content guard. Output exactly one JSON object and nothing else: '
    '{"is_valid": <true|false>, "violations": {<TYPE>: [[start, end], ...], ...}}. '
    'No preamble. No explanation. No <think> tags. No markdown code fences. No trailing prose. '
    'Just the JSON, terminated immediately after the closing brace. '
    'is_valid is true only when violations is an empty object {}. Each [start, end] is a half-open '
    'character span pointing into the user prompt.Strictly the character span must only identify the exact violation string. Multiple occurrences of the same TYPE produce '
    "multiple spans in that TYPE's list. Allowed TYPE keys: "
    + ", ".join(ALL_VIOLATION_TYPES)
    + ".\n\nExamples:\n\n"
    "Input: Hello there, how are you?\n"
    'Output: {"is_valid": true, "violations": {}}\n\n'
    "Input: Email me at admin@example.com\n"
    'Output: {"is_valid": false, "violations": {"EMAIL_ADDRESS": [[12, 29]]}}\n\n'
    "Input: John lives at 192.168.1.1\n"
    'Output: {"is_valid": false, "violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}}'
)
def guard(prompt: str) -> dict:
    """Run the guard on one prompt and return the parsed JSON verdict."""
    chat = tokenizer.apply_chat_template(
        [{"role": "system", "content": SYSTEM_MSG},
         {"role": "user", "content": prompt}],
        tokenize=False, add_generation_prompt=True, enable_thinking=False)
    inputs = tokenizer(chat, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=768, do_sample=False)
    # Decode only the newly generated tokens, not the echoed prompt.
    text = tokenizer.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    match = re.search(r'\{.*\}', text, re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object in model output: {text!r}")
    return json.loads(match.group(0))
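The greedy `\{.*\}` regex in `guard` captures everything from the first `{` to the last `}`, so any stray braces after the JSON would make `json.loads` fail. One possible hardening, sketched here under the assumption that the first decodable object in the text is the verdict, uses the standard library's `json.JSONDecoder.raw_decode`. The function name `first_json_object` is hypothetical and not part of the adapter.

```python
import json
from typing import Optional

def first_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object that decodes cleanly, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores what follows.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None

# Trailing braces after the verdict no longer break parsing.
raw = '{"is_valid": true, "violations": {}} note: {}'
print(first_json_object(raw))
```

Returning `None` instead of raising leaves the caller free to decide whether an unparseable response should block the prompt or be retried.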
Evaluation
Evaluated on 100 held-out prompts.
- Evaluation timestamp: 2026-05-12 17:58 UTC
- JSON parse errors: 0/100 (0.0%)
Top-level metrics
| Metric | Value |
|---|---|
| is_valid accuracy | 1.0000 |
| Violation-type-set exact match | 0.8400 |
| Binary F1 (positive = invalid) | 1.0000 |
| Binary precision | 1.0000 |
| Binary recall | 1.0000 |
| Macro F1 across violation types | 0.6831 |
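The card does not spell out how the violation-type-set exact match is computed. A plausible reading, assumed here, is the fraction of prompts whose predicted set of TYPE keys equals the labelled set, ignoring the spans themselves. The toy records below are invented for illustration and are not drawn from the evaluation set.

```python
def type_set_exact_match(gold: list, pred: list) -> float:
    """Fraction of prompts whose predicted TYPE set equals the labelled TYPE set."""
    hits = sum(
        set(g["violations"]) == set(p["violations"])
        for g, p in zip(gold, pred)
    )
    return hits / len(gold)

# Hypothetical labels and predictions for four prompts.
gold = [
    {"violations": {}},
    {"violations": {"EMAIL_ADDRESS": [[12, 29]]}},
    {"violations": {"PERSON": [[0, 4]], "IP_ADDRESS": [[14, 25]]}},
    {"violations": {"US_SSN": [[7, 18]]}},
]
pred = [
    {"violations": {}},
    {"violations": {"EMAIL_ADDRESS": [[12, 29]]}},
    {"violations": {"PERSON": [[0, 4]]}},  # missed IP_ADDRESS, so the sets differ
    {"violations": {"US_SSN": [[7, 18]]}},
]
score = type_set_exact_match(gold, pred)
print(score)  # 3 of the 4 type sets match
```

Under this reading, a prompt can count as a miss for exact match while still being a correct binary `is_valid` decision, which would explain why the two metrics differ.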
Confusion matrix — binary is_valid decision
Positive class = the prompt contains a violation (is_valid=False).
| | predicted invalid | predicted valid |
|---|---|---|
| actual invalid | TP = 50 | FN = 0 |
| actual valid | FP = 0 | TN = 50 |
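The binary metrics in the top-level table follow directly from these four counts. A short check using the standard definitions of accuracy, precision, recall, and F1:

```python
# Counts taken from the confusion matrix above (positive = invalid prompt).
tp, fn, fp, tn = 50, 0, 0, 50

accuracy = (tp + tn) / (tp + fn + fp + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```

With no false positives or false negatives, all four values come out to 1.0, matching the reported figures.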
Per violation-type metrics
Only types that appear in either the actual or predicted labels are listed.
| Type | support | precision | recall | F1 |
|---|---|---|---|---|
| EMAIL_ADDRESS | 5 | 1.000 | 1.000 | 1.000 |
| PERSON | 4 | 1.000 | 1.000 | 1.000 |
| PHONE_NUMBER | 4 | 1.000 | 1.000 | 1.000 |
| CREDIT_CARD | 3 | 1.000 | 1.000 | 1.000 |
| LOCATION | 3 | 1.000 | 0.667 | 0.800 |
| AWSKeyDetector | 2 | 1.000 | 0.500 | 0.667 |
| AzureStorageKeyDetector | 2 | 0.000 | 0.000 | 0.000 |
| GitHubTokenCustomDetector | 2 | 1.000 | 1.000 | 1.000 |
| JWTBase64Detector | 2 | 1.000 | 0.500 | 0.667 |
| JwtTokenDetector | 2 | 0.000 | 0.000 | 0.000 |
| OpenAIApiKeyDetector | 2 | 0.000 | 0.000 | 0.000 |
| PrivateKeyDetector | 2 | 1.000 | 1.000 | 1.000 |
| SlackDetector | 2 | 0.000 | 0.000 | 0.000 |
| StripeDetector | 2 | 1.000 | 0.500 | 0.667 |
| TwilioKeyDetector | 2 | 0.000 | 0.000 | 0.000 |
| DATE_RE | 2 | 1.000 | 1.000 | 1.000 |
| IP_ADDRESS | 2 | 1.000 | 1.000 | 1.000 |
| TIME_RE | 2 | 1.000 | 1.000 | 1.000 |
| URL_RE | 2 | 1.000 | 1.000 | 1.000 |
| US_SSN | 2 | 1.000 | 1.000 | 1.000 |
| BasicAuthDetector | 1 | 0.000 | 0.000 | 0.000 |
| DiscordBotTokenDetector | 1 | 0.000 | 0.000 | 0.000 |
| GCPApiKeyDetector | 1 | 1.000 | 1.000 | 1.000 |
| GitLabDetector | 1 | 1.000 | 1.000 | 1.000 |
| HuggingFaceDetector | 1 | 1.000 | 1.000 | 1.000 |
| SECRET | 1 | 0.067 | 1.000 | 0.125 |
| BTC_ADDRESS | 1 | 0.500 | 1.000 | 0.667 |
| CRYPTO | 1 | 0.000 | 0.000 | 0.000 |
| HEX_COLOR | 1 | 1.000 | 1.000 | 1.000 |
| IBAN_CODE | 1 | 1.000 | 1.000 | 1.000 |
| PHONE_NUMBER_WITH_EXT | 1 | 1.000 | 1.000 | 1.000 |
| PHONE_NUMBER_ZH | 1 | 1.000 | 1.000 | 1.000 |
| PO_BOX_RE | 1 | 1.000 | 1.000 | 1.000 |
| PRICE_RE | 1 | 1.000 | 1.000 | 1.000 |
| US_BANK_NUMBER | 1 | 1.000 | 1.000 | 1.000 |
| UUID | 1 | 0.000 | 0.000 | 0.000 |
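The macro F1 reported in the top-level metrics appears to be the unweighted mean of the F1 column: averaging the 36 listed types reproduces 0.6831 up to rounding. The sketch below assumes that definition and recomputes each F1 from the rounded precision and recall in the table rather than from the original counts, which the card does not publish.

```python
# (precision, recall) per type, copied from the per-type table.
ROWS = {
    "EMAIL_ADDRESS": (1.0, 1.0), "PERSON": (1.0, 1.0), "PHONE_NUMBER": (1.0, 1.0),
    "CREDIT_CARD": (1.0, 1.0), "LOCATION": (1.0, 0.667), "AWSKeyDetector": (1.0, 0.5),
    "AzureStorageKeyDetector": (0.0, 0.0), "GitHubTokenCustomDetector": (1.0, 1.0),
    "JWTBase64Detector": (1.0, 0.5), "JwtTokenDetector": (0.0, 0.0),
    "OpenAIApiKeyDetector": (0.0, 0.0), "PrivateKeyDetector": (1.0, 1.0),
    "SlackDetector": (0.0, 0.0), "StripeDetector": (1.0, 0.5),
    "TwilioKeyDetector": (0.0, 0.0), "DATE_RE": (1.0, 1.0), "IP_ADDRESS": (1.0, 1.0),
    "TIME_RE": (1.0, 1.0), "URL_RE": (1.0, 1.0), "US_SSN": (1.0, 1.0),
    "BasicAuthDetector": (0.0, 0.0), "DiscordBotTokenDetector": (0.0, 0.0),
    "GCPApiKeyDetector": (1.0, 1.0), "GitLabDetector": (1.0, 1.0),
    "HuggingFaceDetector": (1.0, 1.0), "SECRET": (0.067, 1.0),
    "BTC_ADDRESS": (0.5, 1.0), "CRYPTO": (0.0, 0.0), "HEX_COLOR": (1.0, 1.0),
    "IBAN_CODE": (1.0, 1.0), "PHONE_NUMBER_WITH_EXT": (1.0, 1.0),
    "PHONE_NUMBER_ZH": (1.0, 1.0), "PO_BOX_RE": (1.0, 1.0), "PRICE_RE": (1.0, 1.0),
    "US_BANK_NUMBER": (1.0, 1.0), "UUID": (0.0, 0.0),
}

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, defined as 0 when both are 0."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Macro average: every type counts equally, regardless of its support.
macro_f1 = sum(f1(p, r) for p, r in ROWS.values()) / len(ROWS)
print(len(ROWS), round(macro_f1, 4))
```

Because each type is weighted equally, the nine types with an F1 of zero, most of them secret detectors with a support of 1 or 2, pull the macro score well below the binary F1.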
Supported violation types
The model may emit any of these TYPE keys in the violations map of its JSON output (the map is empty when the prompt is valid):
AWSKeyDetector, AzureStorageKeyDetector, BasicAuthDetector, DiscordBotTokenDetector, GCPApiKeyDetector, GitHubTokenCustomDetector, GitLabDetector, HuggingFaceDetector, JWTBase64Detector, JwtTokenDetector, OpenAIApiKeyDetector, PrivateKeyDetector, SECRET, SlackDetector, StripeDetector, TwilioKeyDetector, BTC_ADDRESS, CREDIT_CARD, CRYPTO, DATE_RE, EMAIL_ADDRESS, HEX_COLOR, IBAN_CODE, IP_ADDRESS, LOCATION, PERSON, PHONE_NUMBER, PHONE_NUMBER_WITH_EXT, PHONE_NUMBER_ZH, PO_BOX_RE, PRICE_RE, TIME_RE, URL_RE, US_BANK_NUMBER, US_SSN, UUID
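Because the TYPE keys form a closed set, a caller can sanity-check a parsed response before acting on it. The sketch below is one hypothetical validator (`check_result` is not part of the adapter) that applies the rules stated in the system message: known keys only, half-open spans that fall inside the prompt, and `is_valid` true only when `violations` is empty.

```python
ALLOWED_TYPES = set("""
AWSKeyDetector AzureStorageKeyDetector BasicAuthDetector DiscordBotTokenDetector
GCPApiKeyDetector GitHubTokenCustomDetector GitLabDetector HuggingFaceDetector
JWTBase64Detector JwtTokenDetector OpenAIApiKeyDetector PrivateKeyDetector SECRET
SlackDetector StripeDetector TwilioKeyDetector BTC_ADDRESS CREDIT_CARD CRYPTO
DATE_RE EMAIL_ADDRESS HEX_COLOR IBAN_CODE IP_ADDRESS LOCATION PERSON PHONE_NUMBER
PHONE_NUMBER_WITH_EXT PHONE_NUMBER_ZH PO_BOX_RE PRICE_RE TIME_RE URL_RE
US_BANK_NUMBER US_SSN UUID
""".split())

def check_result(prompt: str, result: dict) -> list:
    """Return a list of problems; an empty list means the result looks sane."""
    problems = []
    violations = result.get("violations", {})
    if result.get("is_valid") and violations:
        problems.append("is_valid is true but violations is not empty")
    for vtype, spans in violations.items():
        if vtype not in ALLOWED_TYPES:
            problems.append(f"unknown type: {vtype}")
        for start, end in spans:
            # Half-open span: start is inclusive, end is exclusive.
            if not (0 <= start < end <= len(prompt)):
                problems.append(f"bad span for {vtype}: [{start}, {end}]")
    return problems

prompt = "Email me at admin@example.com"
good = {"is_valid": False, "violations": {"EMAIL_ADDRESS": [[12, 29]]}}
bad = {"is_valid": True, "violations": {"EMAIL": [[12, 40]]}}
print(check_result(prompt, good))
print(check_result(prompt, bad))
```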
Evaluation results
- is_valid accuracy on PII Guard Held-out Test Set (self-reported): 1.000
- violation-type-set exact match on PII Guard Held-out Test Set (self-reported): 0.840
- binary F1 (positive=invalid) on PII Guard Held-out Test Set (self-reported): 1.000
- macro F1 over violation types on PII Guard Held-out Test Set (self-reported): 0.683
- binary precision (positive=invalid) on PII Guard Held-out Test Set (self-reported): 1.000
- binary recall (positive=invalid) on PII Guard Held-out Test Set (self-reported): 1.000