Qwen PII Guard (LoRA adapter)

Fine-tuned from Qwen/Qwen3.5-2B to detect personally identifiable information (PII) in user prompts. The model emits a single JSON object listing the values found in each of 14 categories.

Output schema:

{"is_valid": true,
 "category": {"Name": ["John Doe"], "Email": ["john@example.com"]}}

is_valid is true when the prompt contains PII. When the prompt contains no PII, is_valid is false and category is {}.
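A minimal sketch of how a caller might parse and sanity-check this output, assuming the model returns the JSON object as plain text. The helper name parse_guard_output is hypothetical and not part of the adapter.

```python
import json


def parse_guard_output(text: str) -> dict:
    """Parse the guard's JSON reply and check it against the schema above."""
    obj = json.loads(text)  # raises json.JSONDecodeError (a ValueError) if malformed
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")

    is_valid = obj.get("is_valid")
    category = obj.get("category")
    if not isinstance(is_valid, bool):
        raise ValueError("is_valid must be a boolean")
    if not isinstance(category, dict):
        raise ValueError("category must be an object")

    # Every category maps to a list of the strings found in the prompt.
    for key, values in category.items():
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"category[{key!r}] must be a list of strings")

    # Per the card: a clean prompt yields is_valid = false and an empty category.
    if not is_valid and category:
        raise ValueError("category must be empty when is_valid is false")
    return obj
```

Rejecting a reply that violates the schema, rather than guessing at its meaning, keeps a malformed generation from being silently treated as "no PII".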

Categories

name, email, phone_number, address, date, national_id, passport_number, drivers_license, tax_id, card_number, bank_account, credentials, ip_address, username

Evaluation (transformers)

  • test rows: 200 (held-out, from test_dataset_pii.csv)
  • is_valid accuracy: 1.0000
  • category key-set accuracy: 0.9350
  • category value-set accuracy: 0.8300
  • binary F1 (is_valid): 1.0000 (P=1.000 R=1.000)
  • macro F1 over categories (key-presence): 0.9791
  • macro F1 over categories (value-set): 0.9529
  • parse errors: 0/200
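The card does not define the three row-level accuracies, so the following is a sketch under assumed definitions: is_valid accuracy counts rows whose flag matches, key-set accuracy counts rows whose set of category names matches exactly, and value-set accuracy additionally requires every category's set of strings to match. The function name row_accuracies is hypothetical.

```python
def row_accuracies(gold: list[dict], pred: list[dict]) -> dict:
    """Row-level accuracies over parallel lists of gold and predicted outputs.

    Assumed definitions (not stated in the card):
      - is_valid: the boolean flag matches
      - key_set: the set of emitted category names matches exactly
      - value_set: key sets match and, per category, the sets of strings match
    """
    n = len(gold)
    flag_hits = key_hits = value_hits = 0
    for g, p in zip(gold, pred):
        flag_hits += g["is_valid"] == p["is_valid"]
        g_cat, p_cat = g["category"], p["category"]
        same_keys = set(g_cat) == set(p_cat)
        key_hits += same_keys
        value_hits += same_keys and all(set(g_cat[k]) == set(p_cat[k]) for k in g_cat)
    return {
        "is_valid": flag_hits / n,
        "key_set": key_hits / n,
        "value_set": value_hits / n,
    }
```

Under these definitions value-set accuracy can never exceed key-set accuracy, which is consistent with the reported 0.8300 and 0.9350.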

Binary confusion matrix (positive = "contains PII"):

              predicted PII   predicted clean
actual PII    177             0
actual clean  0               23
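As a worked check, the binary scores reported above follow directly from these four counts. The helper binary_metrics is a hypothetical name used only for illustration.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1, and accuracy from a 2x2 confusion matrix."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts from the matrix above (positive = "contains PII").
metrics = binary_metrics(tp=177, fp=0, fn=0, tn=23)
print(metrics)  # every value is 1.0
```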

Per-category KEY-presence (did the model emit this category at all?):

Category          Support   Precision   Recall   F1
address           79        0.987       0.987    0.987
bank_account      12        1.000       1.000    1.000
card_number       25        1.000       1.000    1.000
credentials       10        1.000       1.000    1.000
date              95        1.000       1.000    1.000
drivers_license   27        0.957       0.815    0.880
email             76        0.987       1.000    0.993
ip_address        9         1.000       1.000    1.000
name              107       1.000       0.991    0.995
national_id       52        0.911       0.981    0.944
passport_number   21        0.955       1.000    0.977
phone_number      63        1.000       0.984    0.992
tax_id            24        0.920       0.958    0.939
username          9         1.000       1.000    1.000
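A sketch of how per-category key-presence scores like these could be computed, assuming a category counts as a true positive on a row when it appears in both the gold and the predicted object. The function name key_presence_scores is hypothetical, and the card does not publish its evaluation script.

```python
from collections import Counter


def key_presence_scores(gold: list[dict], pred: list[dict]) -> dict:
    """Per-category precision, recall, and F1 on whether a key was emitted."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        g_keys, p_keys = set(g["category"]), set(p["category"])
        for k in g_keys & p_keys:
            tp[k] += 1  # category expected and emitted
        for k in p_keys - g_keys:
            fp[k] += 1  # category emitted but not expected
        for k in g_keys - p_keys:
            fn[k] += 1  # category expected but missing

    scores = {}
    for k in sorted(set(tp) | set(fp) | set(fn)):
        precision = tp[k] / (tp[k] + fp[k]) if tp[k] + fp[k] else 0.0
        recall = tp[k] / (tp[k] + fn[k]) if tp[k] + fn[k] else 0.0
        f1 = 2 * tp[k] / (2 * tp[k] + fp[k] + fn[k])
        scores[k] = {
            "support": tp[k] + fn[k],  # rows where the category is in the gold label
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return scores
```

For example, the drivers_license row is consistent with 22 true positives, 1 false positive, and 5 false negatives (22/23 ≈ 0.957, 22/27 ≈ 0.815, F1 = 44/50 = 0.880). These counts are inferred from the rounded figures and are not reported in the card.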

Per-category VALUE-set (did the exact strings match within the category?):

Category          Support (string-spans)   Precision   Recall   F1
address           79                       0.924       0.924    0.924
bank_account      12                       1.000       1.000    1.000
card_number       26                       1.000       1.000    1.000
credentials       10                       1.000       1.000    1.000
date              123                      1.000       1.000    1.000
drivers_license   27                       0.957       0.815    0.880
email             82                       0.988       1.000    0.994
ip_address        9                        1.000       1.000    1.000
name              242                      0.863       0.835    0.849
national_id       59                       0.869       0.898    0.883
passport_number   21                       0.955       1.000    0.977
phone_number      65                       0.984       0.969    0.977
tax_id            24                       0.840       0.875    0.857
username          9                        1.000       1.000    1.000
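The two macro F1 figures in the summary can be reproduced as the unweighted mean of the F1 columns in the per-category tables. The snippet below is a worked check using the rounded values printed in this card.

```python
# F1 columns copied from the key-presence and value-set tables in this card.
key_f1 = {
    "address": 0.987, "bank_account": 1.000, "card_number": 1.000,
    "credentials": 1.000, "date": 1.000, "drivers_license": 0.880,
    "email": 0.993, "ip_address": 1.000, "name": 0.995,
    "national_id": 0.944, "passport_number": 0.977, "phone_number": 0.992,
    "tax_id": 0.939, "username": 1.000,
}
value_f1 = {
    "address": 0.924, "bank_account": 1.000, "card_number": 1.000,
    "credentials": 1.000, "date": 1.000, "drivers_license": 0.880,
    "email": 0.994, "ip_address": 1.000, "name": 0.849,
    "national_id": 0.883, "passport_number": 0.977, "phone_number": 0.977,
    "tax_id": 0.857, "username": 1.000,
}


def macro_f1(per_category_f1: dict) -> float:
    """Unweighted mean of per-category F1 (every category counts equally)."""
    return sum(per_category_f1.values()) / len(per_category_f1)


print(f"{macro_f1(key_f1):.4f}")    # 0.9791
print(f"{macro_f1(value_f1):.4f}")  # 0.9529
```

Because the mean is unweighted, low-support categories such as username (9 rows) influence the macro score as much as name (107 rows).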

Latency (transformers, single-prompt, greedy decoding):

mean     median   p95      max
3.15 s   2.77 s   6.45 s   9.82 s

Quick start

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and tokenizer, then attach the LoRA adapter on top.
base_id = "Qwen/Qwen3.5-2B"
tok = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
model = PeftModel.from_pretrained(base, "Accuknoxtechnologies/PII-Qwen3.5-2B-adapter-v8")
model.eval()  # inference only

Evaluation: vLLM serving (merged model, text-only)

The same 200 held-out prompts, served through vLLM 0.21.0 instead of the transformers .generate() loop. Settings: greedy decoding, dtype bf16, enable_prefix_caching=True, enable_chunked_prefill=True. These numbers reflect accuracy and latency under production-style serving.
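A hedged sketch of how those settings might map onto vLLM's offline LLM API. The model path is a placeholder, max_tokens=256 is an assumed cap, and the function names are hypothetical. The card does not say how the adapter was merged; PEFT's merge_and_unload() is one common way to produce a merged checkpoint.

```python
def build_engine_kwargs(model_path: str) -> dict:
    """Engine settings named in the card, as keyword arguments for vllm.LLM."""
    return {
        "model": model_path,  # placeholder: directory of the merged model
        "dtype": "bfloat16",
        "enable_prefix_caching": True,
        "enable_chunked_prefill": True,
    }


def load_engine(model_path: str):
    """Create the engine plus greedy sampling params (needs vLLM and a GPU)."""
    from vllm import LLM, SamplingParams  # imported lazily; heavy dependency

    llm = LLM(**build_engine_kwargs(model_path))
    # temperature=0.0 selects greedy decoding; max_tokens is an assumption.
    greedy = SamplingParams(temperature=0.0, max_tokens=256)
    return llm, greedy


# Usage (not run here):
#   llm, greedy = load_engine("path/to/merged-model")
#   outputs = llm.generate(prompts, greedy)
```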

  • JSON parse errors: 0/200 (0.0%)

Accuracy (vLLM)

Metric                                Value
is_valid accuracy                     1.0000
category key-set accuracy             0.9350
category value-set accuracy           0.8300
Binary F1 (positive = contains PII)   1.0000
Binary precision                      1.0000
Binary recall                         1.0000
Macro F1 (key-presence)               0.9791
Macro F1 (value-set)                  0.9529

Confusion matrix: binary is_valid (vLLM)

              predicted PII   predicted clean
actual PII    TP = 177        FN = 0
actual clean  FP = 0          TN = 23

Per-category key-presence (vLLM)

Category          Support   Precision   Recall   F1
address           79        0.987       0.987    0.987
bank_account      12        1.000       1.000    1.000
card_number       25        1.000       1.000    1.000
credentials       10        1.000       1.000    1.000
date              95        1.000       1.000    1.000
drivers_license   27        0.957       0.815    0.880
email             76        0.987       1.000    0.993
ip_address        9         1.000       1.000    1.000
name              107       1.000       0.991    0.995
national_id       52        0.911       0.981    0.944
passport_number   21        0.955       1.000    0.977
phone_number      63        1.000       0.984    0.992
tax_id            24        0.920       0.958    0.939
username          9         1.000       1.000    1.000

vLLM inference latency (single-stream, batch = 1)

Stat     ms / prompt
Mean     576.0
Median   511.6
p95      1151.7
p99      1440.7
Max      3209.3

Share of prompts answered in under 1 s: 89.0%
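A minimal sketch of how such a summary could be derived from per-prompt timings in milliseconds. The card does not state which percentile method it used, so nearest-rank is an assumption here, and latency_summary is a hypothetical helper name.

```python
import math
import statistics


def latency_summary(ms: list[float]) -> dict:
    """Summarize per-prompt latencies given in milliseconds."""
    ordered = sorted(ms)
    n = len(ordered)

    def percentile(q: int) -> float:
        # Nearest-rank percentile: smallest value with at least q% of samples at or below it.
        rank = max(1, math.ceil(q * n / 100))
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": percentile(95),
        "p99": percentile(99),
        "max": ordered[-1],
        "under_1s": sum(x < 1000 for x in ordered) / n,  # fraction, not percent
    }
```

With only 200 samples, p99 rests on the two or three slowest prompts, so it is best read as a rough indicator.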

vLLM throughput (single batched submit)

  • Prompts/sec: 27.73
  • Output tokens/sec: 1569.0
  • Input tokens/sec: 35596.5
  • Batched wall time for all 200 prompts: 7.21 s
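A back-of-the-envelope check relating these throughput figures to each other. The per-prompt token counts are derived from the rounded numbers above and are approximations, not values reported in the card.

```python
n_prompts = 200
wall_s = 7.21  # batched wall time, rounded as reported

# Prompts per second implied by the wall time (the card reports 27.73;
# the small gap comes from rounding of the wall time).
prompts_per_sec = n_prompts / wall_s

# Approximate average tokens per prompt, from tokens/sec multiplied by wall time.
avg_output_tokens = 1569.0 * wall_s / n_prompts
avg_input_tokens = 35596.5 * wall_s / n_prompts

# Speed-up of batched serving over single-stream serving (mean 576.0 ms per prompt).
single_stream_per_sec = 1000 / 576.0
speedup = 27.73 / single_stream_per_sec

print(f"{prompts_per_sec:.2f} prompts/s")
print(f"~{avg_output_tokens:.0f} output and ~{avg_input_tokens:.0f} input tokens per prompt")
print(f"~{speedup:.0f}x batched vs single-stream")
```

On these figures, input tokens outnumber output tokens by more than twenty to one, which is the kind of workload where prefix caching and chunked prefill are most relevant.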

Card generated at 2026-05-31 07:39 UTC. Adapter weights: Accuknoxtechnologies/PII-Qwen3.5-2B-adapter-v8.
