promptpurify model card

Tiny prompt-injection detector. ~14 MB. Runs on CPU. Built from scratch by SecureLayer7.

Intended use

Single-turn classification of untrusted text into benign vs prompt-injection. Sits between user input (or a retrieved RAG chunk, or a tool output) and your LLM call. Outputs a probability score; you decide the threshold and the policy.

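import { createL5eRunner } from "promptpurify/l5";
// Create the runner; inference happens in-process.
const guard = await createL5eRunner();
// Inside your request handler: score is the probability that the text is a prompt injection.
const score = await guard.score(userMessage);
// 0.95 is an example threshold; choose your own.
if (score >= 0.95) return refusal();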

Full integration patterns: docs/QUICKSTART.md.
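
Because the threshold and the policy are left to the operator, it can help to keep that decision in one small function. The following is a minimal sketch of such a policy layer, not part of the promptpurify SDK: the names Action, Thresholds, and decide are hypothetical, and the 0.5 review threshold is an arbitrary illustration (only 0.95 appears in the snippet above).

```typescript
// Hypothetical policy layer; none of these names come from the promptpurify SDK.
type Action = "allow" | "review" | "block";

interface Thresholds {
  review: number; // scores at or above this are flagged for review
  block: number;  // scores at or above this are refused outright
}

// Map an injection probability in [0, 1] to an action.
function decide(score: number, t: Thresholds): Action {
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    // Fail closed on malformed scores rather than letting them through.
    return "block";
  }
  if (score >= t.block) return "block";
  if (score >= t.review) return "review";
  return "allow";
}

// Example values only: 0.95 mirrors the snippet above, 0.5 is an assumption.
const thresholds: Thresholds = { review: 0.5, block: 0.95 };
```

Keeping the mapping separate from the scoring call makes it possible to apply different thresholds to user input, RAG chunks, and tool outputs without touching the model code.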

At a glance

Type           ONNX transformer classifier
Size on disk   ~14 MB (INT8)
Inference      CPU, single-digit ms
Runtime        onnxruntime-node (optional peer dependency)
Network        None. In-process.

Training

Built from scratch by SecureLayer7 on curated internal corpora.

Evaluation

Benchmarked against public datasets and OSS baselines. Comparison and methodology: docs/BENCHMARKS.md. Reproducibility: docs/REPRODUCE.md. The bench script, scripts/bench.mjs, re-scores the shipped public eval slice with this exact model artifact.

Out of scope

  • Multi-turn context: the model scores single turns only; pair it with conversation-level monitoring.
  • Content moderation (toxicity, hate, CSAM, self-harm): pair with a content classifier.
  • Authentication and tool-scope enforcement are application responsibilities, not the model's.

See docs/HONEST-LIMITS.md.

Bias

The model performs best on English text. Operators serving multilingual traffic should calibrate the threshold per language. The model has no access to user identity, account state, or conversation history.
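
One way to calibrate per language is to score a set of known-benign samples in each language and pick the threshold that keeps the false-positive rate under a target. The sketch below assumes you have such samples; calibrateThreshold, benignByLanguage, and the scores themselves are hypothetical and are not measurements of this model. Note that it flags only when the score is strictly greater than the threshold, so that tied scores cannot push the rate over the target.

```typescript
// Pick a threshold from known-benign scores so that at most `targetFpr`
// of them would be flagged. This sketch flags when score > threshold
// (strictly), which keeps the bound valid when scores tie.
function calibrateThreshold(benignScores: number[], targetFpr: number): number {
  if (benignScores.length === 0) {
    throw new Error("need at least one benign score");
  }
  if (!(targetFpr >= 0 && targetFpr < 1)) {
    throw new Error("targetFpr must be in [0, 1)");
  }
  const sorted = [...benignScores].sort((a, b) => b - a); // highest first
  // Number of benign samples we can afford to flag.
  const allowed = Math.floor(targetFpr * sorted.length);
  // Only samples ranked before index `allowed` can exceed this value.
  return sorted[allowed];
}

// Made-up benign scores per language, for illustration only.
const benignByLanguage: Record<string, number[]> = {
  en: [0.01, 0.02, 0.05, 0.2],
  de: [0.05, 0.1, 0.3, 0.9],
};

const thresholdByLanguage: Record<string, number> = {};
for (const [lang, scores] of Object.entries(benignByLanguage)) {
  thresholdByLanguage[lang] = calibrateThreshold(scores, 0.25);
}
```

With these made-up inputs and a 25% target, the English threshold comes out at 0.05 and the German one at 0.3. A real calibration would use far more samples and a much lower target rate.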

License

MIT for both the SDK and the model weights.

Public datasets we evaluate against (and the OSS baseline models we compare to) carry their own upstream licenses; see training/CORPUS_LICENSES.json.

Integrity verification

Every model artifact is checksummed. Verify before extracting:

sha256sum -c models/l5e/SHA256SUMS

The release tarball is additionally cosign-signed with keyless Sigstore.
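
For deployments that want to repeat the checksum check from Node at load time rather than in a shell, the same comparison can be sketched as follows. This is an illustration under assumptions, not code shipped with promptpurify: parseSha256Sums, sha256Hex, and verifyArtifact are hypothetical helpers, and the sketch only assumes the standard sha256sum line format (64 hex characters, a space, a space or asterisk, then the path).

```typescript
import { createHash } from "node:crypto";

// Parse the "<hex digest>  <path>" lines that sha256sum emits.
function parseSha256Sums(text: string): Map<string, string> {
  const sums = new Map<string, string>();
  for (const line of text.split(/\r?\n/)) {
    const m = /^([0-9a-fA-F]{64}) [ *](.+)$/.exec(line);
    if (m) sums.set(m[2], m[1].toLowerCase());
  }
  return sums;
}

// Hex-encoded SHA-256 digest of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// True only if the path is listed and its digest matches the file contents.
function verifyArtifact(
  path: string,
  data: Uint8Array,
  sums: Map<string, string>,
): boolean {
  const expected = sums.get(path);
  return expected !== undefined && expected === sha256Hex(data);
}
```

An unlisted path is treated as a failure, so a file missing from SHA256SUMS is rejected along with a tampered one. This check does not replace the cosign signature, which covers the release tarball as a whole.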

Distribution mirrors

Mirror             URL
GitHub Releases    https://github.com/securelayer7/PROMPTPurify/releases
Hugging Face Hub   Securelayer7/promptpurify

Contact

Acknowledgments

Name and design philosophy inspired by DOMPurify by Cure53. Thanks to Mario Heiderich for suggesting the name.
