Text Classification
Transformers
Safetensors
deberta-v2
policy-compliance
web-agents
deberta-v3
safety
st-webagentbench
How to use superfunguy/pcm-benchmark-grounded-deberta with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="superfunguy/pcm-benchmark-grounded-deberta")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("superfunguy/pcm-benchmark-grounded-deberta")
model = AutoModelForSequenceClassification.from_pretrained("superfunguy/pcm-benchmark-grounded-deberta")
PCM Benchmark-Grounded DeBERTa
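This model is a policy-compliance classifier for web-agent actions. Given a policy, the surrounding context, and a proposed action, it predicts whether the action is compliant (0) or a policy violation (1).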
Input format
[POLICY] ... [SEP] [CONTEXT] ... [SEP] [ACTION] ...
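The template above can be assembled with a small helper. This is a minimal sketch: the function name `format_input`, the single-space separators, and the example policy, context, and action strings are all assumptions for illustration, not taken from the model's training code.

```python
def format_input(policy: str, context: str, action: str) -> str:
    """Join the three fields using the template shown in the model card.

    Assumption: fields are separated by single spaces around the markers;
    the exact whitespace used during training is not documented here.
    """
    return f"[POLICY] {policy} [SEP] [CONTEXT] {context} [SEP] [ACTION] {action}"


# Hypothetical example values, for illustration only.
example = format_input(
    policy="Never submit a form without user confirmation.",
    context="Checkout page with a filled-in payment form.",
    action="click('Submit order')",
)
print(example)
```

The resulting string could then be passed to the `pipe` object or the tokenizer loaded in the usage snippet.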
Evaluation summary
Standard test
- Precision: 0.9972
- Recall: 1.0000
- F1: 0.9986
- FPR: 0.0028
- ROC-AUC: 1.0000
Challenge split
- Precision: 1.0000
- Recall: 0.8424
- F1: 0.9145
- FPR: 0.0000
- ROC-AUC: 0.9792
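For readers who want to relate these numbers to each other, the sketch below uses the standard definitions of precision, recall, FPR, and F1, and recomputes F1 from the reported precision and recall on each split. The function names are hypothetical; the evaluation script used to produce the reported figures is not shown in this card.

```python
def rates_from_counts(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification rates from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from the precision/recall values reported above.
standard_f1 = f1_score(0.9972, 1.0000)
challenge_f1 = f1_score(1.0000, 0.8424)
print(round(standard_f1, 4), round(challenge_f1, 4))  # 0.9986 0.9145
```

Both recomputed values match the reported F1 scores to four decimal places. On the challenge split, the gap between precision (1.0000) and recall (0.8424) means the errors there are missed violations rather than false alarms.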
Notes
- Positive label: 1 = policy violation
- Negative label: 0 = compliant action
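A minimal sketch of turning a pair of output logits into one of these labels follows. It assumes the classification head emits two logits whose indices match the label ids above (index 1 = violation); the names `ID2LABEL` and `decide` and the 0.5 default threshold are illustrative assumptions, not documented properties of this model.

```python
import math

# Label ids as stated in the notes above; the string names are illustrative.
ID2LABEL = {0: "compliant", 1: "violation"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold: float = 0.5):
    """Return (label_id, p_violation) for a two-element logit list.

    Assumption: logits[1] corresponds to label 1 (policy violation).
    """
    p_violation = softmax(logits)[1]
    label_id = 1 if p_violation >= threshold else 0
    return label_id, p_violation


label_id, p = decide([-1.0, 2.0])
print(ID2LABEL[label_id], round(p, 3))
```

Lowering `threshold` trades precision for recall, which may be worth exploring given the recall reported on the challenge split.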