acuvity/intent-action
Model Description
acuvity/intent-action is a MiniLM-L6 cross-encoder fine-tuned for tool-use alignment detection: classifying whether an AI agent's action is aligned with its given task.
- Architecture: cross-encoder/ms-marco-MiniLM-L6-v2 (22M parameters)
- Task: Binary classification, aligned (0) vs. misaligned (1)
- Training method: Supervised fine-tuning with Binary Cross-Entropy loss
- Loss: BCE
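
As a reminder of what the BCE objective measures under the label convention above (aligned = 0, misaligned = 1), here is a minimal plain-Python sketch. The function name `bce` and the clipping constant `eps` are illustrative choices, not taken from the training code.

```python
import math

# Label convention from the model card
ALIGNED, MISALIGNED = 0, 1

def bce(prob_misaligned: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one example.

    prob_misaligned: predicted probability that the action is misaligned.
    label: 0 (aligned) or 1 (misaligned).
    eps: illustrative clipping constant to avoid log(0).
    """
    p = min(max(prob_misaligned, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

# A confident "misaligned" prediction is cheap when the label is 1
# and expensive when the label is 0.
loss_when_correct = bce(0.9, MISALIGNED)
loss_when_wrong = bce(0.9, ALIGNED)
```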
Training Hyperparameters
| Parameter | Value |
|---|---|
| Dataset | acuvity/tool-use-alignment |
| Epochs | 5 |
| Batch size | 32 |
| Learning rate | 1.00e-05 |
| Weight decay | 0.01 |
| Warmup ratio | 0.1 |
| LLRD | No |
| Early stopping patience | 3 |
| Eval every (steps) | 1000 |
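
To make the interaction between epochs, batch size, and warmup ratio concrete, the sketch below derives step counts from the table values. The training-set size is not stated in this card, so `num_train_examples` is a hypothetical input. Rounding the warmup length down and holding the rate constant after warmup are simplifying assumptions, not the documented schedule.

```python
import math

# Values from the hyperparameter table
EPOCHS = 5
BATCH_SIZE = 32
WARMUP_RATIO = 0.1
PEAK_LR = 1.0e-05

def schedule_summary(num_train_examples: int) -> dict:
    """Step counts implied by the table for a hypothetical dataset size."""
    steps_per_epoch = math.ceil(num_train_examples / BATCH_SIZE)
    total_steps = steps_per_epoch * EPOCHS
    # Assumption: warmup length is the ratio times total steps, rounded down.
    warmup_steps = int(total_steps * WARMUP_RATIO)
    return {
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }

def warmup_lr(step: int, warmup_steps: int) -> float:
    """Linear ramp from 0 to PEAK_LR; post-warmup decay is not modeled here."""
    if warmup_steps == 0 or step >= warmup_steps:
        return PEAK_LR
    return PEAK_LR * step / warmup_steps

# Hypothetical dataset of 64,000 training pairs
summary = schedule_summary(64_000)
```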
Test Results
| Metric | Value |
|---|---|
| F1 | 0.9917 |
| AUPR | 0.9989 |
| Precision | 0.994 |
| Recall | 0.9895 |
| TP / TN / FP / FN | 8,394 / 8,398 / 71 / 75 (n=16,938) |
| Threshold (τ) | 0.8492 |
| Temperature (T) | 1 |
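
For readers who want to recompute metrics from the confusion-matrix row, a minimal sketch using the standard definitions follows. The helper name `confusion_metrics` is illustrative. Values derived this way can differ somewhat from the reported figures if those were computed at a different operating point or with different averaging.

```python
# Counts from the TP / TN / FP / FN row of the table above
TP, TN, FP, FN = 8394, 8398, 71, 75

def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "n": tp + tn + fp + fn,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }

metrics = confusion_metrics(TP, TN, FP, FN)
for name, value in metrics.items():
    print(f"{name}: {value}")
```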
Training Curves
Calibration & Safety Gating
Each score maps to a confidence band for production use:
| Band | Condition | Action |
|---|---|---|
| SAFE | score < 0.8492 | Aligned — execute |
| Low confidence | 0.8492 ≤ score < 0.9995 | Warn / log |
| Medium confidence | 0.9995 ≤ score < 1.0005 | Ask confirmation (precision ≥ 85%) |
| High confidence | score ≥ 1.0005 | Block (precision ≥ 95%) |
Calibration parameters are saved in calibration.json.
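
The band table can be expressed as a small gating function. This is a sketch only: the thresholds are copied verbatim from the table rather than read from calibration.json (whose schema is not described here), and the name `gate` and the short action strings are illustrative.

```python
from typing import Tuple

# Band boundaries copied from the table above
TAU = 0.8492        # calibrated decision threshold
MEDIUM_MIN = 0.9995
HIGH_MIN = 1.0005

def gate(score: float) -> Tuple[str, str]:
    """Map a calibrated score to (band, action) following the band table."""
    if score < TAU:
        return ("SAFE", "execute")
    if score < MEDIUM_MIN:
        return ("Low confidence", "warn / log")
    if score < HIGH_MIN:
        return ("Medium confidence", "ask confirmation")
    return ("High confidence", "block")

band, action = gate(0.91)
print(f"{band}: {action}")
```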
Usage
```python
from sentence_transformers import CrossEncoder

# num_labels=1 yields a single score per (task, action) pair
model = CrossEncoder("acuvity/intent-action", num_labels=1)

task = "Send an email to alice@example.com with subject 'Meeting'"
action = '[TOOL] send_email\n[ARGS] {"to": "bob@example.com", "subject": "Meeting"}'

score = model.predict([[task, action]])[0]

# Apply calibrated threshold: scores at or above it fall outside the SAFE band
threshold = 0.8492
is_misaligned = score >= threshold
print(f"Score: {score:.4f} | Misaligned: {is_misaligned}")
```
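
The usage example writes the action string by hand. A small helper can build it from a tool name and an argument dictionary. The `[TOOL]` / `[ARGS]` layout below is inferred from that single example, so treat it as an assumption about the expected input format. `format_action` and `build_pairs` are hypothetical names, not part of any library.

```python
import json
from typing import Dict, List

def format_action(tool: str, args: Dict) -> str:
    """Serialize a tool call in the layout used by the usage example (assumed format)."""
    return f"[TOOL] {tool}\n[ARGS] {json.dumps(args)}"

def build_pairs(task: str, actions: List[str]) -> List[List[str]]:
    """Pair one task with several candidate actions for batch scoring."""
    return [[task, action] for action in actions]

task = "Send an email to alice@example.com with subject 'Meeting'"
actions = [
    format_action("send_email", {"to": "alice@example.com", "subject": "Meeting"}),
    format_action("send_email", {"to": "bob@example.com", "subject": "Meeting"}),
]
pairs = build_pairs(task, actions)
```

The resulting `pairs` list has the same shape as the input to `model.predict` in the usage example, so several candidate actions can be scored in one call.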
Model tree for acuvity/14.0.1-ms-marco-MiniLM-L6-v2
- Base model: microsoft/MiniLM-L12-H384-uncased
- Quantized: cross-encoder/ms-marco-MiniLM-L12-v2
- Quantized: cross-encoder/ms-marco-MiniLM-L6-v2

Evaluation results
- F1 on Acuvity Tool-Use Alignment Dataset (test set, self-reported): 0.992
- Precision on Acuvity Tool-Use Alignment Dataset (test set, self-reported): 0.994
- Recall on Acuvity Tool-Use Alignment Dataset (test set, self-reported): 0.990

