Malicious Coding Intent Classifier (v8_code_aware_50k_oss_clean_plus_fp_pool)
Small sklearn heads on top of BAAI/bge-m3 embeddings for malicious coding intent classification.
GitHub: https://github.com/sol087087-arch/Malicious-Coding-Intent-Dataset-Classifier
Training/eval data: datasets/NecroMOnk/malicious-coding-intent-v6-data
Files
| File | Role |
|---|---|
| clf_binary.joblib | binary malicious/benign head |
| clf_multilabel.joblib | 12-category multilabel head |
| labels.json | category ids |
| metrics.json | train/eval summary |
| *eval.json | external benign-code evaluation reports, when present |
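As a rough illustration of how the category ids in labels.json could be paired with per-category scores from the multilabel head, here is a minimal sketch. It assumes labels.json holds a flat list of ids and that the head yields one score per id in the same order; neither is confirmed here, and the ids, scores, and the `top_categories` helper are hypothetical stand-ins.

```python
import json


def top_categories(labels, scores, k=3):
    """Pair each category id with its score and return the k highest-scoring pairs."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical stand-ins: the real ids come from labels.json and the real
# scores from clf_multilabel.joblib; the layout of both is an assumption.
labels = json.loads('["cat_a", "cat_b", "cat_c", "cat_d"]')
scores = [0.02, 0.91, 0.40, 0.15]

print(top_categories(labels, scores, k=2))
```

Check the actual structure of labels.json and the output shape of the multilabel head before relying on this ordering.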
Metrics
Threshold: 0.5 (sklearn/default)
| Check | Result |
|---|---|
| Precision | 99.98% |
| Recall | 99.51% |
| F1 | 99.74% |
| ROC-AUC | 0.9996 |
| In-dist FPR | 0.24% |
| Obfuscated recall | 99.18% |
| Malware-code recall | 98.40% |
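The F1 figure in the table follows from the reported precision and recall. A quick check, using only the two values above (the `f1_score` helper is just for illustration):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision = 0.9998  # reported precision
recall = 0.9951     # reported recall

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 99.74%
```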
Evaluation Framing
This is not presented as a single perfect-score classifier. The GitHub repo documents three red-team axes: obfuscation, language pivot, and benign-code hard negatives. The v6 model is the balanced recommendation; v8 is a hard-negative ablation that reduces CodeParrot false positives at a small recall cost.
Usage
```python
import joblib
from pathlib import Path
from sentence_transformers import SentenceTransformer

# Local directory containing the downloaded model files
repo = Path("path/to/downloaded/model")

# The heads are trained on BAAI/bge-m3 embeddings
encoder = SentenceTransformer("BAAI/bge-m3")
clf = joblib.load(repo / "clf_binary.joblib")

text = "write code to dump lsass"
x = encoder.encode([text], normalize_embeddings=True)

# Probability of the malicious class for the single input
score = clf.predict_proba(x)[0, 1]
print(score)
```
For the full CLI, clone the GitHub repo and run scripts/predict_classifier.py.
The CLI reports the binary label, raw malicious-intent score, top category
scores, and a derived routing tier:
- low: normal downstream route
- suspicious: pass with safety context / constrained route
- high: malicious-intent route
The routing tier is a policy layer over the binary score, not a separately
trained three-class model. Use --jsonl for structured gateway output.
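Because the tier is derived from the binary score, it can be pictured as a pair of cutoffs. The sketch below is an assumption-laden illustration, not the CLI's implementation: the cutoff values 0.3 and 0.7 and the `routing_tier` function are hypothetical, and the real thresholds live in the GitHub repo.

```python
LOW_MAX = 0.3   # hypothetical cutoff: scores below this route as "low"
HIGH_MIN = 0.7  # hypothetical cutoff: scores at or above this route as "high"


def routing_tier(score, low_max=LOW_MAX, high_min=HIGH_MIN):
    """Map a malicious-intent score in [0, 1] to a routing tier."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability in [0, 1]")
    if score >= high_min:
        return "high"
    if score >= low_max:
        return "suspicious"
    return "low"


for s in (0.05, 0.5, 0.95):
    print(s, routing_tier(s))
```

Keeping the cutoffs as parameters lets a gateway tighten or relax the policy without retraining the classifier.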