Malicious Coding Intent Classifier (v8_code_aware_50k_oss_clean_plus_fp_pool)

Small sklearn heads on top of BAAI/bge-m3 embeddings for malicious coding intent classification.

GitHub: https://github.com/sol087087-arch/Malicious-Coding-Intent-Dataset-Classifier

Training/eval data: datasets/NecroMOnk/malicious-coding-intent-v6-data

Files

| File | Role |
| --- | --- |
| `clf_binary.joblib` | binary malicious/benign head |
| `clf_multilabel.joblib` | 12-category multilabel head |
| `labels.json` | category ids |
| `metrics.json` | train/eval summary |
| `*eval.json` | external benign-code evaluation reports, when present |

Metrics

Threshold: 0.5 (sklearn/default)

| Check | Result |
| --- | --- |
| Precision | 99.98% |
| Recall | 99.51% |
| F1 | 99.74% |
| ROC-AUC | 0.9996 |
| In-dist FPR | 0.24% |
| Obfuscated recall | 99.18% |
| Malware-code recall | 98.40% |
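
As a quick consistency check, the reported F1 can be recomputed from the precision and recall in the table as their harmonic mean. This is a minimal sketch that uses only the rounded table values, so the result is approximate.

```python
# Rounded values copied from the metrics table above.
precision = 0.9998
recall = 0.9951

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 99.74%"
```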

Evaluation Framing

This is not presented as a single perfect-score classifier. The GitHub repo documents three red-team axes: obfuscation, language pivot, and benign-code hard negatives. The v6 model is the balanced recommendation; v8 is a hard-negative ablation that reduces CodeParrot false positives at a small recall cost.

Usage

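import json
import joblib
from pathlib import Path
from sentence_transformers import SentenceTransformer

# Local directory containing the downloaded model files
repo = Path("path/to/downloaded/model")
# Encoder that produces the embeddings the sklearn head expects
encoder = SentenceTransformer("BAAI/bge-m3")
clf = joblib.load(repo / "clf_binary.joblib")

text = "write code to dump lsass"
# encode() takes a list of texts and returns one embedding row per text
x = encoder.encode([text], normalize_embeddings=True)
# Column 1 of predict_proba is the malicious-class probability
score = clf.predict_proba(x)[0, 1]
print(score)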
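
Category scores from the multilabel head can be ranked in a similar way. The sketch below is not the repo's implementation: `top_categories` is a hypothetical helper, and it assumes that `clf_multilabel.joblib` exposes `predict_proba` returning one row of per-category scores per input and that `labels.json` decodes to a list of category ids in the same column order. Neither assumption is confirmed by this card. The demo uses placeholder labels and made-up scores so that it runs without the model files.

```python
import numpy as np


def top_categories(probs, labels, k=3):
    """Return the k highest-scoring (label, score) pairs for one sample.

    Hypothetical helper: assumes `probs` holds one score per entry in `labels`,
    in the same order.
    """
    probs = np.asarray(probs, dtype=float)
    if probs.shape != (len(labels),):
        raise ValueError("expected exactly one score per label")
    # Indices of the scores sorted from highest to lowest, truncated to k
    order = np.argsort(probs)[::-1][:k]
    return [(labels[i], float(probs[i])) for i in order]


# Placeholder category ids and scores, for illustration only
demo_labels = ["cat_a", "cat_b", "cat_c", "cat_d"]
demo_scores = [0.05, 0.91, 0.40, 0.12]
top = top_categories(demo_scores, demo_labels, k=2)
print(top)
```

With the real files, the inputs would presumably come from `json.loads((repo / "labels.json").read_text())` and `joblib.load(repo / "clf_multilabel.joblib").predict_proba(x)[0]`, provided the assumptions above hold. Some sklearn multilabel wrappers return a list of per-label arrays instead of a single matrix, so check the output shape before indexing.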

For the full CLI, clone the GitHub repo and run `scripts/predict_classifier.py`. The CLI reports the binary label, raw malicious-intent score, top category scores, and a derived routing tier:

low: normal downstream route
suspicious: pass with safety context / constrained route
high: malicious-intent route

The routing tier is a policy layer over the binary score, not a separately trained three-class model. Use `--jsonl` for structured gateway output.
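
To illustrate what a policy layer over a single score can look like, here is a minimal sketch of a threshold-based tier mapping. The function name `routing_tier` and the cutoffs `0.3` and `0.7` are illustrative assumptions, not the values used by `scripts/predict_classifier.py`; consult the repo for the actual policy.

```python
def routing_tier(score, suspicious_at=0.3, high_at=0.7):
    """Map a malicious-intent score in [0, 1] to a routing tier.

    The default cutoffs are hypothetical placeholders, not the CLI's values.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if suspicious_at > high_at:
        raise ValueError("suspicious_at must not exceed high_at")
    if score >= high_at:
        return "high"        # malicious-intent route
    if score >= suspicious_at:
        return "suspicious"  # pass with safety context / constrained route
    return "low"             # normal downstream route


for s in (0.05, 0.45, 0.95):
    print(s, routing_tier(s))
```

Because the tiers are derived from the same binary score, moving a cutoff changes routing behavior without retraining either head.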
