🛡️ CyberConstituent-SLM

Constitutional AI-Aligned Cybersecurity Threat Classifier

A fine-tuned DistilBERT Small Language Model (SLM) that classifies security logs and alerts into specific threat categories. The training pipeline includes an Anthropic-inspired Constitutional AI alignment layer that sanitizes raw threat descriptions, removing explicit exploit payloads and SQL injection code and filtering out unverified geopolitical attribution.
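
The sanitization step is described here only at a high level. As a rough illustration of the idea, and not the repository's actual alignment code, a redaction pass over raw threat descriptions might look like the sketch below. The `REDACTION_RULES` patterns, the placeholder tokens, and the `sanitize_description` name are all hypothetical, and the sketch does not attempt the attribution-bias filtering at all.

```python
import re

# Hypothetical redaction rules for illustration only; the real pipeline's
# constitution and filters are not documented in this README.
REDACTION_RULES = [
    # SQL injection fragments such as "UNION SELECT ..." or "' OR 1=1"
    (re.compile(r"union\s+select\b[^;\n]*", re.IGNORECASE), "[REDACTED_SQL]"),
    (re.compile(r"'\s*or\s+\d+\s*=\s*\d+", re.IGNORECASE), "[REDACTED_SQL]"),
    # Inline script payloads
    (re.compile(r"<script\b.*?</script>", re.IGNORECASE | re.DOTALL), "[REDACTED_SCRIPT]"),
]


def sanitize_description(text: str) -> str:
    """Replace exploit-like fragments with neutral placeholder tokens."""
    for pattern, placeholder in REDACTION_RULES:
        text = pattern.sub(placeholder, text)
    return text


raw = "Login form received input: admin' OR 1=1 -- followed by UNION SELECT password FROM users"
print(sanitize_description(raw))
```

The point of the placeholders is that the description still says an injection attempt occurred, so the classifier keeps the analytical signal, while the copy-pasteable payload is gone.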

🔗 Hugging Face Model Hub: sujithputta02/cyber-threat-constitutional-slm

🚀 Key Specifications

Base Architecture: distilbert-base-uncased (67M Parameters)
Task: 6-Class Single-Label Text Classification
Validation Accuracy: 89%
Optimization: Fine-tuned on a Google Colab T4 GPU using FP16 mixed precision and a cosine learning-rate schedule.
Alignment: Aligned under Constitutional AI guidelines to filter out actionable exploit syntax while preserving analytical value.
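
For readers unfamiliar with cosine scheduling, the sketch below shows the general shape of such a schedule: an optional linear warmup followed by a half-cosine decay to zero. The function name `cosine_lr`, the base learning rate, and the step counts are illustrative assumptions; the values actually used are the ones recorded in the training notebook and config. With the Hugging Face `Trainer`, this behavior is typically selected through `lr_scheduler_type="cosine"` and `fp16=True` in `TrainingArguments`, though whether the notebook does exactly that is also an assumption.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed values for illustration only, not the project's real hyperparameters.
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, total_steps=1000, base_lr=2e-5):.2e}")
```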

🎯 Threat Classification Target Classes

The model classifies text inputs into one of six core cybersecurity threat categories:

Label ID	Threat Category	Example Indicators
LABEL_0	🦠 Malware Attack	Executables running from temp folders, unsigned DLL files, keyloggers
LABEL_1	🔒 Ransomware Attack	Cryptographic file encryption, volume shadow copy deletions, ransom demand notes
LABEL_2	🎣 Phishing Campaign	Social engineering links, spoofed login portals for credential harvesting, deceptive email macros
LABEL_3	💥 DDoS Attack	Massive SYN/UDP port flooding, network bandwidth exhaustion, botnet requests
LABEL_4	💉 SQL Injection	SQL command syntax in URL/query parameters, database validation form bypass
LABEL_5	🕵️ Man-in-the-Middle Attack	ARP cache poisoning, rogue gateway spoofing, SSL handshake interception attempts

⚙️ Installation

To deploy or integrate this model on your platform, install the necessary dependencies:

pip install transformers torch

💻 Python Usage Examples

1. Simple Inference Pipeline (Quickest Integration)

Use Hugging Face's high-level pipeline to classify custom security logs directly:

from transformers import pipeline

# Load the model from the Hugging Face Hub
classifier = pipeline("text-classification", model="sujithputta02/cyber-threat-constitutional-slm")

# Human-readable labels dictionary
LABEL_MAP = {
    "LABEL_0": "🦠 Malware Attack",
    "LABEL_1": "🔒 Ransomware Attack",
    "LABEL_2": "🎣 Phishing Campaign",
    "LABEL_3": "💥 DDoS Attack",
    "LABEL_4": "💉 SQL Injection",
    "LABEL_5": "🕵️ Man-in-the-Middle Attack"
}

# Sample log to analyze
log = "ALERT: Database validation form bypass query manipulation. SQL syntax identified."

# Classify
result = classifier(log)[0]
confidence = result['score'] * 100
predicted_threat = LABEL_MAP.get(result['label'], result['label'])

print("=" * 60)
print(f"Input Log:        {log}")
print(f"Predicted Threat:  {predicted_threat}")
print(f"Confidence:        {confidence:.2f}%")
print("=" * 60)
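
A text-classification pipeline also accepts a list of strings and returns one result dict per input, which is convenient when scanning many log lines. The wrapper below is a minimal sketch of that pattern; `classify_logs` is a hypothetical helper name, and the label map repeats the one above without emoji so the block stands alone.

```python
from typing import Callable, Dict, List

LABEL_MAP = {
    "LABEL_0": "Malware Attack",
    "LABEL_1": "Ransomware Attack",
    "LABEL_2": "Phishing Campaign",
    "LABEL_3": "DDoS Attack",
    "LABEL_4": "SQL Injection",
    "LABEL_5": "Man-in-the-Middle Attack",
}


def classify_logs(classifier: Callable, logs: List[str]) -> List[Dict]:
    """Classify several logs in one call and attach readable threat names."""
    results = classifier(logs)  # one {"label", "score"} dict per input log
    return [
        {
            "log": log,
            "threat": LABEL_MAP.get(res["label"], res["label"]),
            "confidence": round(res["score"] * 100, 2),
        }
        for log, res in zip(logs, results)
    ]


# Usage with the real pipeline (downloads the model on first run):
#   from transformers import pipeline
#   classifier = pipeline("text-classification",
#                         model="sujithputta02/cyber-threat-constitutional-slm")
#   for row in classify_logs(classifier, ["first log line", "second log line"]):
#       print(row)
```

Passing the classifier in as an argument keeps the helper easy to test with a stub and lets the caller decide where and how the model is loaded.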

2. Manual Tokenizer & Model Execution (Low-level Control)

For pipelines that need batched inference, custom tokenizer or model settings, or direct access to the output tensors:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("sujithputta02/cyber-threat-constitutional-slm")
model = AutoModelForSequenceClassification.from_pretrained("sujithputta02/cyber-threat-constitutional-slm")

# Tokenize the input text (truncated to 128 tokens)
text = "Cryptographic file encryption activity detected in user directories. Bulk extension modification."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Run prediction
with torch.no_grad():
    logits = model(**inputs).logits

# Extract output distribution
probabilities = torch.nn.functional.softmax(logits, dim=-1)
predicted_class = torch.argmax(probabilities, dim=-1).item()

print(f"Class Probability Distribution: {probabilities[0].tolist()}")
print(f"Predicted Class ID: {predicted_class}")
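
The raw probability list is easier to act on once it is paired with class names and checked against a confidence floor. The sketch below assumes the class order from the label table above; `rank_threats` and the 0.5 threshold are hypothetical choices for illustration, and the example distribution is made up rather than real model output.

```python
from typing import List, Tuple

# Class names in label-ID order (LABEL_0 .. LABEL_5)
THREAT_NAMES = [
    "Malware Attack",
    "Ransomware Attack",
    "Phishing Campaign",
    "DDoS Attack",
    "SQL Injection",
    "Man-in-the-Middle Attack",
]


def rank_threats(
    probabilities: List[float], min_confidence: float = 0.5
) -> Tuple[List[Tuple[str, float]], bool]:
    """Return (name, probability) pairs sorted high to low, plus a flag that
    is True when the top class falls below `min_confidence`."""
    ranked = sorted(
        zip(THREAT_NAMES, probabilities), key=lambda pair: pair[1], reverse=True
    )
    needs_review = ranked[0][1] < min_confidence
    return ranked, needs_review


# Made-up distribution, e.g. what probabilities[0].tolist() might look like
probs = [0.03, 0.81, 0.05, 0.02, 0.04, 0.05]
ranked, needs_review = rank_threats(probs)
print(ranked[0], needs_review)
```

Flagging low-confidence predictions for manual review is one common way to use a classifier like this in a triage workflow; a suitable threshold would need to be chosen from validation data.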

🖥️ Web UI Dashboard Setup (Streamlit)

The provided Streamlit script, app.py, launches a live UI that queries Hugging Face's serverless inference endpoint. To set it up:

Install Streamlit and requests:
```
pip install streamlit requests
```
Run the application:
```
streamlit run app.py
```

Before running the app, make sure HF_API_TOKEN is set as an environment variable or in Streamlit's secrets so the API queries can authenticate.
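
The contents of app.py are not reproduced here, so the following is only a sketch of how a client could assemble an authenticated request using the token. The `API_BASE` URL follows the endpoint pattern historically documented for the serverless Inference API and is an assumption; check app.py for the URL it actually calls. `build_inference_request` is a hypothetical helper name.

```python
import os
from typing import Dict, Tuple

MODEL_ID = "sujithputta02/cyber-threat-constitutional-slm"
# Assumed endpoint pattern; confirm against the URL used in app.py.
API_BASE = "https://api-inference.huggingface.co/models"


def build_inference_request(
    text: str, token: str
) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Assemble the URL, headers, and JSON payload for one classification call."""
    url = f"{API_BASE}/{MODEL_ID}"
    headers = {"Authorization": f"Bearer {token}"}
    payload = {"inputs": text}
    return url, headers, payload


# Read the token from the environment (empty string if it is not set)
token = os.environ.get("HF_API_TOKEN", "")
url, headers, payload = build_inference_request(
    "Massive SYN flood detected on port 443.", token
)
# To send it: requests.post(url, headers=headers, json=payload, timeout=30)
print(url)
```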

📜 Reproducibility & Model Info

The full training parameters, evaluation metrics, confusion matrix plots, and learning curves are documented in the SLM.ipynb training notebook in this repository. The final training parameters are also exported to cyber-threat-constitutional-slm/training_config.json.
