Kaviel: Cyber Threat Intelligence Classification Model
Model Description
Kaviel is a fine-tuned version of roberta-base designed for the classification of text into six categories: Banking Fraud, Terrorist Attack, Life Threat, Online Scams, Information Leakage, and Casual Conversation. This model is specifically trained for use in threat intelligence platforms.
Intended Use
The model is intended to automatically classify text into the predefined categories, to assist in threat detection and management.
Training Data
The model was trained on a custom dataset with the following categories:
- Label 0: Banking Fraud
- Label 1: Terrorist Attack
- Label 2: Life Threat
- Label 3: Online Scams
- Label 4: Information Leakage
- Label 5: Casual Conversation
Training Procedure
The model was fine-tuned using PyTorch Lightning with the following configuration:
- Epochs: 3
- Batch size: 128
- Learning rate: 1.5e-6
- Weight decay: 0.001
- Warmup ratio: 0.2
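The warmup ratio determines how many optimizer steps are spent ramping the learning rate up to its peak. The following is a minimal sketch of that arithmetic, not the author's training code. The dataset size (`num_examples`) is a hypothetical value, and the linear-warmup, linear-decay shape is an assumption, since the card does not state which scheduler was used.

```python
import math
from typing import Tuple

# Hyperparameters taken from the configuration listed above.
EPOCHS = 3
BATCH_SIZE = 128
PEAK_LR = 1.5e-6
WARMUP_RATIO = 0.2


def schedule_steps(num_examples: int) -> Tuple[int, int]:
    """Return (total_steps, warmup_steps) for a dataset of the given size."""
    steps_per_epoch = math.ceil(num_examples / BATCH_SIZE)
    total_steps = steps_per_epoch * EPOCHS
    warmup_steps = int(total_steps * WARMUP_RATIO)
    return total_steps, warmup_steps


def lr_at(step: int, total_steps: int, warmup_steps: int) -> float:
    """Learning rate at a given step, ASSUMING linear warmup then linear decay."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


# Hypothetical dataset size, for illustration only.
total, warmup = schedule_steps(num_examples=12_800)
print(total, warmup)
print(lr_at(warmup // 2, total, warmup))
```

Under these assumptions, one fifth of all steps are spent below the peak rate of 1.5e-6, which is already small, so a short three-epoch run sees the full rate only briefly.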
Evaluation
The model's performance was evaluated using ROC AUC scores for each category.
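Per-category ROC AUC treats each label as its own one-vs-rest binary problem. The sketch below illustrates that computation in plain Python using the rank-based (Mann-Whitney) formulation. It is not the author's evaluation code, and the labels and scores are made-up values over three classes to keep the example short. In practice a library routine such as scikit-learn's `roc_auc_score` would typically be used instead.

```python
from typing import List, Sequence


def binary_roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC is undefined unless both classes are present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def per_category_auc(
    labels: Sequence[int],
    scores: Sequence[Sequence[float]],
    num_classes: int,
) -> List[float]:
    """One-vs-rest AUC for each class index."""
    result = []
    for c in range(num_classes):
        y_true = [1 if label == c else 0 for label in labels]
        y_score = [row[c] for row in scores]
        result.append(binary_roc_auc(y_true, y_score))
    return result


# Made-up labels and per-class scores, for illustration only.
labels = [0, 1, 2, 0, 1, 2]
scores = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.3, 0.7],
    [0.6, 0.4, 0.3],
    [0.7, 0.5, 0.2],
    [0.2, 0.1, 0.9],
]
aucs = per_category_auc(labels, scores, num_classes=3)
print(aucs)
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch but too slow for a large evaluation set.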
How to Use
You can use the model for inference with the following code:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load the model and tokenizer
model_name = "HiddenKise/Kaviel-threat-text-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Example text for prediction
text = "Unauthorized access attempt detected. Verify your account now."
# Tokenize and prepare input
inputs = tokenizer(text, return_tensors="pt")
# Get model predictions (gradients are not needed for inference)
with torch.no_grad():
    outputs = model(**inputs)
# Process outputs: sigmoid yields an independent score in [0, 1] for each of the six categories
logits = outputs.logits
predictions = torch.sigmoid(logits)
# Print predictions (one row per input text, one column per label, in label order)
print(predictions)
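The printed tensor holds one score per category, in label order. As a sketch of how those scores might be mapped back to category names, the helper below pairs each score with its label from the Training Data list. The function name `decode_scores`, the 0.5 threshold, and the example scores are illustrative assumptions, not part of the model's published interface. With the inference code above, the scores would come from `predictions[0].tolist()`.

```python
from typing import Dict, List

# Label order as listed in the Training Data section.
ID2LABEL: Dict[int, str] = {
    0: "Banking Fraud",
    1: "Terrorist Attack",
    2: "Life Threat",
    3: "Online Scams",
    4: "Information Leakage",
    5: "Casual Conversation",
}


def decode_scores(scores: List[float], threshold: float = 0.5) -> Dict[str, object]:
    """Attach category names to scores; the threshold is an assumed cut-off."""
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    named = {ID2LABEL[i]: score for i, score in enumerate(scores)}
    top_label = max(named, key=named.get)
    flagged = [name for name, score in named.items() if score >= threshold]
    return {"top_label": top_label, "flagged": flagged, "scores": named}


# Hypothetical scores, standing in for predictions[0].tolist().
example_scores = [0.81, 0.03, 0.05, 0.62, 0.10, 0.02]
result = decode_scores(example_scores)
print(result["top_label"], result["flagged"])
```

Because sigmoid scores are independent, more than one category can exceed the threshold for a single text. A suitable cut-off would need to be chosen on validation data.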