
Kaviel: Cyber Threat Intelligence Classification Model

Model Description

Kaviel is a fine-tuned version of roberta-base designed for the classification of text into six categories: Banking Fraud, Terrorist Attack, Life Threat, Online Scams, Information Leakage, and Casual Conversation. This model is specifically trained for use in threat intelligence platforms.

Intended Use

The model is intended to automatically classify textual data into the predefined categories above, to assist in threat detection and management.

Training Data

The model was trained on a custom dataset with the following categories:

  • Label 0: Banking Fraud
  • Label 1: Terrorist Attack
  • Label 2: Life Threat
  • Label 3: Online Scams
  • Label 4: Information Leakage
  • Label 5: Casual Conversation
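The label taxonomy above can be expressed as a simple id-to-label mapping, useful when decoding model outputs. This is a sketch mirroring the list in this card; the label names actually stored in the model's config may differ.

```python
# Hypothetical id-to-label mapping, mirroring the categories listed above.
ID2LABEL = {
    0: "Banking Fraud",
    1: "Terrorist Attack",
    2: "Life Threat",
    3: "Online Scams",
    4: "Information Leakage",
    5: "Casual Conversation",
}

# Inverse mapping, handy when preparing training labels.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}
```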

Training Procedure

The model was fine-tuned using PyTorch Lightning with the following configuration:

  • Epochs: 3
  • Batch size: 128
  • Learning rate: 1.5e-6
  • Weight decay: 0.001
  • Warmup ratio: 0.2
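The hyperparameters above translate into an optimizer/scheduler setup along the following lines. This is an illustrative sketch only: the actual PyTorch Lightning training code is not published, the dataset size is a placeholder, and a small stand-in module is used here instead of the full roberta-base classifier.

```python
import torch
from transformers import get_linear_schedule_with_warmup

EPOCHS = 3
BATCH_SIZE = 128
NUM_TRAIN_SAMPLES = 12_800  # placeholder: the real dataset size is not documented

steps_per_epoch = NUM_TRAIN_SAMPLES // BATCH_SIZE
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(0.2 * total_steps)  # warmup ratio 0.2

# Stand-in module; in training this would be the roberta-base sequence classifier.
model = torch.nn.Linear(768, 6)
optimizer = torch.optim.AdamW(model.parameters(), lr=1.5e-6, weight_decay=0.001)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=warmup_steps, num_training_steps=total_steps
)
```

With linear warmup, the learning rate ramps from zero to 1.5e-6 over the first 20% of training steps, then decays linearly to zero.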

Evaluation

The model's performance was evaluated using ROC AUC scores for each category.
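Per-category ROC AUC can be computed as sketched below with scikit-learn. The arrays here are synthetic stand-ins, since the evaluation data and scores are not published; in practice `y_true` would hold the ground-truth label indicators and `y_score` the per-category sigmoid outputs.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

NUM_LABELS = 6

# Synthetic stand-ins for illustration: multi-hot ground truth and model scores.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(200, NUM_LABELS))
y_score = rng.random((200, NUM_LABELS))

# One ROC AUC score per category, matching the evaluation described above.
per_label_auc = [
    roc_auc_score(y_true[:, i], y_score[:, i]) for i in range(NUM_LABELS)
]
```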

How to Use

You can use the model for inference with the following code:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "HiddenKise/Kaviel-threat-text-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example text for prediction
text = "Unauthorized access attempt detected. Verify your account now."

# Tokenize and prepare input
inputs = tokenizer(text, return_tensors="pt")

# Get model predictions
with torch.no_grad():
    outputs = model(**inputs)

# Process outputs: the model has six labels, and sigmoid yields an
# independent score per category
logits = outputs.logits
probabilities = torch.sigmoid(logits)

# Print per-category scores and the id of the highest-scoring category
print(probabilities)
print(torch.argmax(probabilities, dim=-1).item())