metadata
license: mit
tags:
- log-analysis
- anomaly-detection
- bert
- cybersecurity
- multiclass-classification
language:
- en
datasets:
- custom-log-dataset
metrics:
- f1
- accuracy
pipeline_tag: text-classification
XGBoost-Log-Anomaly-Detection - Log Anomaly Detection
This model is part of the Log Anomaly Detection System, which classifies system logs into 7 categories (normal plus 6 anomaly types).
Model Description
XGBoost-Log-Anomaly-Detection is an XGBoost classifier that uses BERT features, trained for multi-class log anomaly detection. It classifies logs from 16+ sources (Apache, SSH, Hadoop, etc.) into 7 categories:
- Normal (0): Benign operations
- Security Anomaly (1): Authentication failures, unauthorized access
- System Failure (2): Crashes, kernel panics
- Performance Issue (3): Timeouts, slow responses
- Network Anomaly (4): Connection errors, packet loss
- Config Error (5): Misconfigurations, invalid settings
- Hardware Issue (6): Disk failures, memory errors
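The class ids above can be turned into a lookup table for decoding predictions. The following is a minimal sketch: the ids and names come from this model card, while `ID2LABEL`, `LABEL2ID`, and `decode_prediction` are hypothetical helper names, not part of the released code.

```python
# Class ids and names as listed in this model card.
ID2LABEL = {
    0: "normal",
    1: "security_anomaly",
    2: "system_failure",
    3: "performance_issue",
    4: "network_anomaly",
    5: "config_error",
    6: "hardware_issue",
}
# Reverse mapping, useful when encoding labels for training or evaluation.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode_prediction(class_id: int) -> str:
    """Return the label name for a predicted class id (hypothetical helper)."""
    if class_id not in ID2LABEL:
        raise ValueError(f"unknown class id: {class_id}")
    return ID2LABEL[class_id]


print(decode_prediction(1))  # security_anomaly
```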
Performance Metrics
- F1-Score (Macro): 0.885
- Accuracy: 0.912
- Model Type: XGBoost Classifier with BERT Features
- Classes: 7 (normal, security_anomaly, system_failure, performance_issue, network_anomaly, config_error, hardware_issue)
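For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how accuracy and macro-averaged F1 are computed. The labels below are toy values chosen only for illustration and are unrelated to the reported scores. The function names are hypothetical, and this sketch averages over every class id in `range(num_classes)`, which may differ from a library's default handling of classes that never appear.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, num_classes=7):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with 3 classes (illustrative values only).
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 0, 2, 1]
print(accuracy(y_true, y_pred))  # 0.75
print(round(macro_f1(y_true, y_pred, num_classes=3), 3))
```

Macro averaging is a reasonable choice when classes are imbalanced, as is typical for logs where normal entries dominate: a model that ignores rare anomaly classes can still score high accuracy but will score low macro F1.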
Usage
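import torch
from transformers import AutoTokenizer

# Load the model.
# Assumption: model.pt is a fully pickled PyTorch module whose forward pass
# returns an object with a .logits attribute.
# weights_only=False is needed to unpickle a full module on PyTorch >= 2.6;
# only load files from a source you trust.
model = torch.load('model.pt', weights_only=False)
model.eval()  # disable dropout for inference
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example usage
log_text = "Apr 15 12:34:56 server sshd[1234]: Failed password for admin"
inputs = tokenizer(log_text, return_tensors='pt', max_length=128, truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=-1)  # class probabilities
    predicted_class = torch.argmax(predictions, dim=-1).item()  # integer class id, 0 to 6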
Training Data
- Sources: 16 log types (Apache, SSH, Hadoop, HDFS, Linux, Windows, etc.)
- Size: ~32,000 labeled logs
- Classes: 7 categories (normal plus 6 anomaly types)
- Features: BERT embeddings + template features + statistical features
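The three feature groups listed above could be combined into a single vector per log line before being passed to the XGBoost classifier. The sketch below only illustrates that idea and is not the project's actual feature pipeline: the masking rules, the four statistical features, the one-hot template encoding, and every function name are assumptions. The embedding is passed in as a plain list of floats, standing in for a BERT sentence embedding computed elsewhere.

```python
import re
from typing import List, Sequence

# Hypothetical masking rules; the real template extraction is not described here.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def to_template(log_line: str) -> str:
    """Replace variable parts (IPs, hex values, numbers) with placeholders."""
    for pattern, token in _MASKS:
        log_line = pattern.sub(token, log_line)
    return log_line


def template_features(log_line: str, vocabulary: Sequence[str]) -> List[float]:
    """One-hot encode the line's template against a list of known templates."""
    template = to_template(log_line)
    return [1.0 if template == known else 0.0 for known in vocabulary]


def statistical_features(log_line: str) -> List[float]:
    """Simple per-line statistics (an assumed, illustrative selection)."""
    n_chars = len(log_line)
    return [
        float(n_chars),                                          # line length
        float(len(log_line.split())),                            # token count
        sum(ch.isdigit() for ch in log_line) / max(n_chars, 1),  # digit ratio
        sum(ch.isupper() for ch in log_line) / max(n_chars, 1),  # uppercase ratio
    ]


def build_feature_vector(embedding: Sequence[float], log_line: str,
                         vocabulary: Sequence[str]) -> List[float]:
    """Concatenate embedding, template, and statistical features in that order."""
    return (list(embedding)
            + template_features(log_line, vocabulary)
            + statistical_features(log_line))


line = "Failed password for admin from 10.0.0.5 port 22"
known_templates = [to_template(line), "Connection closed by <IP>"]
fake_embedding = [0.1, 0.2]  # placeholder for a real BERT embedding
print(to_template(line))
print(len(build_feature_vector(fake_embedding, line, known_templates)))
```

Concatenation keeps each feature group in a fixed position, which suits tree-based models such as XGBoost since they need no feature scaling.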
Citation
@misc{log-anomaly-detection-2024,
title={Log Anomaly Detection System},
author={Krishna Sharma},
year={2024},
url={https://github.com/krishnasharma4415/log-anomaly-detection}
}
License
MIT License - see LICENSE file for details.