+---
+license: mit
+tags:
+- log-analysis
+- anomaly-detection
+- bert
+- cybersecurity
+- multiclass-classification
+language:
+- en
+datasets:
+- custom-log-dataset
+metrics:
+- f1
+- accuracy
+pipeline_tag: text-classification
+---
+# Log Anomaly Detection Models
+This repository contains the trained models for the **Log Anomaly Detection System**, which classifies system logs into 7 categories: one normal class and six anomaly types.
+## 🤖 Available Models
+### BERT-based Models
+- **DANN-BERT** (`models/DANN-BERT-Log-Anomaly-Detection/`) - Domain-Adversarial Neural Network
+- **LoRA-BERT** (`models/LoRA-BERT-Log-Anomaly-Detection/`) - Low-Rank Adaptation
+- **Hybrid-BERT** (`models/Hybrid-BERT-Log-Anomaly-Detection/`) - BERT + Template Features
+### Traditional ML Models
+- **XGBoost** (`models/XGBoost-Log-Anomaly-Detection/`) - Gradient Boosting Classifier
+## 📊 Model Performance
+| Model | F1-Score (Macro) | Accuracy | Parameters |
+|-------|-----------------|----------|------------|
+| Hybrid-BERT | **92.8%** | **94.3%** | 110M |
+| DANN-BERT | 90.3% | 92.1% | 110M |
+| LoRA-BERT | 88.7% | 90.5% | 1.5M (trainable) |
+| XGBoost | 88.5% | 91.2% | - |
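The table reports macro F1 next to accuracy. As a minimal sketch of how the two metrics differ, the pure-Python helpers below compute both on a small set of made-up labels. The function names `accuracy` and `macro_f1` and the toy data are illustrative assumptions, not the project's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores (each class counts equally)."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy labels, made up for illustration (not taken from the evaluation set).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 0, 1, 0, 0]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred, 3):.3f}")
```

Because macro F1 weights every class equally, a model that misses a rare class entirely can still score well on accuracy while its macro F1 drops, which is why both columns are worth reading together.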
+## 🎯 Classification Categories
+1. **Normal** (0): Benign operations
+2. **Security Anomaly** (1): Authentication failures, unauthorized access
+3. **System Failure** (2): Crashes, kernel panics
+4. **Performance Issue** (3): Timeouts, slow responses
+5. **Network Anomaly** (4): Connection errors, packet loss
+6. **Config Error** (5): Misconfigurations, invalid settings
+7. **Hardware Issue** (6): Disk failures, memory errors
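The category list above can be written as a lookup table for turning a predicted class index into a readable label. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `label_for` are hypothetical and not part of the released code, while the IDs and names are copied from the list.

```python
# Hypothetical mapping built from the category list; IDs and names follow the README.
ID2LABEL = {
    0: "Normal",
    1: "Security Anomaly",
    2: "System Failure",
    3: "Performance Issue",
    4: "Network Anomaly",
    5: "Config Error",
    6: "Hardware Issue",
}

# Reverse lookup, useful when encoding string labels for training or evaluation.
LABEL2ID = {name: class_id for class_id, name in ID2LABEL.items()}


def label_for(class_id: int) -> str:
    """Return the category name for a predicted class index (0-6)."""
    return ID2LABEL[class_id]


print(label_for(1))
```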
+## 🚀 Usage
+### Download Models
+```python
+from huggingface_hub import hf_hub_download
+# Download BERT model
+model_path = hf_hub_download(
+    repo_id="krishnas4415/log-anomaly-detection-models",
+    filename="models/Hybrid-BERT-Log-Anomaly-Detection/pytorch_model.pt"
+)
+# Download XGBoost model
+xgb_path = hf_hub_download(
+    repo_id="krishnas4415/log-anomaly-detection-models",
+    filename="models/XGBoost-Log-Anomaly-Detection/best_mod.pkl"
+)
+```
+### Load and Use Models
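+```python
+import torch
+import pickle
+from transformers import AutoTokenizer
+# Load BERT model (model_path and xgb_path come from the download step above).
+# Assumes the checkpoint is a fully pickled model object, so the model's class
+# definition must be importable; weights_only=False is needed on PyTorch >= 2.6.
+# Only unpickle files from sources you trust.
+model = torch.load(model_path, map_location='cpu', weights_only=False)
+model.eval()  # disable dropout for inference
+tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
+# Load XGBoost model
+with open(xgb_path, 'rb') as f:
+    xgb_model = pickle.load(f)
+# Example prediction
+log_text = "Apr 15 12:34:56 server sshd[1234]: Failed password for admin"
+inputs = tokenizer(log_text, return_tensors='pt', max_length=128, truncation=True, padding=True)
+with torch.no_grad():
+    outputs = model(**inputs)
+    # Assumes a Hugging Face-style output object exposing .logits
+    predictions = torch.softmax(outputs.logits, dim=-1)
+    predicted_class = torch.argmax(predictions, dim=-1).item()  # integer class ID, 0-6
+```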
+## 📚 Training Data
+- **Sources**: 16 log types (Apache, SSH, Hadoop, HDFS, Linux, Windows, etc.)
+- **Size**: ~32,000 labeled logs
+- **Classes**: 7 categories (Normal plus six anomaly types)
+- **Features**: BERT embeddings + template features + statistical features
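The feature list mentions template features without saying how templates are derived. As a rough illustration of the general idea only, the sketch below masks variable fields (IP addresses, hex values, numbers) so that log lines with the same structure collapse to one template. The masking rules and the name `to_template` are assumptions made for this example; the project may use a different parser.

```python
import re

# Illustrative masking rules (assumptions, not the project's actual parser).
# Order matters: IPs and hex values are masked before bare digit runs.
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\d+"), "<NUM>"),
]


def to_template(log_line: str) -> str:
    """Replace variable fields with placeholders to expose the log's template."""
    for pattern, placeholder in _MASKS:
        log_line = pattern.sub(placeholder, log_line)
    return log_line


print(to_template("Apr 15 12:34:56 server sshd[1234]: Failed password for admin"))
print(to_template("Connection from 10.0.0.5 port 22"))
```

A template string like this could then be mapped to an ID or to counts that feed a classifier alongside the text embedding, though how the released models combine these features is not specified here.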
+## 🔗 Related Links
+- **Main Project**: [Log Anomaly Detection System](https://github.com/krishnasharma4415/log-anomaly-detection)
+- **Live Demo**: [Frontend Application](https://log-anomaly-frontend.vercel.app)
+- **API**: [Backend API](https://log-anomaly-api.onrender.com)
+## 📄 Citation
+```bibtex
+@misc{log-anomaly-detection-2024,
+  title={Log Anomaly Detection System},
+  author={Krishna Sharma},
+  year={2024},
+  url={https://github.com/krishnasharma4415/log-anomaly-detection}
+}
+```
+## 📝 License
+MIT License - see LICENSE file for details.