krishnas4415 committed
Commit d364315 · verified · 1 Parent(s): e957c99

Upload README.md with huggingface_hub

  1. README.md +123 -0
README.md ADDED
@@ -0,0 +1,123 @@
+ ---
+ license: mit
+ tags:
+ - log-analysis
+ - anomaly-detection
+ - bert
+ - cybersecurity
+ - multiclass-classification
+ language:
+ - en
+ datasets:
+ - custom-log-dataset
+ metrics:
+ - f1
+ - accuracy
+ pipeline_tag: text-classification
+ ---
+
+ # Log Anomaly Detection Models
+
+ This repository contains trained models for the **Log Anomaly Detection System**, which classifies system logs into 7 categories: one normal class and six anomaly types.
+
+ ## 🤖 Available Models
+
+ ### BERT-based Models
+ - **DANN-BERT** (`models/DANN-BERT-Log-Anomaly-Detection/`) - Domain-Adversarial Neural Network
+ - **LoRA-BERT** (`models/LoRA-BERT-Log-Anomaly-Detection/`) - Low-Rank Adaptation
+ - **Hybrid-BERT** (`models/Hybrid-BERT-Log-Anomaly-Detection/`) - BERT + Template Features
+
+ ### Traditional ML Models
+ - **XGBoost** (`models/XGBoost-Log-Anomaly-Detection/`) - Gradient Boosting Classifier
+
+ ## 📊 Model Performance
+
+ | Model | F1-Score (Macro) | Accuracy | Parameters |
+ |-------|-----------------|----------|------------|
+ | Hybrid-BERT | **92.8%** | **94.3%** | 110M |
+ | DANN-BERT | 90.3% | 92.1% | 110M |
+ | LoRA-BERT | 88.7% | 90.5% | 1.5M (trainable) |
+ | XGBoost | 88.5% | 91.2% | - |
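
The table reports macro F1 next to accuracy. As a rough illustration of why the two can differ on imbalanced log data, here is a minimal pure-Python sketch of macro F1: the per-class F1 scores are averaged with equal weight, so a rare class that is always missed pulls the score down even when accuracy stays high. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the project's evaluation code.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores (classes with no support score 0)."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example (made-up labels): the single class-2 log is misclassified.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 1, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, num_classes=3)
print(f"accuracy={accuracy:.3f} macro_f1={score:.3f}")
```

In this toy case five of six predictions are correct, yet the macro F1 is noticeably lower because class 2 contributes an F1 of zero.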
+
+ ## 🎯 Classification Categories
+
+ 1. **Normal** (0): Benign operations
+ 2. **Security Anomaly** (1): Authentication failures, unauthorized access
+ 3. **System Failure** (2): Crashes, kernel panics
+ 4. **Performance Issue** (3): Timeouts, slow responses
+ 5. **Network Anomaly** (4): Connection errors, packet loss
+ 6. **Config Error** (5): Misconfigurations, invalid settings
+ 7. **Hardware Issue** (6): Disk failures, memory errors
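
The numeric IDs in parentheses are what a classifier returns, so client code usually needs a way to turn them back into names. One possible sketch uses an `IntEnum` that mirrors the list above; the names `LogCategory` and `describe` are hypothetical helpers and are not part of the repository.

```python
from enum import IntEnum


class LogCategory(IntEnum):
    """Class IDs as listed in the README (hypothetical helper, not shipped with the models)."""
    NORMAL = 0
    SECURITY_ANOMALY = 1
    SYSTEM_FAILURE = 2
    PERFORMANCE_ISSUE = 3
    NETWORK_ANOMALY = 4
    CONFIG_ERROR = 5
    HARDWARE_ISSUE = 6


def describe(class_id: int) -> str:
    """Return a human-readable category name for a predicted class ID."""
    return LogCategory(class_id).name.replace("_", " ").title()


print(describe(1))
```

Passing an ID outside 0 to 6 raises `ValueError`, which makes a mismatch between the model's output size and this list easy to spot.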
+
+ ## 🚀 Usage
+
+ ### Download Models
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Download BERT model
+ model_path = hf_hub_download(
+     repo_id="krishnas4415/log-anomaly-detection-models",
+     filename="models/Hybrid-BERT-Log-Anomaly-Detection/pytorch_model.pt"
+ )
+
+ # Download XGBoost model
+ xgb_path = hf_hub_download(
+     repo_id="krishnas4415/log-anomaly-detection-models",
+     filename="models/XGBoost-Log-Anomaly-Detection/best_mod.pkl"
+ )
+ ```
+
+ ### Load and Use Models
+
+ ```python
+ import pickle
+
+ import torch
+ from transformers import AutoTokenizer
+
+ # Load BERT model. This assumes the checkpoint is a fully pickled model object;
+ # PyTorch 2.6+ defaults to weights_only=True, which rejects such files, so pass
+ # weights_only=False (only for checkpoints you trust).
+ model = torch.load(model_path, map_location='cpu', weights_only=False)
+ model.eval()  # disable dropout for inference
+ tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
+
+ # Load XGBoost model
+ with open(xgb_path, 'rb') as f:
+     xgb_model = pickle.load(f)
+
+ # Example prediction
+ log_text = "Apr 15 12:34:56 server sshd[1234]: Failed password for admin"
+ inputs = tokenizer(log_text, return_tensors='pt', max_length=128, truncation=True, padding=True)
+
+ with torch.no_grad():
+     outputs = model(**inputs)
+     predictions = torch.softmax(outputs.logits, dim=-1)
+     predicted_class = torch.argmax(predictions, dim=-1)
+ ```
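
To make the last two lines of the prediction step concrete, the following standard-library sketch shows what softmax followed by argmax computes for a single log line. The logit values are made up for illustration and do not come from any of the released models.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the 7 classes (illustrative values only).
logits = [0.2, 3.1, -1.0, 0.5, 0.0, -0.3, -2.2]

probs = softmax(logits)
# argmax: index of the largest probability, i.e. the predicted class ID
predicted_class = max(range(len(probs)), key=probs.__getitem__)
confidence = probs[predicted_class]
print(predicted_class, round(confidence, 3))
```

Because softmax preserves the ordering of the logits, the predicted class is the same whether argmax is taken over the logits or over the probabilities; the softmax step is only needed when a confidence value is wanted.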
+
+ ## 📚 Training Data
+
+ - **Sources**: 16 log types (Apache, SSH, Hadoop, HDFS, Linux, Windows, etc.)
+ - **Size**: ~32,000 labeled logs
+ - **Classes**: 7 categories (Normal plus six anomaly types)
+ - **Features**: BERT embeddings + template features + statistical features
+
+ ## 🔗 Related Links
+
+ - **Main Project**: [Log Anomaly Detection System](https://github.com/krishnasharma4415/log-anomaly-detection)
+ - **Live Demo**: [Frontend Application](https://log-anomaly-frontend.vercel.app)
+ - **API**: [Backend API](https://log-anomaly-api.onrender.com)
+
+ ## 📄 Citation
+
+ ```bibtex
+ @misc{log-anomaly-detection-2024,
+   title={Log Anomaly Detection System},
+   author={Krishna Sharma},
+   year={2024},
+   url={https://github.com/krishnasharma4415/log-anomaly-detection}
+ }
+ ```
+
+ ## 📝 License
+
+ MIT License - see the LICENSE file for details.