Reynier committed on
Commit f5bdbb9 · verified · 1 Parent(s): 540b9cd

Update README.md

Files changed (1)
  1. README.md +31 -1
README.md CHANGED
@@ -7,4 +7,34 @@ metrics:
  - recall
  base_model:
  - meta-llama/Meta-Llama-3-8B
- ---
+ ---# Llama3 8B Fine-Tuned for Domain Generation Algorithm Detection
+
+ This model is a fine-tuned version of Meta's Llama3 8B, specifically adapted for detecting **Domain Generation Algorithms (DGAs)**. DGAs are often used by malware to create dynamic domain names for command-and-control (C&C) servers, making them a critical challenge in cybersecurity.
+
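For illustration, a toy seeded generator shows why DGA domains are hard to blocklist: anyone who knows the seed and the date can derive the same list, and the list changes every day. This is a hypothetical sketch, not the algorithm of any real malware family and not part of the model's training pipeline; `toy_dga`, the seed string, and the `.com` suffix are assumptions chosen for the example.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 3, length: int = 12) -> list[str]:
    """Derive `count` pseudo-random domain names from a shared seed and a date."""
    domains = []
    for i in range(count):
        # Hash the seed, the date, and a counter so each name is reproducible.
        material = f"{seed}-{day.isoformat()}-{i}".encode("utf-8")
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) to a letter a-p to form the label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + ".com")
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2024, 1, 1)):
        print(name)
```

Names produced this way look like random letter strings, which is the arithmetic style of DGA. Word-based DGAs instead concatenate dictionary words, so their output resembles ordinary domain names and is harder to separate from legitimate traffic.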
+ ## Model Description
+
+ - **Base Model**: Llama3 8B
+ - **Task**: DGA Detection
+ - **Fine-Tuning Approach**: Supervised Fine-Tuning (SFT) with domain-specific data.
+ - **Dataset**: A custom dataset comprising 68 malware families and legitimate domains from the Tranco dataset, with a focus on both arithmetic and word-based DGAs.
+ - **Performance**:
+   - **Accuracy**: 94%
+   - **False Positive Rate (FPR)**: 4%
+   - Excels in detecting hard-to-identify word-based DGAs.
+
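The accuracy and false positive rate listed under Performance follow the usual confusion-matrix definitions: accuracy is (TP + TN) / total, and FPR is FP / (FP + TN). A minimal sketch of how they could be computed, assuming labels of 1 for DGA-generated and 0 for legitimate; the function name and the sample labels are illustrative and are not the model's evaluation data.

```python
def accuracy_and_fpr(y_true: list[int], y_pred: list[int]) -> tuple[float, float]:
    """Return (accuracy, false positive rate) for binary labels (1 = DGA, 0 = legitimate)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    accuracy = (tp + tn) / len(pairs)
    negatives = fp + tn
    # FPR is the share of legitimate domains wrongly flagged as DGA.
    fpr = fp / negatives if negatives else 0.0
    return accuracy, fpr


if __name__ == "__main__":
    # Made-up labels for demonstration only.
    truth = [1, 1, 0, 0, 0, 1]
    preds = [1, 0, 0, 1, 0, 1]
    acc, fpr = accuracy_and_fpr(truth, preds)
    print(f"accuracy={acc:.3f} fpr={fpr:.3f}")
```

For a detector deployed on live DNS traffic, FPR matters as much as accuracy because legitimate domains vastly outnumber malicious ones, so even a small rate can translate into many false alerts.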
+ This model leverages the extensive semantic understanding of Llama3 to classify domains as either **malicious (DGA-generated)** or **legitimate** with high precision and recall.
+
+ ## How to Use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+
+ # Load the tokenizer and model
+ # "your-username/llama3-dga-detector" is a placeholder repo id
+ tokenizer = AutoTokenizer.from_pretrained("your-username/llama3-dga-detector")
+ model = AutoModel.from_pretrained("your-username/llama3-dga-detector")
+
+ # Example domain classification
+ domain = "example.com"
+ inputs = tokenizer(domain, return_tensors="pt")
+ # AutoModel returns hidden states only, with no classification head
+ outputs = model(**inputs)
+ # Process outputs to interpret classification