Reynier
/

Llama3_8B-DGA-Detector

Reynier committed on Nov 19, 2024

Commit f5bdbb9 · verified · 1 Parent(s): 540b9cd

Update README.md

README.md +31 -1

@@ -7,4 +7,34 @@ metrics:
 - recall
 base_model:
 - meta-llama/Meta-Llama-3-8B
----

+---
+# Llama3 8B Fine-Tuned for Domain Generation Algorithm Detection
+This model is a fine-tuned version of Meta's Llama3 8B, adapted to detect domains produced by **Domain Generation Algorithms (DGAs)**. Malware often uses DGAs to generate changing domain names for its command-and-control (C&C) servers, which makes DGA detection a critical challenge in cybersecurity.
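To make the idea of a DGA concrete, here is a minimal toy sketch in Python. It is not taken from any real malware family: the function names, the `WORDS` list, and the use of a seeded pseudo-random generator as a stand-in for a family's arithmetic are all assumptions made for illustration. It contrasts random-looking character domains with word-based domains, which tend to resemble legitimate names.

```python
import random
import string
from typing import List

# Hypothetical word list, for illustration only
WORDS = ["cloud", "river", "market", "green", "signal", "paper", "stone", "light"]


def arithmetic_dga(seed: int, count: int, length: int = 12, tld: str = ".com") -> List[str]:
    """Toy character-level generator: the same seed always yields the same domains."""
    rng = random.Random(seed)  # stand-in for a family-specific arithmetic routine
    return [
        "".join(rng.choice(string.ascii_lowercase) for _ in range(length)) + tld
        for _ in range(count)
    ]


def word_dga(seed: int, count: int, words_per_domain: int = 2, tld: str = ".com") -> List[str]:
    """Toy word-based generator: concatenates dictionary words into a domain label."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(WORDS) for _ in range(words_per_domain)) + tld
        for _ in range(count)
    ]


if __name__ == "__main__":
    print(arithmetic_dga(seed=7, count=3))
    print(word_dga(seed=7, count=3))
```

Because both generators are deterministic in the seed, the malware and its operator can derive the same list independently, which is the property that makes such domains "dynamic" from a defender's point of view.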
+## Model Description
+- **Base Model**: Llama3 8B (`meta-llama/Meta-Llama-3-8B`)
+- **Task**: DGA Detection
+- **Fine-Tuning Approach**: Supervised Fine-Tuning (SFT) with domain-specific data.
+- **Dataset**: A custom dataset combining DGA domains from 68 malware families with legitimate domains from the Tranco list, covering both arithmetic-based and word-based DGAs.
+- **Performance**:
+  - **Accuracy**: 94%
+  - **False Positive Rate (FPR)**: 4%
+  - Performs particularly well on hard-to-identify word-based DGAs.
+This model draws on Llama3's broad semantic understanding to classify domains as either **malicious (DGA-generated)** or **legitimate** with high precision and recall.
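The accuracy and false positive rate listed above are standard binary-classification metrics. The following is a minimal sketch of how they are computed from predictions, assuming label `1` means DGA-generated and `0` means legitimate. The function name `accuracy_and_fpr` and the sample labels are hypothetical and are not the model's evaluation data.

```python
from typing import Sequence, Tuple


def accuracy_and_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, false positive rate) for binary labels (1 = DGA, 0 = legitimate)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # DGA domains flagged as DGA
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate domains passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate domains flagged as DGA
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # DGA domains missed
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


# Hypothetical labels for illustration only
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.2f}, fpr={fpr:.2f}")  # accuracy=0.75, fpr=0.20
```

The false positive rate is measured over legitimate domains only, so it indicates how often benign traffic would be flagged regardless of how many DGA domains are in the test set.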
+## How to Use
+```python
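+import torch
+from transformers import AutoTokenizer, AutoModel
+# Load the tokenizer and model (replace the placeholder with the actual repository id)
+model_id = "your-username/llama3-dga-detector"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModel.from_pretrained(model_id)
+model.eval()
+# Tokenize an example domain and run a forward pass without tracking gradients
+domain = "example.com"
+inputs = tokenizer(domain, return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+# AutoModel returns hidden states (outputs.last_hidden_state), not class labels;
+# process the outputs to interpret the classification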