Model Card for GTL-HIDS: Network Flow Analysis Model
A fine-tuned LLaMA-3 model specialized in analyzing network traffic flows for intrusion detection and cybersecurity threat analysis. The model can classify network flows as either benign or malicious while providing detailed explanations of its analysis.
Model Details
Model Description
- Developed by: Researchers at Rochester Institute of Technology
- Model type: Large Language Model fine-tuned for network security analysis
- Language(s): English
- License: [Matching base model license]
- Finetuned from model: unsloth/llama-3-8b-Instruct-bnb-4bit
This model implements a Generative Tabular Learning-Enhanced Hybrid Intrusion Detection System (GTL-HIDS) approach, combining the pattern recognition capabilities of language models with structured network traffic analysis.
Model Sources
- Repository: [Repository URL]
- Paper: Based on methodology from "TabLLM: Few-shot Classification of Tabular Data with Large Language Models"
Uses
Direct Use
The model is designed for:
- Network traffic flow analysis
- Intrusion detection
- Network security monitoring
- Threat classification
- Security incident analysis and explanation
Downstream Use
The model can be integrated into:
- Network monitoring systems
- Security information and event management (SIEM) systems
- Automated security response systems
- Network traffic analysis tools
Out-of-Scope Use
This model should not be used for:
- Real-time network traffic processing without proper latency considerations
- Sole decision-making for security actions without human oversight
- Processing of sensitive or personally identifiable information in network traffic
- Analysis of encrypted traffic content
Bias, Risks, and Limitations
- The model may have biases towards patterns seen in the training data (NF-ToN-IoT dataset)
- Limited to analyzing specific network flow features available in the training data
- May not generalize well to novel attack types not present in training
- Potential for false positives/negatives in attack detection
- Not suitable for real-time analysis of high-volume traffic due to processing latency
Recommendations
- Use as part of a larger security infrastructure, not as a standalone solution
- Implement human oversight for critical security decisions
- Regularly update training data to include new attack patterns
- Monitor performance metrics to detect potential biases or degradation
- Validate results against other security tools
How to Get Started with the Model
```python
from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model in 4-bit precision
model, tokenizer = FastLanguageModel.from_pretrained(
    "path_to_model",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode

# Format the flow description as a chat message
messages = [
    {"from": "human", "value": "Network flow description..."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate the classification and explanation
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=128,
    use_cache=True,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
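The decoded output contains the model's benign/malicious verdict for the described flow, followed by a short explanation of its analysis.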
Training Details
Training Data
- Dataset: NF-ToN-IoT Network Flow Dataset (approximately 17.7 million flows; roughly 20% attack samples, 80% benign)
- Balanced sampling of benign and attack traffic for training
- Features include source/destination IPs, ports, protocol information, and traffic metrics
Training Procedure
Preprocessing
- MinMax scaling of numerical features
- IP address formatting into descriptive text
- Conversion of network flows into structured text descriptions (see the sketch after this list)
- Balanced sampling with max 10,000 samples per class
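The sketch below illustrates this preprocessing on a couple of flows. It assumes the public NF-ToN-IoT NetFlow column names (IPV4_SRC_ADDR, L4_SRC_PORT, and so on) and a hypothetical text template; the exact prompt format used in training is not published in this card.

```python
# Minimal preprocessing sketch. Column names follow the public NF-ToN-IoT
# schema; the text template below is an assumption, not the authors'
# released pipeline.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

flows = pd.DataFrame({
    "IPV4_SRC_ADDR": ["192.168.1.79", "192.168.1.31"],
    "L4_SRC_PORT": [53134, 47260],
    "IPV4_DST_ADDR": ["192.168.1.193", "192.168.1.194"],
    "L4_DST_PORT": [80, 502],
    "PROTOCOL": [6, 6],
    "IN_BYTES": [1540, 232],
    "OUT_BYTES": [4812, 110],
})

# MinMax-scale the numerical traffic metrics into [0, 1]
numeric_cols = ["IN_BYTES", "OUT_BYTES"]
flows[numeric_cols] = MinMaxScaler().fit_transform(flows[numeric_cols])

def flow_to_text(row: pd.Series) -> str:
    """Render one flow as a structured text description for the LLM."""
    return (
        f"A network flow from source IP {row.IPV4_SRC_ADDR} "
        f"(port {row.L4_SRC_PORT}) to destination IP {row.IPV4_DST_ADDR} "
        f"(port {row.L4_DST_PORT}) using protocol {row.PROTOCOL}, "
        f"with scaled incoming bytes {row.IN_BYTES:.3f} "
        f"and outgoing bytes {row.OUT_BYTES:.3f}."
    )

print(flow_to_text(flows.iloc[0]))
```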
Training Hyperparameters
- Training regime: 4-bit quantization with LoRA (see the configuration sketch after this list)
- Batch size: 32
- Gradient accumulation steps: 4
- Learning rate: 2e-4
- Epochs: 3
- LoRA rank: 16
- LoRA alpha: 16
- Optimizer: AdamW 8-bit
- LR scheduler: Cosine
- Max sequence length: 2048
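As a rough illustration, the sketch below shows how these hyperparameters map onto the Unsloth + TRL training stack. The toy dataset, text field name, and output directory are placeholders; this is an assumed reconstruction, not the authors' training script.

```python
from datasets import Dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/llama-3-8b-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank 16, alpha 16) to the listed target modules
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Toy stand-in for the rendered flow descriptions (placeholder data)
train_dataset = Dataset.from_dict(
    {"text": ["A network flow from 192.168.1.79 to 192.168.1.193 ... Label: Benign"]}
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    dataset_text_field="text",  # assumed field name
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=32,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        num_train_epochs=3,
        optim="adamw_8bit",
        lr_scheduler_type="cosine",
        output_dir="outputs",
    ),
)
trainer.train()
```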
Evaluation
Testing Data, Factors & Metrics
Testing Data
- Hold-out test set from NF-ToN-IoT dataset
- Balanced representation of attack types and benign traffic
Metrics
- Classification accuracy
- AUC-ROC score
- Precision/Recall per attack type
- False positive/negative rates (see the computation sketch after this list)
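A minimal sketch of computing these metrics with scikit-learn, using toy stand-in labels and scores rather than published results; per-attack-type precision/recall would apply the same call to the multiclass attack labels with average=None.

```python
# Toy labels and scores stand in for real test-set outputs (assumptions,
# not published results).
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
    roc_auc_score,
)

y_true = [0, 1, 1, 0, 1, 0]               # 0 = benign, 1 = attack
y_score = [0.1, 0.9, 0.4, 0.2, 0.8, 0.3]  # model's attack probability
y_pred = [int(s >= 0.5) for s in y_score]

accuracy = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)
precision, recall, _, _ = precision_recall_fscore_support(
    y_true, y_pred, average="binary"
)

# False positive / false negative rates from the confusion matrix
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
fpr = fp / (fp + tn)
fnr = fn / (fn + tp)
print(f"acc={accuracy:.2f} auc={auc:.2f} P={precision:.2f} "
      f"R={recall:.2f} FPR={fpr:.2f} FNR={fnr:.2f}")
```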
Environmental Impact
- Hardware Type: NVIDIA GPU
- Hours used: Approximately 14 hours for training
- Cloud Provider: Local infrastructure
- Compute Region: N/A
- Carbon Emitted: [Not calculated]
Technical Specifications
Model Architecture and Objective
- Base: LLaMA-3 8B parameter model
- Adaptation: LoRA fine-tuning
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training objective: Sequence classification with explanation generation
Compute Infrastructure
Hardware
- GPU: NVIDIA GPU with at least 16GB VRAM
- Memory: 32GB+ RAM recommended
Software
- Python 3.8+
- PyTorch
- Unsloth library
- Transformers library
- PEFT 0.14.0
Model Card Authors
Rochester Institute of Technology Research Team
Model Card Contact
[Contact Information Needed]