---
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit
library_name: peft
---

# Model Card for GTL-HIDS: Network Flow Analysis Model

A fine-tuned LLaMA-3 model specialized in analyzing network traffic flows for intrusion detection and cybersecurity threat analysis. The model classifies each network flow as benign or malicious and explains the reasoning behind its classification.

## Model Details

### Model Description

- **Developed by:** Researchers at Rochester Institute of Technology
- **Model type:** Large Language Model fine-tuned for network security analysis
- **Language(s):** English
- **License:** [Matching base model license]
- **Finetuned from model:** unsloth/llama-3-8b-Instruct-bnb-4bit

This model implements a Generative Tabular Learning-Enhanced Hybrid Intrusion Detection System (GTL-HIDS) approach, combining the pattern recognition capabilities of language models with structured network traffic analysis.

### Model Sources

- **Repository:** [Repository URL]
- **Paper:** Based on methodology from "TabLLM: Few-shot Classification of Tabular Data with Large Language Models"

## Uses

### Direct Use

The model is designed for:

- Network traffic flow analysis
- Intrusion detection
- Network security monitoring
- Threat classification
- Security incident analysis and explanation

### Downstream Use

The model can be integrated into:

- Network monitoring systems
- Security information and event management (SIEM) systems
- Automated security response systems
- Network traffic analysis tools

### Out-of-Scope Use

This model should not be used for:

- Real-time network traffic processing without proper latency considerations
- Acting as the sole decision maker for security actions without human oversight
- Processing sensitive or personally identifiable information in network traffic
- Analyzing encrypted traffic content

## Bias, Risks, and Limitations

- The model may be biased towards patterns seen in the training data (NF-ToN-IoT dataset)
- Analysis is limited to the network flow features available in the training data
- The model may not generalize well to novel attack types not present in training
- False positives and false negatives in attack detection are possible
- Not suitable for real-time analysis of high-volume traffic due to processing latency

### Recommendations

- Use as part of a larger security infrastructure, not as a standalone solution
- Implement human oversight for critical security decisions
- Regularly update training data to include new attack patterns
- Monitor performance metrics to detect potential biases or degradation
- Validate results against other security tools

## How to Get Started with the Model

```python
from unsloth import FastLanguageModel
import torch

# Load the fine-tuned model and tokenizer in 4-bit precision
model, tokenizer = FastLanguageModel.from_pretrained(
    "path_to_model",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

# Format input. The "from"/"value" keys must match the chat template saved
# with the tokenizer; a stock Llama-3 template expects "role"/"content".
messages = [
    {"from": "human", "value": "Network flow description..."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to("cuda")

# Generate analysis
outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=128,
    use_cache=True,
)

# Decode only the newly generated tokens (skip the prompt)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```

## Training Details

### Training Data

- Dataset: NF-ToN-IoT Network Flow Dataset
- Source dataset size: approximately 17.7 million flows (20% attack samples, 80% benign)
- Training subset drawn by balanced sampling of benign and attack traffic
- Features include source/destination IPs, ports, protocol information, and traffic metrics

### Training Procedure

#### Preprocessing

- MinMax scaling of numerical features
- IP address formatting into descriptive text
- Conversion of network flows into structured text descriptions
- Balanced sampling with a maximum of 10,000 samples per class

#### Training Hyperparameters

- **Training regime:** 4-bit quantization with LoRA
- Batch size: 32
- Gradient accumulation steps: 4
- Learning rate: 2e-4
- Epochs: 3
- LoRA rank: 16
- LoRA alpha: 16
- Optimizer: AdamW 8-bit
- LR scheduler: Cosine
- Max sequence length: 2048

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Hold-out test set from NF-ToN-IoT dataset
- Balanced representation of attack types and benign traffic

#### Metrics

- Classification accuracy
- AUC-ROC score
- Precision/Recall per attack type
- False positive/negative rates

## Environmental Impact

- **Hardware Type:** NVIDIA GPU
- **Hours used:** Approximately 14 hours for training
- **Cloud Provider:** Local infrastructure
- **Compute Region:** N/A
- **Carbon Emitted:** [Not calculated]

## Technical Specifications

### Model Architecture and Objective

- Base: LLaMA-3 8B parameter model
- Adaptation: LoRA fine-tuning
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Training objective: Sequence classification with explanation generation

### Compute Infrastructure

#### Hardware

- GPU: NVIDIA GPU with at least 16GB VRAM
- Memory: 32GB+ RAM recommended

#### Software

- Python 3.8+
- PyTorch
- Unsloth library
- Transformers library
- PEFT 0.14.0

## Model Card Authors

Rochester Institute of Technology Research Team

## Model Card Contact

[Contact Information Needed]
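
## Appendix: Preprocessing Sketch

The preprocessing steps listed under Training Procedure (MinMax scaling, descriptive IP formatting, flow-to-text conversion, and balanced sampling capped per class) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' actual pipeline: the column names (`IPV4_SRC_ADDR`, `IN_BYTES`, `Attack`, and so on), the sentence wording, and all helper function names are assumptions made for the example.

```python
import ipaddress
import random
from collections import defaultdict

# Assumed column names (NetFlow-style); adjust to the actual dataset schema.
NUMERIC_FEATURES = ["IN_BYTES", "OUT_BYTES"]


def fit_min_max(flows, features):
    """Return {feature: (min, max)} computed over the given flows."""
    return {
        f: (min(row[f] for row in flows), max(row[f] for row in flows))
        for f in features
    }


def min_max_scale(value, lo, hi):
    """Scale value into [0, 1]; a constant column maps to 0.0."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def describe_ip(ip):
    """Format an IP address as a short descriptive phrase (hypothetical wording)."""
    scope = "private" if ipaddress.ip_address(ip).is_private else "public"
    return f"{scope} address {ip}"


def flow_to_text(flow, bounds):
    """Serialize one flow record into a structured text description."""
    parts = [
        f"Source is {describe_ip(flow['IPV4_SRC_ADDR'])}, port {flow['L4_SRC_PORT']}.",
        f"Destination is {describe_ip(flow['IPV4_DST_ADDR'])}, port {flow['L4_DST_PORT']}.",
        f"Protocol number is {flow['PROTOCOL']}.",
    ]
    for feature, (lo, hi) in bounds.items():
        scaled = min_max_scale(flow[feature], lo, hi)
        parts.append(f"{feature} (scaled) is {scaled:.2f}.")
    return " ".join(parts)


def balanced_sample(flows, label_key="Attack", max_per_class=10_000, seed=0):
    """Keep at most max_per_class randomly chosen flows for each label."""
    by_label = defaultdict(list)
    for row in flows:
        by_label[row[label_key]].append(row)
    rng = random.Random(seed)
    sampled = []
    for rows in by_label.values():
        sampled.extend(rng.sample(rows, min(len(rows), max_per_class)))
    return sampled


# Toy records for illustration only; not taken from the real dataset.
flows = [
    {"IPV4_SRC_ADDR": "192.168.1.10", "L4_SRC_PORT": 51512,
     "IPV4_DST_ADDR": "8.8.8.8", "L4_DST_PORT": 53, "PROTOCOL": 17,
     "IN_BYTES": 100, "OUT_BYTES": 200, "Attack": "Benign"},
    {"IPV4_SRC_ADDR": "192.168.1.11", "L4_SRC_PORT": 40000,
     "IPV4_DST_ADDR": "192.168.1.1", "L4_DST_PORT": 80, "PROTOCOL": 6,
     "IN_BYTES": 500, "OUT_BYTES": 1000, "Attack": "Benign"},
    {"IPV4_SRC_ADDR": "192.168.1.30", "L4_SRC_PORT": 4444,
     "IPV4_DST_ADDR": "192.168.1.1", "L4_DST_PORT": 22, "PROTOCOL": 6,
     "IN_BYTES": 300, "OUT_BYTES": 600, "Attack": "ddos"},
]

bounds = fit_min_max(flows, NUMERIC_FEATURES)
subset = balanced_sample(flows, max_per_class=1)
print(flow_to_text(flows[0], bounds))
```

In a real pipeline the scaling bounds would be fitted on the training split only and reused at inference time, so that the text descriptions the model sees in deployment are on the same scale as those it was trained on.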