
Model Description

This model was fine-tuned for a cybersecurity application using Meta-Llama-3-8B-Instruct as the base model, chosen for its strong adaptability to instruction-following tasks. Training was performed on an RTX 3090 GPU with 24 GB of memory in a Linux environment. To improve computational efficiency, QLoRA (Quantized Low-Rank Adaptation) was used as the supervised fine-tuning (SFT) method, with 4-bit quantization of the base model via bitsandbytes. This allowed effective model adaptation at a reduced memory cost while preserving the precision needed for the complex data demands of cybersecurity. Further details are available in the GitHub repository: https://github.com/ddzipp/AutoAudit
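The card does not include training code, so the snippet below is only a minimal sketch of the QLoRA setup described above: loading Meta-Llama-3-8B-Instruct in 4-bit via bitsandbytes and attaching LoRA adapters with PEFT before SFT. The LoRA rank, alpha, dropout, and target modules shown are illustrative assumptions, not values taken from this model.

```python
# Minimal QLoRA sketch (assumed setup, not the author's exact pipeline).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "meta-llama/Meta-Llama-3-8B-Instruct"

# 4-bit NF4 quantization keeps the 8B base model within a 24 GB RTX 3090.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters are trained while the quantized base weights stay frozen.
# Rank/alpha/dropout/target_modules below are placeholder choices.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```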

  • Developed by: Jiaying Li from CUHKSZ SDS

  • Base Model: Meta-Llama-3-8B-Instruct

  • Language(s) (NLP): English

  • Configuration:

    method
      stage: sft
      finetuning_type: lora

    dataset
      dataset: alpaca_en_demo,cybersecurity
      cutoff_len: 2048
      max_samples: 1000
      preprocessing_num_workers: 16
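These keys match the LLaMA-Factory SFT configuration format (an assumption based on the alpaca_en_demo dataset name and the stage/finetuning_type fields; the card does not name the toolkit). If that assumption holds, a YAML file containing these entries would typically be launched with `llamafactory-cli train <config>.yaml`.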
