
Model Information

```python
# Model and dataset
model_name = "NousResearch/Llama-2-7b-chat-hf"
dataset_name = "synapsecai/synthetic-sensitive-information"

# QLoRA parameters
lora_r = 32
lora_alpha = 8
lora_dropout = 0.1

# BitsAndBytes parameters
use_4bit = True
bnb_4bit_compute_dtype = "float16"
bnb_4bit_quant_type = "nf4"
use_nested_quant = False

# Training Arguments parameters
num_train_epochs = 1
fp16 = False
bf16 = False
per_device_train_batch_size = 32
per_device_eval_batch_size = 8
gradient_accumulation_steps = 4
gradient_checkpointing = True
max_grad_norm = 0.3
learning_rate = 2e-4
weight_decay = 0.001
optim = "paged_adamw_32bit"
lr_scheduler_type = "cosine"
max_steps = -1
warmup_ratio = 0.03
group_by_length = True
save_steps = 0
logging_steps = 25

# SFT parameters
max_seq_length = None
packing = False
```

This model is an ethically fine-tuned version of Llama 2, trained specifically to detect and flag private or sensitive information in natural text. It serves as a tool for data privacy and security, capable of identifying potentially vulnerable data such as:

- API keys
- Personally Identifiable Information (PII)
- Financial data
- Confidential business information
- Login credentials

Key Features:

- Analyzes natural language input to identify sensitive content
- Provides explanations for detected sensitive information
- Helps prevent accidental exposure of private data
- Supports responsible data handling practices
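
For illustration, here is a minimal inference sketch. It assumes the standard Hugging Face transformers text-generation pipeline and the Llama 2 chat prompt format (`[INST] ... [/INST]`); the exact prompt template the fine-tuned model expects is not documented in this card, so the instruction text and sample input below are placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "Sharpaxis/Llama-2-7_Ethical_Guardian"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Text-generation pipeline used to query the detector
detector = pipeline("text-generation", model=model, tokenizer=tokenizer)

text = (
    "Please deploy with API key sk_live_XXXXXXXXXXXX and send the invoice "
    "to jane.doe@example.com."
)
prompt = (
    "[INST] Identify and explain any private or sensitive information "
    f"in the following text:\n{text} [/INST]"
)

result = detector(prompt, max_new_tokens=256, do_sample=False)
print(result[0]["generated_text"])
```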

Use Cases:

- Content moderation
- Data loss prevention
- Compliance checks for GDPR, HIPAA, etc.
- Security audits of text-based communications
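
As an illustration of the data loss prevention use case, a pre-send gate could wrap the `detector` pipeline from the sketch above. The `contains_sensitive_info` helper and its YES/NO keyword check are assumptions made for this example, not part of the model's documented interface.

```python
def contains_sensitive_info(detector, text: str) -> bool:
    """Return True if the model's answer suggests the text contains sensitive data.

    Hypothetical helper: it keys off the model's free-text answer, so the
    keyword check below is a heuristic rather than a guaranteed contract.
    """
    prompt = (
        "[INST] Does the following text contain private or sensitive "
        f"information? Answer YES or NO, then explain:\n{text} [/INST]"
    )
    answer = detector(prompt, max_new_tokens=128, do_sample=False)[0]["generated_text"]
    # Only look at the completion after the prompt echo
    return "YES" in answer.split("[/INST]")[-1].upper()

# Example: block an outgoing message that appears to leak a credential
outgoing = "Our AWS secret key is AKIAXXXXXXXXXXXXXXXX"
if contains_sensitive_info(detector, outgoing):
    print("Blocked: message appears to contain sensitive information.")
else:
    print("Message sent.")
```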

This model aims to enhance data protection measures and promote ethical handling of sensitive information in various applications and industries.
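
For reference, the parameters listed under Model Information follow the common QLoRA fine-tuning recipe with peft and trl. The sketch below shows how they might be wired together; the `output_dir`, the dataset text field, and the trl version (one whose `SFTTrainer` still accepts `max_seq_length` and `packing` directly rather than via `SFTConfig`) are assumptions, not details taken from this card.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

# 4-bit quantization settings (use_4bit, bnb_4bit_* above)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter settings (lora_r, lora_alpha, lora_dropout above)
peft_config = LoraConfig(
    r=32,
    lora_alpha=8,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

# Training arguments mirroring the values listed above
training_args = TrainingArguments(
    output_dir="./results",  # assumed; not stated in the card
    num_train_epochs=1,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,
    logging_steps=25,
    fp16=False,
    bf16=False,
)

dataset = load_dataset("synapsecai/synthetic-sensitive-information", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumed column name; adjust to the dataset's schema
    max_seq_length=None,
    tokenizer=tokenizer,
    args=training_args,
    packing=False,
)
trainer.train()
```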


Dataset used to train Sharpaxis/Llama-2-7_Ethical_Guardian: synapsecai/synthetic-sensitive-information