SanjanaCodes's picture
Create README.md
d2a238c verified
|
raw
history blame
1.96 kB
Llama-3.1-8B-Instruct-Secure
Repository: SanjanaCodes/Llama-3.1-8B-Instruct-Secure
License: Add License Here
Languages: English (or specify other supported languages)
Base Model: Llama-3.1-8B (or specify if different)
Library Name: transformers, PyTorch (Add library used)
Pipeline Tag: text-generation
Model Description
The Llama-3.1-8B-Instruct-Secure is a fine-tuned variant of the Llama-3.1-8B model designed to address LLM security vulnerabilities while maintaining strong performance for instruction-based tasks. It is optimized to handle:
Secure Prompt Handling: Resistant to common jailbreak and adversarial attacks.
Instruction Following: Retains instruction-based generation accuracy.
Safety and Robustness: Improved safeguards against harmful or unsafe outputs.
Key Features:
Fine-tuned for secure instruction-based generation tasks.
Includes defense mechanisms against adversarial and jailbreaking prompts.
Pre-trained on a mixture of secure and adversarial datasets to generalize against threats.
Usage
Installation
bash
Copy code
pip install transformers torch
Example
python
Copy code
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "SanjanaCodes/Llama-3.1-8B-Instruct-Secure"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Example Input
input_text = "Explain the importance of cybersecurity in simple terms."
inputs = tokenizer(input_text, return_tensors="pt")
# Generate Response
output = model.generate(**inputs, max_length=150)
print(tokenizer.decode(output[0], skip_special_tokens=True))
Training Details
Dataset
Fine-tuned on a curated dataset with:
Instruction-following data.
Security-focused prompts.
Adversarial prompts for robustness.
Training Procedure
Framework: PyTorch
Hardware: GPU-enabled nodes
Optimization Techniques:
Mixed Precision Training
Gradient Checkpointing
Evaluation Metrics: Attack Success Rate (ASR), Robustness Score