SanjanaCodes
/

Llama-3.1-8b-Instruct-Secure

Model card Files Files and versions Community

Llama-3.1-8b-Instruct-Secure / README.md

SanjanaCodes's picture

Create README.md

d2a238c verified 3 days ago

|

1.96 kB

	Llama-3.1-8B-Instruct-Secure
	Repository: SanjanaCodes/Llama-3.1-8B-Instruct-Secure
	License: Add License Here
	Languages: English (or specify other supported languages)
	Base Model: Llama-3.1-8B (or specify if different)
	Library Name: transformers, PyTorch (Add library used)
	Pipeline Tag: text-generation

	Model Description
	The Llama-3.1-8B-Instruct-Secure is a fine-tuned variant of the Llama-3.1-8B model designed to address LLM security vulnerabilities while maintaining strong performance for instruction-based tasks. It is optimized to handle:

	Secure Prompt Handling: Resistant to common jailbreak and adversarial attacks.
	Instruction Following: Retains instruction-based generation accuracy.
	Safety and Robustness: Improved safeguards against harmful or unsafe outputs.
	Key Features:
	Fine-tuned for secure instruction-based generation tasks.
	Includes defense mechanisms against adversarial and jailbreaking prompts.
	Pre-trained on a mixture of secure and adversarial datasets to generalize against threats.
	Usage
	Installation
	bash
	Copy code
	pip install transformers torch
	Example
	python
	Copy code
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "SanjanaCodes/Llama-3.1-8B-Instruct-Secure"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name)

	# Example Input
	input_text = "Explain the importance of cybersecurity in simple terms."
	inputs = tokenizer(input_text, return_tensors="pt")

	# Generate Response
	output = model.generate(**inputs, max_length=150)
	print(tokenizer.decode(output[0], skip_special_tokens=True))
	Training Details
	Dataset
	Fine-tuned on a curated dataset with:
	Instruction-following data.
	Security-focused prompts.
	Adversarial prompts for robustness.
	Training Procedure
	Framework: PyTorch
	Hardware: GPU-enabled nodes
	Optimization Techniques:
	Mixed Precision Training
	Gradient Checkpointing
	Evaluation Metrics: Attack Success Rate (ASR), Robustness Score