LoRA adapter for Llama-3-8B-Instruct, trained on a dataset based on toxicqa and toxic-dpo-v0.2. The model does not refuse to follow instructions and may give provocative answers when asked about its personal views.

Usage

Recommended prompt format: Alpaca
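
As an illustration of the recommended format, here is a minimal sketch of a prompt builder. It assumes the standard Stanford Alpaca template wording. The card does not spell out the exact template used in training, so treat the header sentences as an assumption. The function name `build_alpaca_prompt` is hypothetical.

```python
# Standard Alpaca templates (assumed; the card only says "Alpaca").
ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Return an Alpaca-style prompt; the model's reply follows '### Response:'."""
    if context.strip():
        return ALPACA_WITH_INPUT.format(
            instruction=instruction.strip(), context=context.strip()
        )
    return ALPACA_NO_INPUT.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_alpaca_prompt("Summarize the text.", "LoRA adds low-rank adapters."))
```

The resulting string can be pasted into koboldcpp or text-generation-webui, or passed to a tokenizer when running the PEFT version from Python.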

The repository contains PEFT and GGUF versions.
Base model for the PEFT version: Meta-Llama-3-8B-Instruct
Base model for the GGUF version: Meta-Llama-3-8B-Instruct-GGUF

Use koboldcpp or text-generation-webui to run it.
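
The PEFT version can presumably also be loaded directly from Python. The sketch below uses the `transformers` and `peft` libraries and makes several assumptions not stated in the card: that the base model's Hub ID is `meta-llama/Meta-Llama-3-8B-Instruct`, that the adapter files sit at the root of `gepardzik/LLama-3-8b-rogue-lora` (if they live in a subdirectory, a `subfolder` argument would be needed), and that `accelerate` is installed for `device_map="auto"`. The function name is hypothetical.

```python
# Assumed Hub IDs; check the repository layout before relying on them.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "gepardzik/LLama-3-8b-rogue-lora"


def load_model_with_adapter(base_id: str = BASE_MODEL_ID,
                            adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Heavy imports are kept inside the function so the module can be
    # imported without torch/transformers/peft being installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,  # assumes hardware with bf16 support
        device_map="auto",           # requires the accelerate package
    )
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_model_with_adapter()` downloads the base weights, which requires accepting the base model's license on the Hub.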

Training parameters

method: ORPO
learning_rate: 1e-5
train_batch_size: 4
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: paged_adamw_8bit
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 100
num_steps: 1200
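
To make the relationship between these values concrete, here is a small sketch of the arithmetic. It assumes a single training device (consistent with 4 × 4 = 16), linear warmup, and cosine decay to zero over the remaining steps. The card does not state the exact schedule shape, so the decay-to-zero form is an assumption, and the function name `lr_at_step` is hypothetical.

```python
import math

LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 4
NUM_DEVICES = 1        # assumption: implied by total_train_batch_size = 16
WARMUP_STEPS = 100
NUM_STEPS = 1200

# Effective batch size per optimizer step.
TOTAL_TRAIN_BATCH_SIZE = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Number of training examples consumed over the whole run.
EXAMPLES_SEEN = TOTAL_TRAIN_BATCH_SIZE * NUM_STEPS


def lr_at_step(step: int) -> float:
    """Learning rate under linear warmup followed by cosine decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (NUM_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(TOTAL_TRAIN_BATCH_SIZE, EXAMPLES_SEEN)
    for s in (0, 50, 100, 650, 1200):
        print(s, lr_at_step(s))
```

Under these assumptions the learning rate peaks at 1e-5 at step 100 and falls to half that value at the midpoint of the decay phase (step 650).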

Usage permission

You may use the contents of the repository in any manner consistent with the license and applicable law.
You are solely responsible for downloading and using the contents of the repository.
Keep in mind that the content generated by the model does not in any way reflect the views of the author or of anyone known to him.

gepardzik/LLama-3-8b-rogue-lora

Usage

Training parameters

Usage permission

Model tree for gepardzik/LLama-3-8b-rogue-lora