|
--- |
|
language: |
|
- en |
|
license: apache-2.0 |
|
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit |
|
--- |
|
|
|
LoRA for Llama-3-8B-Instruct, trained on a dataset based on [toxicqa](https://huggingface.co/datasets/NobodyExistsOnTheInternet/toxicqa) and [toxic-dpo-v0.2](https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2). The model does not refuse to follow instructions and may give provocative answers when asked about personal views.
|
|
|
# Usage |
|
|
|
Recommended prompt format: Alpaca |
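A minimal helper for building Alpaca-style prompts (a sketch using the standard Alpaca template; the function name is illustrative):

```python
def alpaca_prompt(instruction: str, model_input: str = "") -> str:
    """Build an Alpaca-style prompt, with or without an input section."""
    if model_input:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{model_input}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(alpaca_prompt("Summarize the text below.", "Some example text."))
```

Pass the resulting string to the model as the full prompt; the model's completion follows the `### Response:` header.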
|
|
|
The repository contains PEFT and GGUF versions.

Base model for the PEFT version: [Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)

Base model for the GGUF version: [Meta-Llama-3-8B-Instruct-GGUF](https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF)
|
|
|
Use [koboldcpp](https://github.com/LostRuins/koboldcpp) or [text-generation-webui](https://github.com/oobabooga/text-generation-webui) to run it. |
|
|
|
# Training parameters |
|
|
|
- method: ORPO
- learning_rate: 1e-5
- train_batch_size: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: paged_adamw_8bit
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_steps: 1200
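The parameters above map onto TRL's `ORPOConfig` roughly as sketched below. This is not the exact training script; `output_dir` and `beta` are illustrative assumptions, and the total train batch size of 16 follows from 4 (per-device) × 4 (accumulation steps):

```python
from trl import ORPOConfig

# Sketch of the listed hyperparameters as a TRL ORPOConfig.
# `output_dir` and `beta` are assumptions, not taken from the card.
config = ORPOConfig(
    output_dir="llama3-8b-orpo-lora",   # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,      # effective batch size: 4 * 4 = 16
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1200,
    beta=0.1,                           # ORPO lambda; assumed, not reported
)
```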
|
|
|
# Usage permission |
|
|
|
You may use the contents of this repository in any manner consistent with the license and applicable law.

You are solely responsible for downloading and using the contents of this repository.

Keep in mind that content generated by the model does not reflect the views of the author or of anyone known to the author.
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|