|
--- |
|
language: |
|
- en |
|
license: apache-2.0 |
|
base_model: unsloth/llama-3-8b-Instruct-bnb-4bit |
|
--- |
|
|
|
LoRA for Llama-3-8B-Instruct, trained on a dataset based on [toxicqa](https://huggingface.co/datasets/NobodyExistsOnTheInternet/toxicqa) and [toxic-dpo-v0.2](https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2). The model does not refuse to follow instructions and may give provocative answers when asked about personal views.
|
|
|
# Usage |
|
|
|
Recommended prompt format: Alpaca |
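A minimal helper for building Alpaca-style prompts (a sketch using the standard Alpaca template; the function name is illustrative):

```python
def alpaca_prompt(instruction: str, model_input: str = "") -> str:
    """Build an Alpaca-style prompt, with or without an input section."""
    if model_input:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{model_input}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(alpaca_prompt("Summarize the text below.", "Some example text."))
```

Pass the resulting string to the model as the full prompt; the model's completion follows the `### Response:` header.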
|
|
|
The repository contains PEFT and GGUF versions.

Base model for the PEFT version: [Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)

Base model for the GGUF version: [Meta-Llama-3-8B-Instruct-GGUF](https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF)
|
|
|
Use [koboldcpp](https://github.com/LostRuins/koboldcpp) or [text-generation-webui](https://github.com/oobabooga/text-generation-webui) to run it. |
|
|
|
# Training parameters |
|
|
|
- method: ORPO
- learning_rate: 1e-5
- train_batch_size: 4
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: paged_adamw_8bit
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_steps: 1200
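The parameters above map onto TRL's `ORPOConfig` roughly as sketched below. This is not the exact training script; `output_dir` and `beta` are illustrative assumptions, and the total train batch size of 16 follows from 4 (per-device) × 4 (accumulation steps):

```python
from trl import ORPOConfig

# Sketch of the listed hyperparameters as a TRL ORPOConfig.
# `output_dir` and `beta` are assumptions, not taken from the card.
config = ORPOConfig(
    output_dir="llama3-8b-orpo-lora",   # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,      # effective batch size: 4 * 4 = 16
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1200,
    beta=0.1,                           # ORPO lambda; assumed, not reported
)
```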
|
|
|
# Usage permission |
|
|
|
You may use the contents of this repository in any manner consistent with the license and applicable law.

You are solely responsible for downloading and using the contents of this repository.

Keep in mind that content generated by the model does not reflect the views of the author or of anyone known to the author.
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|