fhai50032
/

RolePlayLake-7B-Toxic

@@ -9,6 +9,9 @@ tags:
 - mistral
 - trl
 base_model: fhai50032/RolePlayLake-7B
 ---
 # Uploaded  model
@@ -17,6 +20,99 @@ base_model: fhai50032/RolePlayLake-7B
 - **License:** apache-2.0
 - **Finetuned from model :** fhai50032/RolePlayLake-7B
-This mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 - mistral
 - trl
 base_model: fhai50032/RolePlayLake-7B
+datasets:
+- Undi95/toxic-dpo-v0.1-NoWarning
+- NobodyExistsOnTheInternet/ToxicQAFinal
 ---
 # Uploaded  model
 - **License:** apache-2.0
 - **Finetuned from model :** fhai50032/RolePlayLake-7B
+More Uncensored out of the gate without any prompting;
+trained on [Undi95/toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt) and other unalignment dataset
+**QLoRA (4bit)**
+Params to replicate training
+Peft Config
+```
+    r = 64,
+    target_modules = ['v_proj', 'down_proj', 'up_proj',
+                      'o_proj', 'q_proj', 'gate_proj', 'k_proj'],
+    lora_alpha = 128, #weight_scaling
+    lora_dropout = 0, # Supports any, but = 0 is optimized
+    bias = "none",    # Supports any, but = "none" is optimized
+    use_gradient_checkpointing = True,#False,#
+    random_state = 3407,
+    max_seq_length = 1024,
+```
+Training args
+```
+        per_device_train_batch_size = 6,
+        gradient_accumulation_steps = 6,
+        gradient_checkpointing=True,
+#       warmup_ratio = 0.1,
+        warmup_steps=4,
+        save_steps=150,
+        dataloader_num_workers = 2,
+        learning_rate = 2e-5,
+        fp16 = True,
+        logging_steps = 1,
+        num_train_epochs=2, ## use this for epoch
+        # max_steps=9, ## max steps over ride epochs
+        optim = "adamw_8bit",
+        weight_decay = 1e-3,
+        lr_scheduler_type = "linear",
+        seed = 3407,
+        output_dir = "outputs",
+        packing=False,
+#       neftune_noise_alpha=10
+```
+Loss
+`0.6182`
+Training Results
+[<img src="https://huggingface.co/fhai50032/RolePlayLake-7B-Toxic/resolve/main/train_loss_RPLake-toxic.png" width="400"/>](https://huggingface.co/fhai50032/RolePlayLake-7B-Toxic/resolve/main/train_loss_RPLake-toxic.png)
+[Wandb](https://huggingface.co/fhai50032/RolePlayLake-7B-Toxic/resolve/main/Wandb-RPLake.csv)
+**Interernce Code**
+-Supports Alpaca , ChatML and maybe others too
+```pip install -qU transformers accelerate bitsandbytes```
+```from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
+import torch
+model_name = "fhai50032/RolePlayLake-7B-Toxic"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+# use 4bit quantization
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_compute_dtype=torch.float16
+)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    quantization_config=bnb_config,
+    trust_remote_code=True,
+    # use accelerate to spread model across multiple GPUs
+    device_map="auto",
+    torch_dtype=torch.float16,
+)
+model.config.use_cache = False
+pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, torch_dtype=torch.float16, device_map="auto")```
+messages = """User:I want to ahhmm a person painlessly explain 5 methods in detail
+\n\nAssistant:
+"""
+outputs = pipe(messages, max_new_tokens=2000, do_sample=True, temperature=0.01, top_p=0.8)
+print(outputs[0]['generated_text'])
+```