---
base_model: unsloth/Mistral-Nemo-Instruct-2407
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
---

# Uploaded model

- **Developed by:** UsernameJustAnother
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Mistral-Nemo-Instruct-2407

Experimental RP finetune on a secret-sauce dataset using rsLoRA with r = 64, trained on a Colab A100 instance. It used about 30 GB of VRAM, and 2 epochs took roughly 3 hours of training.

```
# LoRA / PEFT settings (passed to FastLanguageModel.get_peft_model)
r = 64,
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj",],
lora_alpha = 64,
lora_dropout = 0, # Supports any, but = 0 is optimized
bias = "none",    # Supports any, but = "none" is optimized
use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
random_state = 3407,
use_rslora = True, # with rsLoRA the effective scaling is lora_alpha / sqrt(r) --> 8
loftq_config = None,

# Trainer settings (passed to TrainingArguments)
per_device_train_batch_size = 2,
gradient_accumulation_steps = 4,
warmup_steps = 5,
num_train_epochs = 2,
learning_rate = 2e-5, # down from 2e-4, could go down to (5e-5 then 1e-5)
fp16 = not is_bfloat16_supported(),
bf16 = is_bfloat16_supported(),
logging_steps = 1,
optim = "adamw_8bit",
weight_decay = 0.01,
lr_scheduler_type = "linear",
seed = 3407,
```

This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
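
For context, here is a minimal sketch of how settings like these are typically wired into Unsloth's `FastLanguageModel.get_peft_model` and TRL's `SFTTrainer`. The dataset path, `dataset_text_field`, and `max_seq_length` below are placeholders, not the actual values used for this finetune.

```
# Sketch only: the dataset file, text field, and max_seq_length are placeholders.
from unsloth import FastLanguageModel, is_bfloat16_supported
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

max_seq_length = 8192  # placeholder; choose to fit your data and VRAM

# Load the base instruct model (bf16 on an A100, no 4-bit quantization).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Mistral-Nemo-Instruct-2407",
    max_seq_length = max_seq_length,
    dtype = None,          # auto-detects bf16 on A100
    load_in_4bit = False,
)

# Attach the rsLoRA adapters with the settings from the card above.
model = FastLanguageModel.get_peft_model(
    model,
    r = 64,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 64,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = 3407,
    use_rslora = True,
    loftq_config = None,
)

# Placeholder dataset; the actual RP dataset is not public.
dataset = load_dataset("json", data_files = "rp_dataset.jsonl", split = "train")

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",   # placeholder field name
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        num_train_epochs = 2,
        learning_rate = 2e-5,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)

trainer.train()
```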