---
license: apache-2.0
datasets:
- pankajmathur/WizardLM_Orca
language:
- en
pipeline_tag: text-generation
---

---
base_model: mistralai/Mistral-7B-v0.1 (=checkpoint-v1)
base_model: mistralai/Mistral-7B-v0.2 (>=checkpoint-v2)
---

## Reverse Instruct LoRA Adapter

This LoRA adapter is fine-tuned to reverse engineer the original prompt of a given LLM output/response.

## Response Format

"[INST]\n### System:\n{system}\n### Instruction:\n{instruction}\n[/INST]\n"

(without the surrounding quotes)

## Prompt Template

"\n### System:\nYou craft instructions for generating the given output through reverse engineering.\n### Instruction:\nDecipher the steps used to produce the given output and articulate a refined set of instructions (System & Instruction).\n### OUTPUT:\n {output}"

(use the template without the surrounding quotes)

## Training Dataset

About 21k items from the following datasets were used (the items removed were mostly coding-like tasks).

```bash
wget https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/main/data/alpaca_gpt4_data.json
wget https://raw.githubusercontent.com/teknium1/GPTeacher/main/Roleplay%20Supplemental/roleplay-instruct-v2.1.json
wget https://huggingface.co/datasets/pankajmathur/WizardLM_Orca/resolve/main/wizardlm_orca.json
```

## Training Procedure

```bash
CUDA_VISIBLE_DEVICES=0 WANDB_DISABLED=True python LLaMA-Factory/src/train_bash.py \
    --stage sft \
    --model_name_or_path model_name_or_path \
    --checkpoint_dir checkpoint_dir \
    --do_train \
    --dataset default \
    --template vanilla \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir path_to_sft_checkpoint \
    --overwrite_cache \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 100 \
    --learning_rate 5e-5 \
    --num_train_epochs 3.0 \
    --plot_loss \
    --fp16 \
    --overwrite_output_dir \
    --cutoff_len 2048 \
    --quantization_bit 4
```

## Training Time

- ~30h on Kaggle's P100 GPU

### Framework versions

- LLaMA-Factory
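
## Example: Building the Prompt and Parsing the Response

The prompt template and response format described above can be exercised with plain string handling before any model is involved. The following is a minimal sketch, not part of the original training or inference code: the helper names `build_prompt` and `parse_response` are hypothetical, and the regular expression assumes the model reproduces the response format exactly as documented.

```python
import re

# Prompt template copied from the "Prompt Template" section of this card.
PROMPT_TEMPLATE = (
    "\n### System:\nYou craft instructions for generating the given output "
    "through reverse engineering.\n### Instruction:\nDecipher the steps used "
    "to produce the given output and articulate a refined set of instructions "
    "(System & Instruction).\n### OUTPUT:\n {output}"
)

# Pattern mirroring the "Response Format" section. Assumption: the model
# emits the markers verbatim; real generations may deviate.
RESPONSE_PATTERN = re.compile(
    r"\[INST\]\n### System:\n(?P<system>.*?)"
    r"\n### Instruction:\n(?P<instruction>.*?)\n\[/INST\]",
    re.DOTALL,
)


def build_prompt(output: str) -> str:
    """Insert an LLM output into the prompt template (hypothetical helper).

    str.replace is used instead of str.format so that curly braces inside
    `output` are left untouched.
    """
    return PROMPT_TEMPLATE.replace("{output}", output)


def parse_response(text: str):
    """Extract the system and instruction parts from an adapter response.

    Returns a dict with "system" and "instruction" keys, or None when the
    text does not follow the documented response format.
    """
    match = RESPONSE_PATTERN.search(text)
    if match is None:
        return None
    return {
        "system": match.group("system").strip(),
        "instruction": match.group("instruction").strip(),
    }


if __name__ == "__main__":
    print(build_prompt("Roses are red, violets are blue."))
    sample = (
        "[INST]\n### System:\nYou are a poet.\n"
        "### Instruction:\nWrite a short rhyme.\n[/INST]\n"
    )
    print(parse_response(sample))
```

The string returned by `build_prompt` would be passed to the base model with this adapter loaded, and the generated text handed to `parse_response`. Returning `None` on a mismatch lets the caller decide how to handle generations that do not follow the format.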