Philipp-Sc/mistral-7b-reverse-instruct


	---
	license: apache-2.0
	datasets:
	- pankajmathur/WizardLM_Orca
	language:
	- en
	pipeline_tag: text-generation
	---

	## Mistral 7b Reverse Instruct

	This LoRA adapter is fine-tuned to reverse-engineer the original prompt from a given LLM output/response.

	- base_model: mistralai/Mistral-7B-v0.1 (=checkpoint-v1)
	- base_model: mistralai/Mistral-7B-v0.2 (>=checkpoint-v2)

	For convenience, the latest model export is provided under /latest_model_export.

	## Response Format

	"[INST]\n### System:\n{system}\n### Instruction:\n{instruction}\n[/INST]\n"

	- Grammar File: [inst_format.gbnf](https://huggingface.co/Philipp-Sc/mistral-7b-reverse-instruct/blob/main/inst_format.gbnf)
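
A generated response in the format above can be split back into its two fields with a regular expression. The following is a minimal sketch, not part of the repository: the names `RESPONSE_PATTERN` and `parse_response` are hypothetical, and the pattern simply mirrors the format string shown above rather than the rules in `inst_format.gbnf`.

```python
import re

# Mirrors the response format string from this model card.
# Non-greedy groups stop at the first following section marker.
RESPONSE_PATTERN = re.compile(
    r"\[INST\]\n### System:\n(?P<system>.*?)"
    r"\n### Instruction:\n(?P<instruction>.*?)\n\[/INST\]",
    re.DOTALL,
)


def parse_response(text):
    """Return (system, instruction), or None if the text does not match."""
    match = RESPONSE_PATTERN.search(text)
    if match is None:
        return None
    return match.group("system").strip(), match.group("instruction").strip()


sample = (
    "[INST]\n### System:\nYou are a helpful assistant.\n"
    "### Instruction:\nWrite a haiku about rain.\n[/INST]\n"
)
print(parse_response(sample))
```

Returning `None` on a mismatch lets the caller decide whether to retry generation, which can happen when the output is not constrained by the grammar file.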


	## Prompt Template

	"\n### System:\nYou craft instructions for generating the given output through reverse engineering.\n### Instruction:\nDecipher the steps used to produce the given output and articulate a refined set of instructions (System & Instruction).\n### OUTPUT:\n {output}"

	(Use the template without the surrounding quotation marks.)
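
As a small illustration, the template can be filled in with plain string formatting. The helper name `build_prompt` is hypothetical; the template text is copied from above, with the escaped `\n` sequences interpreted as real newlines.

```python
# Prompt template from this model card, without the surrounding quotes.
PROMPT_TEMPLATE = (
    "\n### System:\n"
    "You craft instructions for generating the given output through reverse engineering."
    "\n### Instruction:\n"
    "Decipher the steps used to produce the given output and articulate "
    "a refined set of instructions (System & Instruction)."
    "\n### OUTPUT:\n {output}"
)


def build_prompt(output):
    """Insert the LLM output whose prompt should be reconstructed."""
    return PROMPT_TEMPLATE.format(output=output)


prompt = build_prompt("Roses are red, violets are blue.")
print(prompt)
```

The resulting string is what would be passed to the model; the reply is then expected to follow the response format described earlier.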

	## Training Dataset

	About 21k items from the following datasets were used; mostly coding-like tasks were removed.

	```bash
	wget https://raw.githubusercontent.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/main/data/alpaca_gpt4_data.json
	wget https://raw.githubusercontent.com/teknium1/GPTeacher/main/Roleplay%20Supplemental/roleplay-instruct-v2.1.json
	wget https://huggingface.co/datasets/pankajmathur/WizardLM_Orca/resolve/main/wizardlm_orca.json
	```
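
The exact preprocessing is not documented here. One plausible way to turn an ordinary (system, instruction, output) record into a reverse-instruct training example is to swap the roles: the original output goes into the prompt template, and the original prompt becomes the target in the response format. The sketch below assumes Alpaca-style field names (`instruction`, `input`, `output`) for the resulting record; both that layout and the function name `to_reverse_example` are assumptions, not taken from the repository.

```python
# Both strings are copied from this model card.
RESPONSE_FORMAT = (
    "[INST]\n### System:\n{system}\n### Instruction:\n{instruction}\n[/INST]\n"
)
PROMPT_TEMPLATE = (
    "\n### System:\n"
    "You craft instructions for generating the given output through reverse engineering."
    "\n### Instruction:\n"
    "Decipher the steps used to produce the given output and articulate "
    "a refined set of instructions (System & Instruction)."
    "\n### OUTPUT:\n {output}"
)


def to_reverse_example(system, instruction, output):
    """Swap roles: the original output is the input, the original prompt is the target."""
    return {
        "instruction": PROMPT_TEMPLATE.format(output=output),
        "input": "",  # assumed unused in this layout
        "output": RESPONSE_FORMAT.format(system=system, instruction=instruction),
    }


example = to_reverse_example(
    system="You are a poet.",
    instruction="Write a haiku.",
    output="Rain falls softly",
)
print(example["output"])
```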

	## Training Procedure

	```bash
	CUDA_VISIBLE_DEVICES=0 WANDB_DISABLED=True python LLaMA-Factory/src/train_bash.py \
	--stage sft \
	--model_name_or_path model_name_or_path \
	--checkpoint_dir checkpoint_dir \
	--flash_attn \
	--shift_attn \
	--neftune_noise_alpha 5 \
	--do_train \
	--dataset default \
	--template vanilla \
	--finetuning_type lora \
	--lora_target q_proj,v_proj \
	--output_dir path_to_sft_checkpoint \
	--overwrite_cache \
	--per_device_train_batch_size 1 \
	--gradient_accumulation_steps 1 \
	--lr_scheduler_type cosine \
	--logging_steps 10 \
	--save_steps 100 \
	--learning_rate 5e-5 \
	--num_train_epochs 3.0 \
	--plot_loss \
	--fp16 \
	--overwrite_output_dir \
	--cutoff_len 2048 \
	--quantization_bit 4
	```

	## Training Time

	- v1: ~12h on Kaggle's P100 GPU
	- v2: >30h on Kaggle's T4 x2

	### Framework versions

	- LLaMA-Factory