---
license: mit
datasets:
- yahma/alpaca-cleaned
---
This repo reproduces [tloen/alpaca-lora-7b](https://huggingface.co/tloen/alpaca-lora-7b), a low-rank adapter for LLaMA-7B fit on the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset.
Training on 4x H100 took about 1h15min; details are in this [W&B run](https://wandb.ai/sharpbai/alpaca-lora-reproduce/runs/08ulvstd). Note that `val_set_size=500` was used.
Training on 4x 4090 took about 4h35min; details are in this [W&B run](https://wandb.ai/sharpbai/alpaca-lora-reproduce/runs/ws16av1u). All key hyperparameters are the same.
To optimize training speed, I made the following changes to the code (a sketch of the modified section follows the list):
- set `load_in_8bit=False` to fine-tune in 16-bit instead of int8
- comment out `model = prepare_model_for_int8_training(model)` so that parameters are not cast to fp32 and gradient checkpointing is not turned on automatically
- for the 4090 runs, re-enable gradient checkpointing by adding `model.gradient_checkpointing_enable()` and `model.enable_input_require_grads()`
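The exact diff depends on the alpaca-lora revision; the following is only a minimal sketch of what the modified model-loading section looks like, assuming the structure of `finetune.py` from tloen/alpaca-lora (names are illustrative, not a verbatim copy of this repo's code):

```
# Minimal sketch of the speed-related changes, assuming the model-loading
# section of tloen/alpaca-lora's finetune.py (names are illustrative).
import torch
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=False,        # was True: fine-tune in fp16 instead of int8
    torch_dtype=torch.float16,
    device_map="auto",
)

# Commented out: keeps trainable parameters in fp16 (no fp32 cast) and does
# not force gradient checkpointing on.
# model = prepare_model_for_int8_training(model)

# 4090 runs only: memory is tighter, so turn gradient checkpointing back on.
model.gradient_checkpointing_enable()
model.enable_input_require_grads()
```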
This version of the weights was trained with the following hyperparameters:
- Epochs: 10 (best-epoch checkpoint loaded at the end)
- Batch size: 128
- Cutoff length: 512
- Learning rate: 3e-4
- Lora _r_: 16
- Lora target modules: q_proj, k_proj, v_proj, o_proj
That is:
```
python finetune.py \
    --base_model='decapoda-research/llama-7b-hf' \
    --num_epochs=10 \
    --cutoff_len=512 \
    --group_by_length \
    --val_set_size=500 \
    --output_dir='./alpaca-lora-train-H100-80G-HBM3x4-mb8' \
    --lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' \
    --lora_r=16 \
    --micro_batch_size=8 \
    --train_in_8bit False
```
Instructions for running it can be found at https://github.com/tloen/alpaca-lora.
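To sanity-check the adapter, here is a minimal inference sketch. It assumes the weights in this repo are in the standard PEFT LoRA format; `REPO_ID` is a placeholder for this repo's Hugging Face path, and the example instruction is just an illustration of the Alpaca prompt template.

```
# Minimal inference sketch (assumes a standard PEFT LoRA adapter).
import torch
from peft import PeftModel
from transformers import LlamaForCausalLM, LlamaTokenizer

REPO_ID = "<this-repo-id>"  # placeholder: replace with this repo's HF path

base = LlamaForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = PeftModel.from_pretrained(base, REPO_ID)  # attach the LoRA adapter
model.eval()

# Alpaca-style prompt (instruction-only variant).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nGive three tips for staying healthy.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```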