sharpbai committed
Commit d003444 · 1 Parent(s): 52e6752

Update README.md

Files changed (1)
  1. README.md +9 -2

README.md CHANGED
@@ -7,8 +7,15 @@ datasets:
  This repo reproduced [tloen/alpaca-lora-7b](https://huggingface.co/tloen/alpaca-lora-7b)
  fit on the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) dataset.
 
- The training log is in [W&B link](https://wandb.ai/sharpbai/alpaca-lora-reproduce/runs/08ulvstd),
- 4x H100 training for about 1h15min
+ 4x H100 training took about 1h15min; details in this [W&B run](https://wandb.ai/sharpbai/alpaca-lora-reproduce/runs/08ulvstd). Note the hyperparameter `val_set_size=500`.
+
+ 4x 4090 training took about 4h35min; details in this [W&B run](https://wandb.ai/sharpbai/alpaca-lora-reproduce/runs/ws16av1u). All key hyperparameters are the same.
+
+ To optimize running speed, I changed the following code (a minimal sketch follows the diff):
+
+ - set `load_in_8bit=False` to fine-tune in 16-bit
+ - comment out `model = prepare_model_for_int8_training(model)` so that some parameters are not cast to fp32 and gradient checkpointing stays off
+ - for the 4090, re-enable gradient checkpointing by adding `model.gradient_checkpointing_enable()` and `model.enable_input_require_grads()`
 
  This version of the weights was trained with the following hyperparameters:
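
For reference, here is a minimal sketch of how the changes above might look in tloen/alpaca-lora's `finetune.py`. The base checkpoint name and the LoRA config values are illustrative assumptions, not taken from this commit:

```python
import torch
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model
# from peft import prepare_model_for_int8_training  # no longer used

base_model = "decapoda-research/llama-7b-hf"  # assumed base checkpoint

model = LlamaForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=False,        # was True; fine-tune in 16-bit instead
    torch_dtype=torch.float16,
    device_map="auto",
)

# Commented out: prepare_model_for_int8_training would cast some parameters
# to fp32 and turn on gradient checkpointing, which slows the 16-bit run.
# model = prepare_model_for_int8_training(model)

# For the 4x 4090 run only: re-enable gradient checkpointing so activations
# fit in 24 GB of VRAM, and make the inputs require grads so checkpointing
# works with a frozen base model.
model.gradient_checkpointing_enable()
model.enable_input_require_grads()

# LoRA config shown with tloen/alpaca-lora's defaults, for illustration only.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```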