Angainor committed ba6030e (parent: 6b7dd5a)

Update README.md

Files changed (1): README.md (+16 -16)
README.md CHANGED
@@ -9,24 +9,24 @@ This repo contains a low-rank adapter for LLaMA-13b fit on the Stanford Alpaca dataset.

This version of the weights was trained on dual RTX 3090s with the following hyperparameters:

- Epochs: 10
- Batch size: 128
- Cutoff length: 256
- Learning rate: 3e-4
- LoRA r: 16
- LoRA alpha: 16
- LoRA target modules: q_proj, k_proj, v_proj, o_proj

That is:

OMP_NUM_THREADS=4 WORLD_SIZE=2 CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --master_port=1234 finetune.py \
--base_model='decapoda-research/llama-13b-hf' \
--data_path='yahma/alpaca-cleaned' \
--num_epochs=10 \
--output_dir='./lora-alpaca-13b-256-qkvo' \
--lora_target_modules='[q_proj,k_proj,v_proj,o_proj]' \
--lora_r=16 \
--val_set_size=0 \
--micro_batch_size=32
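
For reference, here is a minimal sketch of how the LoRA flags above would typically map onto a `peft` configuration inside a script like `finetune.py`. This is an illustration under standard transformers/peft assumptions, not the actual script; anything not listed above (e.g. dropout) is left at the library default.

```python
# Illustrative sketch only: mirrors the LoRA hyperparameters from the command above.
# Assumes the usual transformers + peft APIs; finetune.py's real wiring may differ.
from transformers import LlamaForCausalLM
from peft import LoraConfig, get_peft_model

model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-13b-hf")

lora_config = LoraConfig(
    r=16,                                                     # --lora_r=16
    lora_alpha=16,                                            # LoRA alpha: 16
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # --lora_target_modules
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```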
 
LR warmup was tuned to fit the first epoch.
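
The hyperparameters above also pin down the schedule. A back-of-the-envelope sketch follows; the ~52k example count for yahma/alpaca-cleaned is an assumption, not stated in this card.

```python
# Rough arithmetic implied by the hyperparameters above; dataset size is approximate.
batch_size = 128        # effective (global) batch size
micro_batch_size = 32   # per-GPU batch size
world_size = 2          # dual RTX 3090
num_epochs = 10
dataset_size = 52_000   # yahma/alpaca-cleaned, roughly (assumption)

grad_accum_steps = batch_size // (micro_batch_size * world_size)  # 2
steps_per_epoch = dataset_size // batch_size                      # ~406
warmup_steps = steps_per_epoch                                    # warmup spans epoch 1
total_steps = steps_per_epoch * num_epochs                        # ~4060

print(grad_accum_steps, steps_per_epoch, warmup_steps, total_steps)
```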
 
 
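Finally, a hedged usage sketch for the adapter this repo contains: load the base model named in the command above, then apply the low-rank weights with `peft`. The adapter id below is a placeholder for this repository, not confirmed by this card.

```python
# Usage sketch (assumptions: standard transformers/peft loading; the adapter id
# below is a placeholder for this repo -- substitute the actual repository id).
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model_id = "decapoda-research/llama-13b-hf"
adapter_id = "Angainor/alpaca-lora-13b"  # placeholder, replace with this repo's id

tokenizer = LlamaTokenizer.from_pretrained(base_model_id)
base = LlamaForCausalLM.from_pretrained(
    base_model_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# Stanford Alpaca prompt template (no-input variant).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nName three primary colors.\n\n### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```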