habanoz committed
Commit 2b961ba · Parent(s): 6a4a7ba

Update README.md

Files changed (1): README.md (+6 −7)
README.md CHANGED

````diff
@@ -25,13 +25,13 @@ accelerate launch $BASE_DIR/qlora/train.py \
  --num_train_epochs 1 \
  --logging_steps 1 \
  --save_strategy steps \
- --save_steps 120 \
+ --save_steps 75 \
  --save_total_limit 2 \
  --data_seed 11422 \
  --evaluation_strategy steps \
  --per_device_eval_batch_size 4 \
  --eval_dataset_size 0.01 \
- --eval_steps 120 \
+ --eval_steps 75 \
  --max_new_tokens 1024 \
  --dataloader_num_workers 3 \
  --logging_strategy steps \
@@ -47,16 +47,15 @@ accelerate launch $BASE_DIR/qlora/train.py \
  --dataset habanoz/airoboros-3.1-no-mathjson-max-1k \
  --dataset_format airoboros_chat \
  --model_max_len 1024 \
- --per_device_train_batch_size 1 \
- --gradient_accumulation_steps 16 \
+ --per_device_train_batch_size 4 \
+ --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --adam_beta2 0.999 \
  --max_grad_norm 0.3 \
  --lora_dropout 0.0 \
  --weight_decay 0.0 \
  --seed 11422 \
- --gradient_checkpointing False \
+ --gradient_checkpointing \
  --use_flash_attention_2 \
- --ddp_find_unused_parameters False \
- --trust_remote_code True
+ --ddp_find_unused_parameters False
  ```
````
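One property of this change worth noting: swapping `--per_device_train_batch_size 1` / `--gradient_accumulation_steps 16` for `4` / `4` leaves the effective (global) batch size unchanged, so the optimizer sees the same number of samples per update. A minimal sketch of that arithmetic, assuming a single GPU (multiply by the number of processes for multi-GPU runs):

```python
# Effective batch size = per-device batch size x gradient accumulation steps x num GPUs.
# Single-GPU assumption here; the commit itself does not state the GPU count.
def effective_batch_size(per_device_bs: int, grad_accum_steps: int, num_gpus: int = 1) -> int:
    return per_device_bs * grad_accum_steps * num_gpus

old = effective_batch_size(1, 16)  # before: --per_device_train_batch_size 1, --gradient_accumulation_steps 16
new = effective_batch_size(4, 4)   # after:  --per_device_train_batch_size 4, --gradient_accumulation_steps 4
assert old == new == 16
```

Raising the per-device batch size while shrinking accumulation trades memory for fewer forward/backward passes per optimizer step; enabling `--gradient_checkpointing` in the same commit offsets the extra activation memory.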