StarCycle committed
Commit 5b5c3a7
1 Parent(s): 6f3de67

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -90,7 +90,7 @@ xtuner train ./llava_internlm2_chat_1_8b_clip_vit_large_p14_336_e1_gpu1_pretrain
  NPROC_PER_NODE=8 xtuner train ./llava_internlm2_chat_1_8b_clip_vit_large_p14_336_e1_gpu1_pretrain.py --deepspeed deepspeed_zero2
  ```
 
- #### Remember to change the batch size and gradient accumulation parameters. So your batch_size*gradient_accumulation is roughly equal to mine to reproduce the result.
+ #### Remember to change the batch size and gradient accumulation parameters to fit your hardware. Keep your batch_size*gradient_accumulation roughly equal to mine to reproduce the result.
 
  The checkpoint and tensorboard logs are saved by default in ./work_dirs/. I only train it for 1 epoch to be the same as the original LLaVA paper. Some studies also report that training for multiple epochs makes the model overfit the training dataset and perform worse in other domains.
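The rule in the changed line — keep the effective batch size (per-GPU batch size × gradient accumulation × number of GPUs) roughly constant when adapting to your hardware — can be sketched as follows. The numbers below are hypothetical for illustration; check the actual values in the xtuner config file for the reference run.

```python
# Sketch of the effective-batch-size rule from the README note.
# Hypothetical reference numbers (not from the actual config):
# suppose the reference run used batch_size=32, accumulation=1, on 8 GPUs.

def effective_batch(batch_size: int, grad_accum: int, n_gpus: int) -> int:
    """Total samples contributing to one optimizer step."""
    return batch_size * grad_accum * n_gpus

reference = effective_batch(batch_size=32, grad_accum=1, n_gpus=8)

# On GPUs with less memory, halve the per-GPU batch size and
# double the gradient accumulation to keep the product unchanged:
adjusted = effective_batch(batch_size=16, grad_accum=2, n_gpus=8)

assert adjusted == reference  # same effective batch, so comparable results
```

In an xtuner config these would correspond to the per-device batch size and the gradient accumulation count; only their product (times the GPU count) needs to match the reference run to reproduce the result.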