StarCycle committed on
Commit caa8961
1 Parent(s): 6dc926d

Update README.md

Files changed (1): README.md (+4 -4)
README.md CHANGED
@@ -204,20 +204,20 @@ NPROC_PER_NODE=4 xtuner train ./pretrain.py --deepspeed deepspeed_zero2
 The checkpoints and TensorBoard logs are saved by default in ./work_dirs/. I only train it for 1 epoch, the same as the original LLaVA paper. Some researchers also report that training for multiple epochs makes the model overfit the training dataset and perform worse in other domains.
 
 This is my loss curve for llava-siglip-internlm2-1_8b-pretrain-v1:
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/iNxPxfOvSJq8ZPz8uP_sP.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/geoWP80yE5wzG1e6ZJTEy.png)
 
 And the learning rate curve:
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/U1U9Kapcd6AIEUySvt2RS.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/hy8ulNnvy1Y7fE1ZNnHRN.png)
 
 2. Instruction following fine-tuning
 ```
 NPROC_PER_NODE=4 xtuner train ./finetune.py --deepspeed deepspeed_zero2
 ```
 Here is my loss curve (the curve fluctuates strongly because the batch size is small, and I only record the batch loss instead of the epoch loss):
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/kby2Y1dixeTaALliZ4pJa.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/IZVjtlw4zPw-61p8dT8nL.png)
 
 And the learning rate curve:
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/7ue98bikCOU7ub2jEHrom.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/642a298ae5f33939cf3ee600/81VD13-zwFsYqkfUyyntJ.png)
 
 ## Transfer the checkpoints to Huggingface safetensor format
 ```
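
For reference, a minimal sketch of how an xtuner .pth checkpoint is typically converted to Hugging Face format with xtuner's pth_to_hf subcommand. The config name, iteration number, and output directory below are placeholders for illustration, not paths taken from this commit:
```
# Sketch only: convert the fine-tuned xtuner checkpoint to Hugging Face format.
# Replace the iteration number and output path with the actual files under ./work_dirs/.
xtuner convert pth_to_hf ./finetune.py \
    ./work_dirs/finetune/iter_XXXX.pth \
    ./finetune_hf_output
```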