bigscience-bot committed on
Commit
a69a7a4
1 Parent(s): 3d3696f
Files changed (1)
  1. logs/main_log.txt +7 -0
logs/main_log.txt CHANGED
@@ -48315,3 +48315,10 @@ time (ms)
48315
  --------------------------------------------------------------------------------------------------
48316
  iteration 134200/ 152972 | consumed samples: 63630784 | elapsed time per iteration (ms): 6792.7 | learning rate: 1.823E-05 | global batch size: 512 | lm loss: 2.751855E+00 | loss scale: 524288.0 | grad norm: 49151.114 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
48317
  time (ms)
48318
+ iteration 134400/ 152972 | consumed samples: 63733184 | elapsed time per iteration (ms): 5936.0 | learning rate: 1.806E-05 | global batch size: 512 | lm loss: 2.751402E+00 | loss scale: 524288.0 | grad norm: 50151.684 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
48319
+ time (ms)
48320
+ saving checkpoint at iteration 134528 to /gpfsscratch/rech/six/commun/synched_exps/tr4c-1B3-rotary-oscar/checkpoints
48321
+ [2021-10-07 05:50:05,002] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/synched_exps/tr4c-1B3-rotary-oscar/checkpoints/global_step134528/mp_rank_00_model_states.pt
48322
+ successfully saved checkpoint at iteration 134528 to /gpfsscratch/rech/six/commun/synched_exps/tr4c-1B3-rotary-oscar/checkpoints
48323
+ time (ms) | save-checkpoint: 1703.80
48324
+ [exiting program after 1190.076720392704 minutes] datetime: 2021-10-07 05:50:06
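
The two `iteration` lines in this hunk can be cross-checked against each other: the increase in `consumed samples` should equal the number of iterations between them times the `global batch size`. The following is a minimal sketch of that check, assuming the `key: value | key: value |` layout seen in the log lines above. The helper name `parse_iteration_line` is hypothetical and is not part of the training code, and the sample lines are abbreviated to a few fields from the log.

```python
import re


def parse_iteration_line(line: str) -> dict:
    """Parse one 'iteration N/ M | key: value | ...' log line into a dict.

    Hypothetical helper for illustration; it assumes every field after the
    iteration counter has the form 'name: number'.
    """
    # Drop a leading diff marker ('+') and surrounding whitespace.
    line = line.strip().lstrip("+").strip()
    fields = {}
    for part in line.split("|"):
        part = part.strip()
        if not part:
            continue
        if part.startswith("iteration"):
            m = re.match(r"iteration\s+(\d+)\s*/\s*(\d+)", part)
            if m:
                fields["iteration"] = int(m.group(1))
                fields["total_iterations"] = int(m.group(2))
            continue
        # Split on the last colon so names such as
        # 'elapsed time per iteration (ms)' stay intact.
        key, sep, value = part.rpartition(":")
        if sep:
            fields[key.strip()] = float(value)
    return fields


# Abbreviated copies of the two iteration lines from the diff above.
prev = parse_iteration_line(
    "iteration 134200/ 152972 | consumed samples: 63630784 | global batch size: 512 |"
)
curr = parse_iteration_line(
    "+ iteration 134400/ 152972 | consumed samples: 63733184 | global batch size: 512 |"
)

iteration_delta = curr["iteration"] - prev["iteration"]
sample_delta = curr["consumed samples"] - prev["consumed samples"]
expected_delta = iteration_delta * curr["global batch size"]

print(iteration_delta, sample_delta, expected_delta)
```

If the two deltas disagree, a likely cause is skipped iterations or a batch-size change between the two log lines, which the `number of skipped iterations` field would help confirm.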