bigscience-bot
committed on
Commit
•
aa95653
1
Parent(s):
d7c12a4
new data
logs/main_log.txt +6 -0
logs/main_log.txt
CHANGED
@@ -20189,3 +20189,9 @@ saving checkpoint at iteration 54000 to /gpfsscratch/rech/six/commun/synched_e
|
|
20189 |
[2021-10-01 00:42:05,094] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: /gpfsscratch/rech/six/commun/synched_exps/tr4c-1B3-rotary-oscar/checkpoints/global_step54000/mp_rank_00_model_states.pt
|
20190 |
successfully saved checkpoint at iteration 54000 to /gpfsscratch/rech/six/commun/synched_exps/tr4c-1B3-rotary-oscar/checkpoints
|
20191 |
time (ms) | save-checkpoint: 1447.28
|
20192 |
+
iteration 54200/ 152972 | consumed samples: 22670784 | elapsed time per iteration (ms): 6964.2 | learning rate: 1.591E-04 | global batch size: 512 | lm loss: 2.897913E+00 | loss scale: 1048576.0 | grad norm: 97564.150 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
20193 |
+
time (ms)
|
20194 |
+
iteration 54400/ 152972 | consumed samples: 22773184 | elapsed time per iteration (ms): 6054.7 | learning rate: 1.588E-04 | global batch size: 512 | lm loss: 2.895984E+00 | loss scale: 524288.0 | grad norm: 44454.901 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
20195 |
+
time (ms)
|
20196 |
+
iteration 54600/ 152972 | consumed samples: 22875584 | elapsed time per iteration (ms): 6064.5 | learning rate: 1.584E-04 | global batch size: 512 | lm loss: 2.897962E+00 | loss scale: 524288.0 | grad norm: 51173.911 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 |
|
20197 |
+
time (ms)
|
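The added iteration lines share a fixed `key: value | key: value |` layout, so they can be checked mechanically. The following is a minimal sketch, not part of the training code: the function name `parse_iteration_line` and the dictionary layout are assumptions made for illustration. It parses two of the added lines and confirms that the growth in consumed samples equals the number of iterations times the global batch size.

```python
import re

# Two of the added log lines, copied verbatim from the diff above.
first = ("iteration 54200/ 152972 | consumed samples: 22670784 | "
         "elapsed time per iteration (ms): 6964.2 | learning rate: 1.591E-04 | "
         "global batch size: 512 | lm loss: 2.897913E+00 | loss scale: 1048576.0 | "
         "grad norm: 97564.150 | num zeros: 0.0 | number of skipped iterations: 0 | "
         "number of nan iterations: 0 |")
second = ("iteration 54400/ 152972 | consumed samples: 22773184 | "
          "elapsed time per iteration (ms): 6054.7 | learning rate: 1.588E-04 | "
          "global batch size: 512 | lm loss: 2.895984E+00 | loss scale: 524288.0 | "
          "grad norm: 44454.901 | num zeros: 0.0 | number of skipped iterations: 0 | "
          "number of nan iterations: 0 |")


def parse_iteration_line(line):
    """Hypothetical helper: turn one iteration line into a dict of numbers.

    Returns None when the line does not start with an 'iteration N/ M' field.
    """
    fields = [f.strip() for f in line.strip().strip("|").split("|")]
    head = re.match(r"iteration\s+(\d+)/\s*(\d+)", fields[0])
    if head is None:
        return None
    record = {
        "iteration": int(head.group(1)),
        "total_iterations": int(head.group(2)),
    }
    for field in fields[1:]:
        if ":" not in field:
            continue  # skip empty or malformed fields
        # Split on the last colon so keys such as
        # "elapsed time per iteration (ms)" stay intact.
        key, value = field.rsplit(":", 1)
        record[key.strip()] = float(value)
    return record


first_rec = parse_iteration_line(first)
second_rec = parse_iteration_line(second)

# Samples consumed between the two log points.
sample_delta = second_rec["consumed samples"] - first_rec["consumed samples"]
# Samples expected from the iteration count and the logged batch size.
expected_delta = (second_rec["iteration"] - first_rec["iteration"]) * second_rec["global batch size"]
print(sample_delta == expected_delta)
```

The same check applies between iterations 54400 and 54600. The loss scale differs between the first two lines while the skipped-iteration counter stays at 0, so a parser like this is mainly useful for tracking such fields over a longer stretch of the log.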