Model parameters: d_model 2304 ffw_size 9216 kv_size 128 n_heads 18 n_layers 32 Megatron-DeepSpeed/pretrain_gpt.py --tensor-model-parallel-size 1 --pipeline-model-parallel-size 1 --num-layers 32 --hidden-size 2304 --num-attention-heads 18 --kv-channels 128 --ffn-hidden-size 9216 --seq-length 2048 --max-position-embeddings 2048 --micro-batch-size 2 --global-batch-size 512 --train-samples 22_565_693 --vocab-file gpt2/vocab.json --merge-file gpt2/merges.txt --clip-grad 1.0 --kill-switch-path kill-switch-2b2 --bf16 --optimizer adam --adam-beta1 0.9 --adam-beta2 0.999 --adam-eps 1e-8 --lr 2e-4 --min-lr 2e-5 --lr-decay-style cosine --lr-decay-samples 22_565_693 --lr-warmup-samples 225_657 --clip-grad 1.0 --weight-decay 1e-1 --log-interval 10 --save-interval 1000 --eval-interval 1000 --eval-iters 1 --tensorboard-dir tensorboard_2b2 --tensorboard-queue-size 5 --log-timers-to-tensorboard --log-batch-size-to-tensorboard --log-validation-ppl-to-tensorboard --save checkpoints_2b2 --load checkpoints_2b2 --data-path /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document --data-impl mmap --split 949,50,1 --deepspeed --deepspeed_config ds_configs/2076210.json --zero-stage 0 START 2076210: Sun Nov 27 20:40:07 EET 2022 0: 0: 0: ======================= ROCm System Management Interface ======================= 0: ================================= Concise Info ================================= 0: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 0: 0 41.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 0: 1 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 0: 2 39.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 0: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 0: 4 42.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 0: 5 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 0: 6 41.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 0: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 0: ================================================================================ 0: ============================= 
End of ROCm SMI Log ============================== 4: 4: 4: ======================= ROCm System Management Interface ======================= 4: ================================= Concise Info ================================= 4: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 4: 0 39.0c 94.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 1 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: 2 44.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 3 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: 4 45.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 5 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: 6 41.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 4: 7 38.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 4: ================================================================================ 4: ============================= End of ROCm SMI Log ============================== 6: 6: 6: ======================= ROCm System Management Interface ======================= 6: ================================= Concise Info ================================= 6: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 6: 0 42.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 1 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: 2 40.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 3 42.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: 4 46.0c 81.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 5 50.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: 6 43.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 6: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 6: ================================================================================ 6: ============================= End of ROCm SMI Log ============================== 1: 1: 1: ======================= ROCm System Management Interface ======================= 1: ================================= Concise Info ================================= 1: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 1: 0 40.0c 97.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 1 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: 2 
40.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: 4 42.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 5 48.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: 6 41.0c 93.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 1: 7 43.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 1: ================================================================================ 1: ============================= End of ROCm SMI Log ============================== 2: 2: 2: ======================= ROCm System Management Interface ======================= 2: ================================= Concise Info ================================= 2: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 2: 0 40.0c 99.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 1 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: 2 39.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 3 38.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: 4 42.0c 85.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 5 41.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: 6 42.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 2: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 2: ================================================================================ 2: ============================= End of ROCm SMI Log ============================== 3: 3: 3: ======================= ROCm System Management Interface ======================= 3: ================================= Concise Info ================================= 3: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 3: 0 46.0c 84.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 1 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: 2 41.0c 88.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 3 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: 4 45.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 5 51.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: 6 42.0c 87.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 3: 7 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 3: ================================================================================ 3: 
============================= End of ROCm SMI Log ============================== 5: 5: 5: ======================= ROCm System Management Interface ======================= 5: ================================= Concise Info ================================= 5: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 5: 0 44.0c 92.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 1 47.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: 2 39.0c 89.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 3 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: 4 43.0c 81.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 5 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: 6 36.0c 83.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 5: 7 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 5: ================================================================================ 5: ============================= End of ROCm SMI Log ============================== 7: 7: 7: ======================= ROCm System Management Interface ======================= 7: ================================= Concise Info ================================= 7: GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU% 7: 0 45.0c 90.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 1 44.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: 2 45.0c 82.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 3 45.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: 4 40.0c 86.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 5 50.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: 6 42.0c 91.0W 800Mhz 1600Mhz 0% auto 560.0W 0% 0% 7: 7 46.0c N/A 800Mhz 1600Mhz 0% auto 0.0W 0% 0% 7: ================================================================================ 7: ============================= End of ROCm SMI Log ============================== 2: Launching on nid005086 (2/8), master nid005084 port 9999, GPUs 8, CUDA: True 4: Launching on nid005088 (4/8), master nid005084 port 9999, GPUs 8, CUDA: True 3: Launching on nid005087 (3/8), master nid005084 port 9999, GPUs 8, CUDA: True 5: Launching on nid005089 (5/8), master nid005084 port 9999, GPUs 
8, CUDA: True 1: Launching on nid005085 (1/8), master nid005084 port 9999, GPUs 8, CUDA: True 7: Launching on nid005091 (7/8), master nid005084 port 9999, GPUs 8, CUDA: True 0: Launching on nid005084 (0/8), master nid005084 port 9999, GPUs 8, CUDA: True 6: Launching on nid005090 (6/8), master nid005084 port 9999, GPUs 8, CUDA: True 0: using world size: 64, data-parallel-size: 64, tensor-model-parallel size: 1, pipeline-model-parallel size: 1 0: accumulate and all-reduce gradients in fp32 for bfloat16 data type. 0: using torch.bfloat16 for parameters ... 0: ------------------------ arguments ------------------------ 0: abort_on_unmet_fused_kernel_constraints ......... False 0: accumulate_allreduce_grads_in_fp32 .............. True 0: adam_beta1 ...................................... 0.9 0: adam_beta2 ...................................... 0.999 0: adam_eps ........................................ 1e-08 0: adlr_autoresume ................................. False 0: adlr_autoresume_interval ........................ 1000 0: apply_query_key_layer_scaling ................... True 0: apply_residual_connection_post_layernorm ........ False 0: attention_dropout ............................... 0.1 0: attention_softmax_in_fp32 ....................... False 0: bert_binary_head ................................ True 0: bert_load ....................................... None 0: bf16 ............................................ True 0: bias_dropout_fusion ............................. True 0: bias_gelu_fusion ................................ True 0: biencoder_projection_dim ........................ 0 0: biencoder_shared_query_context_model ............ False 0: block_data_path ................................. None 0: checkpoint_activations .......................... False 0: checkpoint_in_cpu ............................... False 0: checkpoint_num_layers ........................... 1 0: clip_grad ....................................... 
1.0 0: codecarbon_dir .................................. None 0: consumed_train_samples .......................... 0 0: consumed_train_tokens ........................... 0 0: consumed_valid_samples .......................... 0 0: contigious_checkpointing ........................ False 0: cpu_optimizer ................................... False 0: cpu_torch_adam .................................. False 0: curriculum_learning ............................. False 0: data_impl ....................................... mmap 0: data_parallel_size .............................. 64 0: data_path ....................................... ['/scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document'] 0: dataloader_type ................................. single 0: DDP_impl ........................................ local 0: decoder_seq_length .............................. None 0: deepscale ....................................... False 0: deepscale_config ................................ None 0: deepspeed ....................................... True 0: deepspeed_activation_checkpointing .............. False 0: deepspeed_config ................................ ds_configs/2076210.json 0: deepspeed_mpi ................................... False 0: distribute_checkpointed_activations ............. False 0: distributed_backend ............................. nccl 0: embed_layernorm ................................. False 0: embedding_path .................................. None 0: encoder_seq_length .............................. 2048 0: eod_mask_loss ................................... False 0: eval_interval ................................... 1000 0: eval_iters ...................................... 1 0: eval_only ....................................... None 0: evidence_data_path .............................. None 0: exit_duration_in_mins ........................... None 0: exit_interval ................................... 
None 0: ffn_hidden_size ................................. 9216 0: finetune ........................................ False 0: fp16 ............................................ False 0: fp16_lm_cross_entropy ........................... False 0: fp32_residual_connection ........................ False 0: gigaflos_no_embeds .............................. 0 0: global_batch_size ............................... 512 0: glu_activation .................................. None 0: hidden_dropout .................................. 0.1 0: hidden_size ..................................... 2304 0: hysteresis ...................................... 2 0: ict_head_size ................................... None 0: ict_load ........................................ None 0: img_dim ......................................... 224 0: indexer_batch_size .............................. 128 0: indexer_log_interval ............................ 1000 0: inference ....................................... False 0: init_method_std ................................. 0.02 0: init_method_xavier_uniform ...................... False 0: initial_loss_scale .............................. 4294967296 0: kill_switch_path ................................ kill-switch-2b2 0: kv_channels ..................................... 128 0: layer_norm_fusion ............................... True 0: layernorm_epsilon ............................... 1e-05 0: lazy_mpu_init ................................... None 0: load ............................................ checkpoints_2b2 0: local_rank ...................................... None 0: log_batch_size_to_tensorboard ................... True 0: log_interval .................................... 10 0: log_learning_rate_to_tensorboard ................ True 0: log_level ....................................... None 0: log_level_replica ............................... None 0: log_loss_scale_to_tensorboard ................... True 0: log_num_zeros_in_grad ........................... 
False 0: log_params_norm ................................. False 0: log_path ........................................ None 0: log_timers_to_tensorboard ....................... True 0: log_validation_ppl_to_tensorboard ............... True 0: loss_on_targets_only ............................ False 0: loss_scale ...................................... None 0: loss_scale_window ............................... 1000 0: lr .............................................. 0.0002 0: lr_decay_iters .................................. None 0: lr_decay_samples ................................ 22565693 0: lr_decay_style .................................. cosine 0: lr_decay_tokens ................................. None 0: lr_warmup_fraction .............................. None 0: lr_warmup_iters ................................. 0 0: lr_warmup_samples ............................... 225657 0: make_vocab_size_divisible_by .................... 128 0: mask_prob ....................................... 0.15 0: masked_softmax_fusion ........................... True 0: max_position_embeddings ......................... 2048 0: mean_noise_span_length .......................... None 0: memory_centric_tiled_linear ..................... False 0: merge_file ...................................... gpt2/merges.txt 0: micro_batch_size ................................ 2 0: min_loss_scale .................................. 1.0 0: min_lr .......................................... 2e-05 0: mmap_warmup ..................................... False 0: no_load_optim ................................... None 0: no_load_rng ..................................... None 0: no_save_optim ................................... None 0: no_save_rng ..................................... None 0: noise_density ................................... None 0: num_attention_heads ............................. 18 0: num_channels .................................... 3 0: num_classes ..................................... 
1000 0: num_layers ...................................... 32 0: num_layers_per_virtual_pipeline_stage ........... None 0: num_workers ..................................... 2 0: onnx_safe ....................................... None 0: openai_gelu ..................................... False 0: optimizer ....................................... adam 0: optimizer_fusion ................................ True 0: override_lr_scheduler ........................... False 0: pad_vocab_size_to ............................... None 0: params_dtype .................................... torch.bfloat16 0: partition_activations ........................... False 0: patch_dim ....................................... 16 0: pipeline_model_parallel_size .................... 1 0: position_embedding_type ......................... PositionEmbeddingType.absolute 0: pp_partition_method ............................. None 0: profile_backward ................................ False 0: query_in_block_prob ............................. 0.1 0: rampup_batch_size ............................... None 0: rank ............................................ 0 0: remote_device ................................... none 0: reset_attention_mask ............................ False 0: reset_position_ids .............................. False 0: retriever_report_topk_accuracies ................ [] 0: retriever_score_scaling ......................... False 0: retriever_seq_length ............................ 256 0: reweight_loss_based_on_position_frequency ....... False 0: sample_rate ..................................... 1.0 0: save ............................................ checkpoints_2b2 0: save_interval ................................... 1000 0: scatter_gather_tensors_in_pipeline .............. True 0: scattered_embeddings ............................ False 0: seed ............................................ 1234 0: seq_length ...................................... 
2048 0: sgd_momentum .................................... 0.9 0: short_seq_prob .................................. 0.1 0: skip_train_iteration_range ...................... None 0: split ........................................... 949,50,1 0: split_transformers .............................. False 0: sync_tp_duplicated_parameters ................... False 0: synchronize_each_layer .......................... False 0: tensor_model_parallel_size ...................... 1 0: tensorboard_dir ................................. tensorboard_2b2 0: tensorboard_log_interval ........................ 1 0: tensorboard_queue_size .......................... 5 0: test_weighted_split_names ....................... None 0: test_weighted_split_paths ....................... None 0: test_weighted_split_paths_path .................. None 0: test_weighted_split_splits ...................... None 0: test_weighted_split_weights ..................... None 0: tile_factor ..................................... 1 0: titles_data_path ................................ None 0: tokenizer_name_or_path .......................... None 0: tokenizer_type .................................. GPT2BPETokenizer 0: train_iters ..................................... None 0: train_samples ................................... 22565693 0: train_tokens .................................... None 0: train_weighted_split_paths ...................... None 0: train_weighted_split_paths_path ................. None 0: universal_checkpoint ............................ False 0: use_bnb_optimizer ............................... False 0: use_checkpoint_lr_scheduler ..................... False 0: use_contiguous_buffers_in_ddp ................... True 0: use_cpu_initialization .......................... None 0: use_one_sent_docs ............................... False 0: use_pin_memory .................................. False 0: valid_num_workers ............................... 2 0: valid_weighted_split_names ...................... 
None 0: valid_weighted_split_paths ...................... None 0: valid_weighted_split_paths_path ................. None 0: valid_weighted_split_splits ..................... None 0: valid_weighted_split_weights .................... None 0: virtual_pipeline_model_parallel_size ............ None 0: vocab_extra_ids ................................. 0 0: vocab_file ...................................... gpt2/vocab.json 0: weight_decay .................................... 0.1 0: world_size ...................................... 64 0: zero_allgather_bucket_size ...................... 0.0 0: zero_contigious_gradients ....................... False 0: zero_reduce_bucket_size ......................... 0.0 0: zero_reduce_scatter ............................. False 0: zero_stage ...................................... 0 0: -------------------- end of arguments --------------------- 0: setting number of micro-batches to constant 4 0: > building GPT2BPETokenizer tokenizer ... 0: > padded vocab (size: 50257) with 47 dummy tokens (new size: 50304) 0: DeepSpeed general environment info: 0: torch install path ............... ['/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch'] 0: torch version .................... 1.13.0+rocm5.2 0: torch cuda version ............... None 0: torch hip version ................ 5.2.21151-afdc89f8 0: nvcc version ..................... None 0: deepspeed install path ........... ['/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/deepspeed'] 0: deepspeed info ................... 0.7.5, unknown, unknown 0: deepspeed wheel compiled w. ...... torch 1.13, hip 5.1 0: **** Git info for Megatron: git_hash=unknown git_branch=unknown **** 0: > initializing torch distributed ... 0: [2022-11-27 20:40:38,973] [INFO] [comm.py:633:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl 7: > setting tensorboard ... 
0: > initializing tensor model parallel with size 1 0: > initializing pipeline model parallel with size 1 0: > setting random seeds to 1234 ... 0: > initializing model parallel cuda seeds on global rank 0, model parallel rank 0, and data parallel rank 0 with model parallel seed: 3952 and data parallel seed: 1234 0: > compiling dataset index builder ... 0: make: Entering directory '/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/data' 0: make: Nothing to be done for 'default'. 0: make: Leaving directory '/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/data' 0: >>> done with dataset index builder. Compilation time: 0.100 seconds 0: > compiling and loading fused kernels ... 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax.cpp -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.cpp [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.h [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h [skipped, 
no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_cuda.cu -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.hip [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.h [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_hip.h [skipped, already hipified] 0: Total number of unsupported CUDA function calls: 0 0: 0: 0: Total number of replaced kernel launches: 87 0: [1/1] c++ scaled_upper_triang_masked_softmax_hip.cuda.o scaled_upper_triang_masked_softmax_hip.o -shared -L/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/lib -lc10 -lc10_hip -ltorch_cpu -ltorch_hip -ltorch -ltorch_python 
-L/pfs/lustrep2/projappl/project_462000125/samantao-public/rocm/rocm-5.2.3/lib -lamdhip64 -o scaled_upper_triang_masked_softmax_cuda.so 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax.cpp -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_hip.cpp [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_cuda.cu -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_hip.hip [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.h [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_hip.h [skipped, already hipified] 0: Total number of 
unsupported CUDA function calls: 0 0: 0: 0: Total number of replaced kernel launches: 63 0: ninja: no work to do. 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_cuda.cpp -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_cuda.cpp [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_cuda_kernel.cu -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_hip_kernel.hip [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/type_shim.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/compat.h [skipped, no changes] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_upper_triang_masked_softmax_hip.h [skipped, already hipified] 0: /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax.h -> /pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/Megatron-DeepSpeed/megatron/fused_kernels/scaled_masked_softmax_hip.h [skipped, already hipified] 0: Total number of unsupported CUDA function calls: 0 0: 0: 0: Total number of 
replaced kernel launches: 67 0: [1/1] c++ layer_norm_cuda.o layer_norm_hip_kernel.cuda.o -shared -L/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/venv/lib/python3.9/site-packages/torch/lib -lc10 -lc10_hip -ltorch_cpu -ltorch_hip -ltorch -ltorch_python -L/pfs/lustrep2/projappl/project_462000125/samantao-public/rocm/rocm-5.2.3/lib -lamdhip64 -o fused_mix_prec_layer_norm_cuda.so 0: >>> done with compiling and loading fused kernels. Compilation time: 17.585 seconds 0: time to initialize megatron (seconds): 60.121 0: [after megatron is initialized] datetime: 2022-11-27 20:41:00 0: building GPT model ... 0: [2022-11-27 20:41:00,174] [INFO] [utils.py:827:see_memory_usage] Before Building Model 0: [2022-11-27 20:41:00,174] [INFO] [utils.py:828:see_memory_usage] MA 0.0 GB Max_MA 0.0 GB CA 0.0 GB Max_CA 0 GB 0: [2022-11-27 20:41:00,174] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 29.15 GB, percent = 5.8% 0: SEED_LAYERS=False BASE_SEED=1234 SEED_FN=None 0: Using topology: {ProcessCoord(pipe=0, data=0, model=0): 0, ProcessCoord(pipe=0, data=1, model=0): 1, ProcessCoord(pipe=0, data=2, model=0): 2, ProcessCoord(pipe=0, data=3, model=0): 3, ProcessCoord(pipe=0, data=4, model=0): 4, ProcessCoord(pipe=0, data=5, model=0): 5, ProcessCoord(pipe=0, data=6, model=0): 6, ProcessCoord(pipe=0, data=7, model=0): 7, ProcessCoord(pipe=0, data=8, model=0): 8, ProcessCoord(pipe=0, data=9, model=0): 9, ProcessCoord(pipe=0, data=10, model=0): 10, ProcessCoord(pipe=0, data=11, model=0): 11, ProcessCoord(pipe=0, data=12, model=0): 12, ProcessCoord(pipe=0, data=13, model=0): 13, ProcessCoord(pipe=0, data=14, model=0): 14, ProcessCoord(pipe=0, data=15, model=0): 15, ProcessCoord(pipe=0, data=16, model=0): 16, ProcessCoord(pipe=0, data=17, model=0): 17, ProcessCoord(pipe=0, data=18, model=0): 18, ProcessCoord(pipe=0, data=19, model=0): 19, ProcessCoord(pipe=0, data=20, model=0): 20, ProcessCoord(pipe=0, data=21, model=0): 21, ProcessCoord(pipe=0, 
data=22, model=0): 22, ProcessCoord(pi 0: pe=0, data=23, model=0): 23, ProcessCoord(pipe=0, data=24, model=0): 24, ProcessCoord(pipe=0, data=25, model=0): 25, ProcessCoord(pipe=0, data=26, model=0): 26, ProcessCoord(pipe=0, data=27, model=0): 27, ProcessCoord(pipe=0, data=28, model=0): 28, ProcessCoord(pipe=0, data=29, model=0): 29, ProcessCoord(pipe=0, data=30, model=0): 30, ProcessCoord(pipe=0, data=31, model=0): 31, ProcessCoord(pipe=0, data=32, model=0): 32, ProcessCoord(pipe=0, data=33, model=0): 33, ProcessCoord(pipe=0, data=34, model=0): 34, ProcessCoord(pipe=0, data=35, model=0): 35, ProcessCoord(pipe=0, data=36, model=0): 36, ProcessCoord(pipe=0, data=37, model=0): 37, ProcessCoord(pipe=0, data=38, model=0): 38, ProcessCoord(pipe=0, data=39, model=0): 39, ProcessCoord(pipe=0, data=40, model=0): 40, ProcessCoord(pipe=0, data=41, model=0): 41, ProcessCoord(pipe=0, data=42, model=0): 42, ProcessCoord(pipe=0, data=43, model=0): 43, ProcessCoord(pipe=0, data=44, model=0): 44, ProcessCoord(pipe=0, data=45, model=0): 45, ProcessCoord(pipe=0, data=4 0: 6, model=0): 46, ProcessCoord(pipe=0, data=47, model=0): 47, ProcessCoord(pipe=0, data=48, model=0): 48, ProcessCoord(pipe=0, data=49, model=0): 49, ProcessCoord(pipe=0, data=50, model=0): 50, ProcessCoord(pipe=0, data=51, model=0): 51, ProcessCoord(pipe=0, data=52, model=0): 52, ProcessCoord(pipe=0, data=53, model=0): 53, ProcessCoord(pipe=0, data=54, model=0): 54, ProcessCoord(pipe=0, data=55, model=0): 55, ProcessCoord(pipe=0, data=56, model=0): 56, ProcessCoord(pipe=0, data=57, model=0): 57, ProcessCoord(pipe=0, data=58, model=0): 58, ProcessCoord(pipe=0, data=59, model=0): 59, ProcessCoord(pipe=0, data=60, model=0): 60, ProcessCoord(pipe=0, data=61, model=0): 61, ProcessCoord(pipe=0, data=62, model=0): 62, ProcessCoord(pipe=0, data=63, model=0): 63} 0: [2022-11-27 20:41:02,182] [INFO] [module.py:366:_partition_layers] Partitioning pipeline stages with method type:transformer 0: stage=0 layers=39 0: 0: 
_to_float16 0: 1: EmbeddingPipe 0: 2: <lambda> 0: 3: ParallelTransformerLayerPipe 0: 4: ParallelTransformerLayerPipe 0: 5: ParallelTransformerLayerPipe 0: 6: ParallelTransformerLayerPipe 0: 7: ParallelTransformerLayerPipe 0: 8: ParallelTransformerLayerPipe 0: 9: ParallelTransformerLayerPipe 0: 10: ParallelTransformerLayerPipe 0: 11: ParallelTransformerLayerPipe 0: 12: ParallelTransformerLayerPipe 0: 13: ParallelTransformerLayerPipe 0: 14: ParallelTransformerLayerPipe 0: 15: ParallelTransformerLayerPipe 0: 16: ParallelTransformerLayerPipe 0: 17: ParallelTransformerLayerPipe 0: 18: ParallelTransformerLayerPipe 0: 19: ParallelTransformerLayerPipe 0: 20: ParallelTransformerLayerPipe 0: 21: ParallelTransformerLayerPipe 0: 22: ParallelTransformerLayerPipe 0: 23: ParallelTransformerLayerPipe 0: 24: ParallelTransformerLayerPipe 0: 25: ParallelTransformerLayerPipe 0: 26: ParallelTransformerLayerPipe 0: 27: ParallelTransformerLayerPipe 0: 28: ParallelTransformerLayerPipe 0: 29: ParallelTransformerLayerPipe 0: 30: ParallelTransformerLayerPipe 0: 31: ParallelTransformerLayerPipe 0: 32: ParallelTransformerLayerPipe 0: 33: ParallelTransformerLayerPipe 0: 34: ParallelTransformerLayerPipe 0: 35: undo 0: 36: MixedFusedLayerNorm 0: 37: EmbeddingPipe 0: 38: float16_to_fp32 0: loss: CrossEntropy 0: [2022-11-27 20:41:02,404] [INFO] [utils.py:827:see_memory_usage] After Building Model 0: [2022-11-27 20:41:02,404] [INFO] [utils.py:828:see_memory_usage] MA 4.03 GB Max_MA 4.03 GB CA 4.24 GB Max_CA 4 GB 0: [2022-11-27 20:41:02,404] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 29.19 GB, percent = 5.8% 0: setting training iterations to 44073 0: > learning rate decay style: cosine 0: DeepSpeed is enabled.
0: [2022-11-27 20:41:02,407] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.5, git-hash=unknown, git-branch=unknown 0: [2022-11-27 20:41:15,135] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False 0: [2022-11-27 20:41:15,135] [INFO] [logging.py:68:log_dist] [Rank 0] Removing param_group that has no 'params' in the client Optimizer 0: [2022-11-27 20:41:15,135] [INFO] [logging.py:68:log_dist] [Rank 0] Using client Optimizer as basic optimizer 0: [2022-11-27 20:41:15,153] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Basic Optimizer = FusedAdam 0: [2022-11-27 20:41:15,153] [INFO] [logging.py:68:log_dist] [Rank 0] Creating BF16 optimizer 0: [2022-11-27 20:41:15,200] [INFO] [utils.py:827:see_memory_usage] begin bf16_optimizer 0: [2022-11-27 20:41:15,200] [INFO] [utils.py:828:see_memory_usage] MA 4.02 GB Max_MA 4.04 GB CA 4.26 GB Max_CA 4 GB 0: [2022-11-27 20:41:15,200] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 29.86 GB, percent = 5.9% 1: ninja: no work to do. 
5: Time to load utils op: 0.21301913261413574 seconds 1: Time to load utils op: 0.21332812309265137 seconds 1: Time to load utils op: 0.20241618156433105 seconds 1: Time to load utils op: 0.2023482322692871 seconds 1: Time to load utils op: 0.2027125358581543 seconds 1: Time to load utils op: 0.2027134895324707 seconds 1: Time to load utils op: 0.20249390602111816 seconds 1: Time to load utils op: 0.20306038856506348 seconds 1: Time to load utils op: 0.20321106910705566 seconds 5: Time to load utils op: 0.20226669311523438 seconds 5: Time to load utils op: 0.2022697925567627 seconds 5: Time to load utils op: 0.20238685607910156 seconds 5: Time to load utils op: 0.20185375213623047 seconds 5: Time to load utils op: 0.20225214958190918 seconds 5: Time to load utils op: 0.20242667198181152 seconds 5: Time to load utils op: 0.2032032012939453 seconds 0: Time to load utils op: 0.21285557746887207 seconds 0: Time to load utils op: 0.21270442008972168 secondsTime to load utils op: 0.21277236938476562 seconds 0: Time to load utils op: 0.21056795120239258 seconds 0: Time to load utils op: 0.212843656539917 seconds 0: 0: Time to load utils op: 0.2129042148590088 secondsTime to load utils op: 0.21289730072021484 seconds 0: 2: Time to load utils op: 0.21064209938049316 seconds 2: Time to load utils op: 0.21095037460327148 seconds 2: Time to load utils op: 0.21067261695861816 seconds 2: Time to load utils op: 0.21176958084106445 seconds 2: Time to load utils op: 0.211683988571167 secondsTime to load utils op: 0.21144843101501465 secondsTime to load utils op: 0.2121579647064209 seconds 2: 2: Time to load utils op: 0.2109675407409668 seconds 2: 3: Time to load utils op: 0.20969676971435547 seconds 3: Time to load utils op: 0.20970749855041504 seconds 3: Time to load utils op: 0.20433712005615234 seconds 3: Time to load utils op: 0.2043769359588623 seconds 3: Time to load utils op: 0.20973944664001465 seconds 3: Time to load utils op: 0.20220446586608887 seconds 3: Time to load 
utils op: 0.20206522941589355 seconds 0: Time to load utils op: 0.20184087753295898 seconds 7: Time to load utils op: 0.21090435981750488 seconds 4: Time to load utils op: 0.21149516105651855 seconds 4: Time to load utils op: 0.21149969100952148 seconds 4: Time to load utils op: 0.21152281761169434 seconds 4: Time to load utils op: 0.21152663230895996 seconds 4: Time to load utils op: 0.21154117584228516 seconds 4: Time to load utils op: 0.2115461826324463 seconds 4: Time to load utils op: 0.21156525611877441 secondsTime to load utils op: 0.2115621566772461 seconds 4: 6: Time to load utils op: 0.21201610565185547 secondsTime to load utils op: 0.21202325820922852 seconds 6: 6: Time to load utils op: 0.21202540397644043 secondsTime to load utils op: 0.21202754974365234 seconds 6: 6: Time to load utils op: 0.21203970909118652 secondsTime to load utils op: 0.21204757690429688 seconds 6: 6: Time to load utils op: 0.21204352378845215 seconds 6: Time to load utils op: 0.21204328536987305 seconds 7: Time to load utils op: 0.20240259170532227 seconds 7: Time to load utils op: 0.20250439643859863 seconds 7: Time to load utils op: 0.20247697830200195 seconds 7: Time to load utils op: 0.20197296142578125 seconds 7: Time to load utils op: 0.20215201377868652 seconds 7: Time to load utils op: 0.20209527015686035 seconds 7: Time to load utils op: 0.20211291313171387 seconds 3: Time to load utils op: 0.2018575668334961 seconds 5: Time to load utils op: 0.0005044937133789062 seconds 5: Time to load utils op: 0.0004913806915283203 seconds 5: Time to load utils op: 0.0005359649658203125 secondsTime to load utils op: 0.0005559921264648438 seconds 5: 5: Time to load utils op: 0.0005099773406982422 seconds 5: Time to load utils op: 0.0005543231964111328 secondsTime to load utils op: 0.0005674362182617188 secondsTime to load utils op: 0.0005548000335693359 seconds 5: 5: 0: [2022-11-27 20:41:15,450] [INFO] [utils.py:827:see_memory_usage] before initializing group 0 0: [2022-11-27 
20:41:15,450] [INFO] [utils.py:828:see_memory_usage] MA 4.02 GB Max_MA 4.02 GB CA 4.26 GB Max_CA 4 GB 0: [2022-11-27 20:41:15,450] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 29.87 GB, percent = 5.9% 0: Time to load utils op: 0.00048279762268066406 seconds 0: Time to load utils op: 0.0004410743713378906 seconds 0: Time to load utils op: 0.0004200935363769531 seconds 0: Time to load utils op: 0.0004119873046875 seconds 0: Time to load utils op: 0.0004115104675292969 seconds 0: Time to load utils op: 0.00043392181396484375 seconds 0: Time to load utils op: 0.00039958953857421875 seconds 2: Time to load utils op: 0.0009856224060058594 seconds 2: Time to load utils op: 0.0013763904571533203 seconds 2: Time to load utils op: 0.0013272762298583984 secondsTime to load utils op: 0.0014257431030273438 seconds 2: 2: Time to load utils op: 0.0013859272003173828 seconds 2: Time to load utils op: 0.0013735294342041016 seconds 2: Time to load utils op: 0.0013501644134521484 seconds 2: Time to load utils op: 0.0013494491577148438 seconds 6: Time to load utils op: 0.0008912086486816406 seconds 6: Time to load utils op: 0.0008401870727539062 seconds 6: Time to load utils op: 0.001096487045288086 seconds 6: Time to load utils op: 0.0010957717895507812 seconds 6: Time to load utils op: 0.0010721683502197266 secondsTime to load utils op: 0.0010726451873779297 seconds 6: 6: Time to load utils op: 0.0010306835174560547 seconds 6: Time to load utils op: 0.0011444091796875 seconds 4: Time to load utils op: 0.0008699893951416016 seconds 4: Time to load utils op: 0.0009491443634033203 seconds 4: Time to load utils op: 0.0011019706726074219 seconds 4: Time to load utils op: 0.0012426376342773438 seconds 4: Time to load utils op: 0.0012717247009277344 seconds 4: Time to load utils op: 0.0011878013610839844 secondsTime to load utils op: 0.001220703125 seconds 4: 4: Time to load utils op: 0.0013086795806884766 seconds 1: Time to load utils op: 0.0004863739013671875 seconds 
1: Time to load utils op: 0.0004801750183105469 seconds 1: Time to load utils op: 0.00043463706970214844 secondsTime to load utils op: 0.0004401206970214844 seconds 1: 1: Time to load utils op: 0.0004248619079589844 secondsTime to load utils op: 0.00044918060302734375 seconds 1: 1: Time to load utils op: 0.0004215240478515625 seconds 3: Time to load utils op: 0.0004680156707763672 seconds 1: Time to load utils op: 0.0004756450653076172 seconds 3: Time to load utils op: 0.000362396240234375 seconds 3: Time to load utils op: 0.0004279613494873047 secondsTime to load utils op: 0.00043129920959472656 seconds 3: 3: Time to load utils op: 0.0003592967987060547 seconds 3: Time to load utils op: 0.0003533363342285156 seconds 3: Time to load utils op: 0.0003962516784667969 secondsTime to load utils op: 0.00038814544677734375 seconds 3: 7: Time to load utils op: 0.0005292892456054688 secondsTime to load utils op: 0.0005903244018554688 seconds 7: 7: Time to load utils op: 0.0004062652587890625 seconds 7: Time to load utils op: 0.0005574226379394531 seconds 7: Time to load utils op: 0.00043201446533203125 seconds 7: Time to load utils op: 0.00041413307189941406 seconds 7: Time to load utils op: 0.00042629241943359375 seconds 7: Time to load utils op: 0.0004096031188964844 seconds 0: [2022-11-27 20:41:15,506] [INFO] [utils.py:827:see_memory_usage] after initializing group 0 0: [2022-11-27 20:41:15,506] [INFO] [utils.py:828:see_memory_usage] MA 8.26 GB Max_MA 8.26 GB CA 10.51 GB Max_CA 11 GB 0: [2022-11-27 20:41:15,506] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,542] [INFO] [utils.py:827:see_memory_usage] before initializing group 1 0: [2022-11-27 20:41:15,543] [INFO] [utils.py:828:see_memory_usage] MA 8.26 GB Max_MA 8.26 GB CA 10.51 GB Max_CA 11 GB 0: [2022-11-27 20:41:15,543] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,577] [INFO] 
[utils.py:827:see_memory_usage] after initializing group 1 0: [2022-11-27 20:41:15,577] [INFO] [utils.py:828:see_memory_usage] MA 12.19 GB Max_MA 12.19 GB CA 16.33 GB Max_CA 16 GB 0: [2022-11-27 20:41:15,577] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,608] [INFO] [utils.py:827:see_memory_usage] before initializing group 2 0: [2022-11-27 20:41:15,609] [INFO] [utils.py:828:see_memory_usage] MA 12.19 GB Max_MA 12.19 GB CA 16.33 GB Max_CA 16 GB 0: [2022-11-27 20:41:15,609] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,645] [INFO] [utils.py:827:see_memory_usage] after initializing group 2 0: [2022-11-27 20:41:15,645] [INFO] [utils.py:828:see_memory_usage] MA 12.2 GB Max_MA 12.2 GB CA 16.33 GB Max_CA 16 GB 0: [2022-11-27 20:41:15,646] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,676] [INFO] [utils.py:827:see_memory_usage] before initialize_optimizer 0: [2022-11-27 20:41:15,676] [INFO] [utils.py:828:see_memory_usage] MA 12.2 GB Max_MA 12.2 GB CA 16.33 GB Max_CA 16 GB 0: [2022-11-27 20:41:15,676] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,711] [INFO] [utils.py:827:see_memory_usage] end initialize_optimizer 0: [2022-11-27 20:41:15,712] [INFO] [utils.py:828:see_memory_usage] MA 12.45 GB Max_MA 12.45 GB CA 16.52 GB Max_CA 17 GB 0: [2022-11-27 20:41:15,712] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,743] [INFO] [utils.py:827:see_memory_usage] end bf16_optimizer 0: [2022-11-27 20:41:15,743] [INFO] [utils.py:828:see_memory_usage] MA 12.45 GB Max_MA 12.45 GB CA 16.52 GB Max_CA 17 GB 0: [2022-11-27 20:41:15,743] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 30.02 GB, percent = 6.0% 0: [2022-11-27 20:41:15,743] 
[INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = FusedAdam 0: [2022-11-27 20:41:15,743] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed using client LR scheduler 0: [2022-11-27 20:41:15,743] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = 0: [2022-11-27 20:41:15,744] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[0.0, 0.0, 0.0], mom=[(0.9, 0.999), (0.9, 0.999), (0.9, 0.999)] 0: [2022-11-27 20:41:15,744] [INFO] [config.py:1007:print] DeepSpeedEngine configuration: 0: [2022-11-27 20:41:15,744] [INFO] [config.py:1011:print] activation_checkpointing_config { 0: "partition_activations": false, 0: "contiguous_memory_optimization": false, 0: "cpu_checkpointing": false, 0: "number_checkpoints": null, 0: "synchronize_checkpoint_boundary": false, 0: "profile": false 0: } 0: [2022-11-27 20:41:15,744] [INFO] [config.py:1011:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True} 0: [2022-11-27 20:41:15,744] [INFO] [config.py:1011:print] amp_enabled .................. False 0: [2022-11-27 20:41:15,744] [INFO] [config.py:1011:print] amp_params ................... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] autotuning_config ............ 
{ 0: "enabled": false, 0: "start_step": null, 0: "end_step": null, 0: "metric_path": null, 0: "arg_mappings": null, 0: "metric": "throughput", 0: "model_info": null, 0: "results_dir": "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/autotuning_results", 0: "exps_dir": "/pfs/lustrep4/scratch/project_462000119/muennighoff/nov-2022-bettercom/autotuning_exps", 0: "overwrite": true, 0: "fast": true, 0: "start_profile_step": 3, 0: "end_profile_step": 5, 0: "tuner_type": "gridsearch", 0: "tuner_early_stopping": 5, 0: "tuner_num_trials": 50, 0: "model_info_path": null, 0: "mp_size": 1, 0: "max_train_batch_size": null, 0: "min_train_batch_size": 1, 0: "max_train_micro_batch_size_per_gpu": 1.024000e+03, 0: "min_train_micro_batch_size_per_gpu": 1, 0: "num_tuning_micro_batch_sizes": 3 0: } 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] bfloat16_enabled ............. True 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] checkpoint_parallel_write_pipeline False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] checkpoint_tag_validation_enabled True 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] checkpoint_tag_validation_fail False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] comms_config ................. 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] communication_data_type ...... None 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] compression_config ........... 
{'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_pa 0: rameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}} 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] curriculum_enabled ........... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] curriculum_params ............ False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] dataloader_drop_last ......... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] disable_allgather ............ False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] dump_state ................... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] dynamic_loss_scale_args ...... None 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_enabled ........... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_gas_boundary_resolution 1 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_layer_name ........ 
bert.encoder.layer 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_layer_num ......... 0 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_max_iter .......... 100 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_stability ......... 1e-06 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_tol ............... 0.01 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] eigenvalue_verbose ........... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] elasticity_enabled ........... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] flops_profiler_config ........ { 0: "enabled": false, 0: "profile_step": 1, 0: "module_depth": -1, 0: "top_modules": 1, 0: "detailed": true, 0: "output_file": null 0: } 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] fp16_auto_cast ............... None 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] fp16_enabled ................. False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] fp16_master_weights_and_gradients False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] global_rank .................. 0 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] gradient_accumulation_steps .. 4 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] gradient_clipping ............ 1.0 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] gradient_predivide_factor .... 1.0 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] initial_dynamic_scale ........ 1 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] load_universal_checkpoint .... False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] loss_scale ................... 1.0 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] memory_breakdown ............. False 0: [2022-11-27 20:41:15,745] [INFO] [config.py:1011:print] monitor_config ............... 
0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] nebula_config ................ { 0: "enabled": false, 0: "persistent_storage_path": null, 0: "persistent_time_interval": 100, 0: "num_of_version_in_retention": 2, 0: "enable_nebula_load": true, 0: "load_path": null 0: } 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] optimizer_legacy_fusion ...... False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] optimizer_name ............... None 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] optimizer_params ............. None 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0} 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] pld_enabled .................. False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] pld_params ................... False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] prescale_gradients ........... False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] scheduler_name ............... None 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] scheduler_params ............. None 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] sparse_attention ............. None 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] sparse_gradients_enabled ..... False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] steps_per_print .............. 2000 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] train_batch_size ............. 512 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] train_micro_batch_size_per_gpu 2 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] use_node_local_storage ....... False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] wall_clock_breakdown ......... False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] world_size ................... 
64 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] zero_allow_untested_optimizer False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] zero_config .................. stage=0 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=500000000 allgather_partitions=True allgather_bucket_size=500000000 overlap_comm=False load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=None sub_group_size=1000000000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50000000 param_persistence_threshold=100000 model_persistence_threshold=9223372036854775807 max_live_parameters=1000000000 max_reuse_distance=1000000000 gather_16bit_weights_on_model_save=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] zero_enabled ................. False 0: [2022-11-27 20:41:15,746] [INFO] [config.py:1011:print] zero_optimization_stage ...... 0 0: [2022-11-27 20:41:15,746] [INFO] [config.py:996:print_user_config] json = { 0: "train_micro_batch_size_per_gpu": 2, 0: "train_batch_size": 512, 0: "gradient_clipping": 1.0, 0: "zero_optimization": { 0: "stage": 0 0: }, 0: "bf16": { 0: "enabled": true 0: }, 0: "steps_per_print": 2.000000e+03, 0: "wall_clock_breakdown": false 0: } 0: Time to load utils op: 0.00040149688720703125 seconds 0: [2022-11-27 20:41:15,747] [INFO] [engine.py:87:__init__] CONFIG: micro_batches=4 micro_batch_size=2 0: [2022-11-27 20:41:15,798] [INFO] [engine.py:145:__init__] RANK=0 STAGE=0 LAYERS=39 [0, 39) STAGE_PARAMS=2160013824 (2160.014M) TOTAL_PARAMS=2160013824 (2160.014M) UNIQUE_PARAMS=2160013824 (2160.014M) 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
2: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
0: [2022-11-27 20:41:15,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 6: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
6: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 1: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 2: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 5: [2022-11-27 20:41:15,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 
3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
7: [2022-11-27 20:41:15,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 0: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 7: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 
1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 2: [2022-11-27 20:41:15,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
2: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/mp_rank_00_model_states.pt. 5: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
7: [2022-11-27 20:41:15,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
7: [2022-11-27 20:41:16,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 0: [2022-11-27 20:41:16,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
1: [2022-11-27 20:41:16,279] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,279] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,281] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
6: [2022-11-27 20:41:16,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,288] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:16,288] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,293] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,293] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
2: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,295] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
3: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:16,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:16,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:16,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:16,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:16,302] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 1: [2022-11-27 20:41:16,302] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,302] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
1: [2022-11-27 20:41:16,302] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:16,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:16,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:16,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:16,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 6: [2022-11-27 20:41:16,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,306] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,306] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:16,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 4: [2022-11-27 20:41:16,308] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
4: [2022-11-27 20:41:16,308] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 2: [2022-11-27 20:41:16,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 3: [2022-11-27 20:41:16,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 
5: [2022-11-27 20:41:16,315] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,315] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 5: [2022-11-27 20:41:16,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt... 7: [2022-11-27 20:41:16,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
0: [2022-11-27 20:41:16,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,362] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,364] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,365] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,366] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
7: [2022-11-27 20:41:16,368] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,370] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,372] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,377] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,380] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
3: [2022-11-27 20:41:16,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,390] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,391] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
7: [2022-11-27 20:41:16,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,394] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,394] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 7: [2022-11-27 20:41:16,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
1: [2022-11-27 20:41:16,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,401] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,403] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,403] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
2: [2022-11-27 20:41:16,403] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,403] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,404] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,405] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,405] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,406] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 4: [2022-11-27 20:41:16,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
1: [2022-11-27 20:41:16,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 1: [2022-11-27 20:41:16,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 3: [2022-11-27 20:41:16,410] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 2: [2022-11-27 20:41:16,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,415] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,417] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
5: [2022-11-27 20:41:16,417] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 6: [2022-11-27 20:41:16,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 5: [2022-11-27 20:41:16,427] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
5: [2022-11-27 20:41:16,428] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 0: [2022-11-27 20:41:16,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_01-model_00-model_states.pt. 
3: [2022-11-27 20:41:16,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,433] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,434] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,435] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,436] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
2: [2022-11-27 20:41:16,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,445] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,448] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
5: [2022-11-27 20:41:16,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
3: [2022-11-27 20:41:16,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,745] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,747] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,778] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
6: [2022-11-27 20:41:16,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
5: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,782] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
7: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
2: [2022-11-27 20:41:16,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 6: [2022-11-27 20:41:16,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 
5: [2022-11-27 20:41:16,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 2: [2022-11-27 20:41:16,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
3: [2022-11-27 20:41:16,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
3: [2022-11-27 20:41:16,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 1: [2022-11-27 20:41:16,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
0: [2022-11-27 20:41:16,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 3: [2022-11-27 20:41:16,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:16,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:16,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
6: [2022-11-27 20:41:16,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 3: [2022-11-27 20:41:16,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:16,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:16,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
7: [2022-11-27 20:41:16,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
2: [2022-11-27 20:41:16,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 5: [2022-11-27 20:41:16,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
7: [2022-11-27 20:41:16,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:16,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:16,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 6: [2022-11-27 20:41:16,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:16,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
4: [2022-11-27 20:41:16,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 2: [2022-11-27 20:41:16,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:16,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 5: [2022-11-27 20:41:16,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 0: [2022-11-27 20:41:16,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
6: [2022-11-27 20:41:16,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 4: [2022-11-27 20:41:16,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt... 7: [2022-11-27 20:41:16,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:16,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,866] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,866] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
1: [2022-11-27 20:41:16,866] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 7: [2022-11-27 20:41:16,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:16,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,869] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,869] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:16,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:16,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
6: [2022-11-27 20:41:16,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:16,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:16,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:16,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:16,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:16,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:16,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:16,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
5: [2022-11-27 20:41:16,885] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:16,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,886] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,886] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,890] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 
4: [2022-11-27 20:41:16,901] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,902] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 1: [2022-11-27 20:41:16,904] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:16,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,921] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
0: [2022-11-27 20:41:16,921] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 4: [2022-11-27 20:41:16,923] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_03-model_00-model_states.pt. 0: [2022-11-27 20:41:16,923] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:16,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
0: [2022-11-27 20:41:16,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,951] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,952] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,959] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,959] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,959] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:16,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,088] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
6: [2022-11-27 20:41:17,091] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
7: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,094] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
4: [2022-11-27 20:41:17,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:17,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:17,096] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:17,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 6: [2022-11-27 20:41:17,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
1: [2022-11-27 20:41:17,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 5: [2022-11-27 20:41:17,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
7: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 7: [2022-11-27 20:41:17,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
0: [2022-11-27 20:41:17,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 4: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
2: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
3: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 0: [2022-11-27 20:41:17,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 2: [2022-11-27 20:41:17,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 
3: [2022-11-27 20:41:17,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 3: [2022-11-27 20:41:17,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt... 1: [2022-11-27 20:41:17,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
0: [2022-11-27 20:41:17,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
6: [2022-11-27 20:41:17,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
4: [2022-11-27 20:41:17,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
1: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
4: [2022-11-27 20:41:17,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 4: [2022-11-27 20:41:17,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 2: [2022-11-27 20:41:17,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 1: [2022-11-27 20:41:17,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 
5: [2022-11-27 20:41:17,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 3: [2022-11-27 20:41:17,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 7: [2022-11-27 20:41:17,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 6: [2022-11-27 20:41:17,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,174] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
6: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 0: [2022-11-27 20:41:17,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,177] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_04-model_00-model_states.pt. 5: [2022-11-27 20:41:17,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
3: [2022-11-27 20:41:17,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
3: [2022-11-27 20:41:17,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
4: [2022-11-27 20:41:17,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
3: [2022-11-27 20:41:17,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,435] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
3: [2022-11-27 20:41:17,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
7: [2022-11-27 20:41:17,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,445] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
7: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,449] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,452] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
6: [2022-11-27 20:41:17,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 6: [2022-11-27 20:41:17,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
4: [2022-11-27 20:41:17,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
5: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
1: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 4: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
2: [2022-11-27 20:41:17,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
1: [2022-11-27 20:41:17,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 1: [2022-11-27 20:41:17,476] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 5: [2022-11-27 20:41:17,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 2: [2022-11-27 20:41:17,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
2: [2022-11-27 20:41:17,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
0: [2022-11-27 20:41:17,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 7: [2022-11-27 20:41:17,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 0: [2022-11-27 20:41:17,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt... 3: [2022-11-27 20:41:17,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
6: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 
7: [2022-11-27 20:41:17,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
1: [2022-11-27 20:41:17,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,523] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
4: [2022-11-27 20:41:17,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,524] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 3: [2022-11-27 20:41:17,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
4: [2022-11-27 20:41:17,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 7: [2022-11-27 20:41:17,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
5: [2022-11-27 20:41:17,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 4: [2022-11-27 20:41:17,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
1: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 6: [2022-11-27 20:41:17,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,540] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 5: [2022-11-27 20:41:17,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 2: [2022-11-27 20:41:17,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
5: [2022-11-27 20:41:17,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
5: [2022-11-27 20:41:17,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 1: [2022-11-27 20:41:17,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
5: [2022-11-27 20:41:17,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_05-model_00-model_states.pt. 0: [2022-11-27 20:41:17,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,593] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,593] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,595] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,595] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,777] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
6: [2022-11-27 20:41:17,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,783] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,783] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,787] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
3: [2022-11-27 20:41:17,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,790] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
7: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
1: [2022-11-27 20:41:17,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 1: [2022-11-27 20:41:17,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
2: [2022-11-27 20:41:17,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,815] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
4: [2022-11-27 20:41:17,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
6: [2022-11-27 20:41:17,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 2: [2022-11-27 20:41:17,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 3: [2022-11-27 20:41:17,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
4: [2022-11-27 20:41:17,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 4: [2022-11-27 20:41:17,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
6: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
3: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
6: [2022-11-27 20:41:17,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,842] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 6: [2022-11-27 20:41:17,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 0: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 5: [2022-11-27 20:41:17,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt... 7: [2022-11-27 20:41:17,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
7: [2022-11-27 20:41:17,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 7: [2022-11-27 20:41:17,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:17,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,849] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
1: [2022-11-27 20:41:17,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:17,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:17,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:17,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 3: [2022-11-27 20:41:17,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
6: [2022-11-27 20:41:17,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,866] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 6: [2022-11-27 20:41:17,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:17,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
2: [2022-11-27 20:41:17,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:17,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:17,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:17,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
1: [2022-11-27 20:41:17,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:17,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
4: [2022-11-27 20:41:17,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,886] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 1: [2022-11-27 20:41:17,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:17,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,893] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,895] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 
5: [2022-11-27 20:41:17,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,897] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 2: [2022-11-27 20:41:17,898] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,898] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:17,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,900] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,903] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
5: [2022-11-27 20:41:17,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,904] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,904] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:17,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,913] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,913] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,914] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 5: [2022-11-27 20:41:17,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,914] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,914] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 0: [2022-11-27 20:41:17,914] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_06-model_00-model_states.pt. 4: [2022-11-27 20:41:17,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:17,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,939] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
0: [2022-11-27 20:41:17,940] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,940] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,941] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,946] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:17,948] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,151] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
1: [2022-11-27 20:41:18,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,177] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
1: [2022-11-27 20:41:18,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 1: [2022-11-27 20:41:18,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
4: [2022-11-27 20:41:18,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
4: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,205] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
3: [2022-11-27 20:41:18,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:18,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:18,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:18,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:18,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:18,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:18,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 3: [2022-11-27 20:41:18,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:18,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:18,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
6: [2022-11-27 20:41:18,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:18,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 6: [2022-11-27 20:41:18,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 4: [2022-11-27 20:41:18,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 5: [2022-11-27 20:41:18,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
7: [2022-11-27 20:41:18,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 7: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
2: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
0: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 0: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:18,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:18,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:18,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 2: [2022-11-27 20:41:18,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt... 
1: [2022-11-27 20:41:18,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,245] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,245] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,246] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
5: [2022-11-27 20:41:18,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,251] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
0: [2022-11-27 20:41:18,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
7: [2022-11-27 20:41:18,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 0: [2022-11-27 20:41:18,261] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
2: [2022-11-27 20:41:18,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
5: [2022-11-27 20:41:18,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 4: [2022-11-27 20:41:18,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,269] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,269] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 5: [2022-11-27 20:41:18,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,271] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 
1: [2022-11-27 20:41:18,272] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,272] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,273] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,273] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,273] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 1: [2022-11-27 20:41:18,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,273] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 6: [2022-11-27 20:41:18,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,278] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,278] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
3: [2022-11-27 20:41:18,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 2: [2022-11-27 20:41:18,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,281] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,283] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 3: [2022-11-27 20:41:18,283] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
2: [2022-11-27 20:41:18,283] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,283] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,284] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,284] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,285] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,285] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,288] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,289] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,290] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_07-model_00-model_states.pt. 7: [2022-11-27 20:41:18,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
4: [2022-11-27 20:41:18,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,294] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,294] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,295] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,295] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,296] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,297] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,299] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,302] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,304] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,454] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
3: [2022-11-27 20:41:18,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
3: [2022-11-27 20:41:18,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,471] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
6: [2022-11-27 20:41:18,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,479] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
6: [2022-11-27 20:41:18,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
7: [2022-11-27 20:41:18,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 7: [2022-11-27 20:41:18,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
3: [2022-11-27 20:41:18,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
7: [2022-11-27 20:41:18,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,536] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
7: [2022-11-27 20:41:18,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 3: [2022-11-27 20:41:18,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
4: [2022-11-27 20:41:18,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
6: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 6: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
1: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 5: [2022-11-27 20:41:18,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
2: [2022-11-27 20:41:18,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 4: [2022-11-27 20:41:18,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
2: [2022-11-27 20:41:18,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 1: [2022-11-27 20:41:18,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 6: [2022-11-27 20:41:18,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 
0: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 0: [2022-11-27 20:41:18,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 2: [2022-11-27 20:41:18,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt... 3: [2022-11-27 20:41:18,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
7: [2022-11-27 20:41:18,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 7: [2022-11-27 20:41:18,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,568] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,574] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,578] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 5: [2022-11-27 20:41:18,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,584] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,587] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,589] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,590] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,594] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,594] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,594] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,596] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
4: [2022-11-27 20:41:18,600] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,602] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,604] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,605] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,606] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,610] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,610] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,610] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
2: [2022-11-27 20:41:18,612] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,612] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,612] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,612] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 4: [2022-11-27 20:41:18,613] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 
1: [2022-11-27 20:41:18,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,616] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,616] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,620] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 0: [2022-11-27 20:41:18,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 2: [2022-11-27 20:41:18,628] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
4: [2022-11-27 20:41:18,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,635] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,636] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,636] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,637] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,637] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,637] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,637] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,638] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,639] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
4: [2022-11-27 20:41:18,639] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,640] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_08-model_00-model_states.pt. 1: [2022-11-27 20:41:18,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,643] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,650] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,652] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,661] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
6: [2022-11-27 20:41:18,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
6: [2022-11-27 20:41:18,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,797] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,806] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 3: [2022-11-27 20:41:18,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
5: [2022-11-27 20:41:18,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
4: [2022-11-27 20:41:18,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 6: [2022-11-27 20:41:18,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
6: [2022-11-27 20:41:18,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,851] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
6: [2022-11-27 20:41:18,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:18,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
2: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
6: [2022-11-27 20:41:18,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:18,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:18,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 5: [2022-11-27 20:41:18,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,871] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,871] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
5: [2022-11-27 20:41:18,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 6: [2022-11-27 20:41:18,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:18,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,875] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:18,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 3: [2022-11-27 20:41:18,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
3: [2022-11-27 20:41:18,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
4: [2022-11-27 20:41:18,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
0: [2022-11-27 20:41:18,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,890] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,890] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 7: [2022-11-27 20:41:18,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,891] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:18,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
3: [2022-11-27 20:41:18,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:18,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 0: [2022-11-27 20:41:18,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 4: [2022-11-27 20:41:18,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,897] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:18,897] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
2: [2022-11-27 20:41:18,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,901] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,901] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,906] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 5: [2022-11-27 20:41:18,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,911] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:18,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,916] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 
1: [2022-11-27 20:41:18,917] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,917] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,917] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 1: [2022-11-27 20:41:18,917] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt... 2: [2022-11-27 20:41:18,918] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:18,921] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:18,923] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,924] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,924] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,924] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
2: [2022-11-27 20:41:18,925] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 4: [2022-11-27 20:41:18,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,927] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,931] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
7: [2022-11-27 20:41:18,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,947] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,949] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,949] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,949] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,950] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,950] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,951] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:18,952] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
1: [2022-11-27 20:41:18,956] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 2: [2022-11-27 20:41:18,956] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,956] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,960] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
0: [2022-11-27 20:41:18,970] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,970] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,971] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 7: [2022-11-27 20:41:18,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:18,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,974] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,974] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,974] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,974] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 1: [2022-11-27 20:41:18,974] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 
1: [2022-11-27 20:41:18,974] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_09-model_00-model_states.pt. 0: [2022-11-27 20:41:18,984] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:18,993] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,993] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:18,998] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,003] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,003] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,005] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,127] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
6: [2022-11-27 20:41:19,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
2: [2022-11-27 20:41:19,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
2: [2022-11-27 20:41:19,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,175] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,178] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
3: [2022-11-27 20:41:19,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:19,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
4: [2022-11-27 20:41:19,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:19,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 3: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
5: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
6: [2022-11-27 20:41:19,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 6: [2022-11-27 20:41:19,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 6: [2022-11-27 20:41:19,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
6: [2022-11-27 20:41:19,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
1: [2022-11-27 20:41:19,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 2: [2022-11-27 20:41:19,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:19,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
0: [2022-11-27 20:41:19,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:19,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
1: [2022-11-27 20:41:19,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 0: [2022-11-27 20:41:19,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 5: [2022-11-27 20:41:19,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 4: [2022-11-27 20:41:19,237] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
2: [2022-11-27 20:41:19,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 2: [2022-11-27 20:41:19,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,241] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
4: [2022-11-27 20:41:19,242] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,242] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,243] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,243] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,243] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,244] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,244] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 3: [2022-11-27 20:41:19,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,260] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
4: [2022-11-27 20:41:19,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,270] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,271] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 5: [2022-11-27 20:41:19,270] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
3: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,276] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 4: [2022-11-27 20:41:19,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,276] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,277] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 
7: [2022-11-27 20:41:19,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:19,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:19,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:19,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:19,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:19,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 7: [2022-11-27 20:41:19,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt... 1: [2022-11-27 20:41:19,281] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,290] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,296] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,304] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,304] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,304] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,305] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,306] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 
0: [2022-11-27 20:41:19,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,319] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,321] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 1: [2022-11-27 20:41:19,322] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,327] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
7: [2022-11-27 20:41:19,328] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 0: [2022-11-27 20:41:19,331] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,331] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_10-model_00-model_states.pt. 7: [2022-11-27 20:41:19,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
7: [2022-11-27 20:41:19,340] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,349] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,355] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,358] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,359] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
2: [2022-11-27 20:41:19,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
6: [2022-11-27 20:41:19,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,486] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
1: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,487] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
3: [2022-11-27 20:41:19,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
6: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
5: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
5: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
5: [2022-11-27 20:41:19,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 5: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 1: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 2: [2022-11-27 20:41:19,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
2: [2022-11-27 20:41:19,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
3: [2022-11-27 20:41:19,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 2: [2022-11-27 20:41:19,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,536] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
4: [2022-11-27 20:41:19,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 6: [2022-11-27 20:41:19,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
4: [2022-11-27 20:41:19,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,554] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 1: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
5: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 5: [2022-11-27 20:41:19,558] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
5: [2022-11-27 20:41:19,561] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 0: [2022-11-27 20:41:19,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 6: [2022-11-27 20:41:19,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
6: [2022-11-27 20:41:19,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
1: [2022-11-27 20:41:19,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 3: [2022-11-27 20:41:19,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 4: [2022-11-27 20:41:19,571] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,574] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 4: [2022-11-27 20:41:19,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 
7: [2022-11-27 20:41:19,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 7: [2022-11-27 20:41:19,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt... 3: [2022-11-27 20:41:19,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
5: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,586] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,586] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,587] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
7: [2022-11-27 20:41:19,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,616] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,620] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 
0: [2022-11-27 20:41:19,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,640] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,647] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 7: [2022-11-27 20:41:19,650] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,651] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,651] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
0: [2022-11-27 20:41:19,652] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,653] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_11-model_00-model_states.pt. 0: [2022-11-27 20:41:19,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,673] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
3: [2022-11-27 20:41:19,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,794] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
3: [2022-11-27 20:41:19,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
5: [2022-11-27 20:41:19,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,827] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
3: [2022-11-27 20:41:19,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
3: [2022-11-27 20:41:19,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 1: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,847] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
2: [2022-11-27 20:41:19,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,848] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
2: [2022-11-27 20:41:19,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 3: [2022-11-27 20:41:19,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
6: [2022-11-27 20:41:19,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:19,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
4: [2022-11-27 20:41:19,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,866] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 4: [2022-11-27 20:41:19,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 3: [2022-11-27 20:41:19,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
5: [2022-11-27 20:41:19,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,875] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:19,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:19,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:19,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,884] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,885] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,890] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:19,890] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
7: [2022-11-27 20:41:19,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 7: [2022-11-27 20:41:19,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 5: [2022-11-27 20:41:19,895] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
1: [2022-11-27 20:41:19,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,899] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,900] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
0: [2022-11-27 20:41:19,900] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,902] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,903] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 2: [2022-11-27 20:41:19,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,904] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 0: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 
0: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt... 6: [2022-11-27 20:41:19,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,906] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,906] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,907] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 2: [2022-11-27 20:41:19,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:19,911] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
1: [2022-11-27 20:41:19,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,912] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,913] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 5: [2022-11-27 20:41:19,915] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,916] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:19,916] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
5: [2022-11-27 20:41:19,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,920] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,923] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,923] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,924] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,925] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,927] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
6: [2022-11-27 20:41:19,927] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,927] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,934] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 
7: [2022-11-27 20:41:19,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 6: [2022-11-27 20:41:19,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 4: [2022-11-27 20:41:19,936] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,936] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,936] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 1: [2022-11-27 20:41:19,936] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:19,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:19,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,943] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,944] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,945] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
0: [2022-11-27 20:41:19,945] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,946] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,952] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:19,952] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:19,954] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:19,954] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,955] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,955] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
7: [2022-11-27 20:41:19,955] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,958] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:19,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,961] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:19,961] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:19,962] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,962] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 7: [2022-11-27 20:41:19,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:19,970] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:19,973] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:19,974] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
0: [2022-11-27 20:41:19,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:19,984] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:19,987] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,000] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_12-model_00-model_states.pt. 0: [2022-11-27 20:41:20,005] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,010] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,010] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
0: [2022-11-27 20:41:20,020] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
5: [2022-11-27 20:41:20,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
1: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
2: [2022-11-27 20:41:20,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
7: [2022-11-27 20:41:20,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 1: [2022-11-27 20:41:20,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
4: [2022-11-27 20:41:20,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,161] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
3: [2022-11-27 20:41:20,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
3: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 3: [2022-11-27 20:41:20,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 5: [2022-11-27 20:41:20,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
1: [2022-11-27 20:41:20,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
6: [2022-11-27 20:41:20,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:20,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:20,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
6: [2022-11-27 20:41:20,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:20,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:20,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 6: [2022-11-27 20:41:20,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 7: [2022-11-27 20:41:20,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,199] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 4: [2022-11-27 20:41:20,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
7: [2022-11-27 20:41:20,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 
2: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 0: [2022-11-27 20:41:20,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt... 2: [2022-11-27 20:41:20,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,210] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
7: [2022-11-27 20:41:20,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
3: [2022-11-27 20:41:20,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 7: [2022-11-27 20:41:20,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
2: [2022-11-27 20:41:20,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 2: [2022-11-27 20:41:20,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 4: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
3: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 5: [2022-11-27 20:41:20,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,234] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,234] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
1: [2022-11-27 20:41:20,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,236] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,237] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 1: [2022-11-27 20:41:20,238] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
3: [2022-11-27 20:41:20,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,246] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 3: [2022-11-27 20:41:20,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,255] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,255] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
6: [2022-11-27 20:41:20,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,260] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,272] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 
0: [2022-11-27 20:41:20,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 6: [2022-11-27 20:41:20,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,277] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,280] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_13-model_00-model_states.pt. 0: [2022-11-27 20:41:20,300] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
0: [2022-11-27 20:41:20,301] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,302] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,471] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,493] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
2: [2022-11-27 20:41:20,502] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
7: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 7: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
5: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 5: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
2: [2022-11-27 20:41:20,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 2: [2022-11-27 20:41:20,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
5: [2022-11-27 20:41:20,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 5: [2022-11-27 20:41:20,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
1: [2022-11-27 20:41:20,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 1: [2022-11-27 20:41:20,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 3: [2022-11-27 20:41:20,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
6: [2022-11-27 20:41:20,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,519] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 
0: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 6: [2022-11-27 20:41:20,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 
0: [2022-11-27 20:41:20,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 0: [2022-11-27 20:41:20,527] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt... 4: [2022-11-27 20:41:20,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 
2: [2022-11-27 20:41:20,536] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 2: [2022-11-27 20:41:20,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 
4: [2022-11-27 20:41:20,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 5: [2022-11-27 20:41:20,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 4: [2022-11-27 20:41:20,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 3: [2022-11-27 20:41:20,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 5: [2022-11-27 20:41:20,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 1: [2022-11-27 20:41:20,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 4: [2022-11-27 20:41:20,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 
4: [2022-11-27 20:41:20,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 2: [2022-11-27 20:41:20,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 6: [2022-11-27 20:41:20,564] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
7: [2022-11-27 20:41:20,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 6: [2022-11-27 20:41:20,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 2: [2022-11-27 20:41:20,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 6: [2022-11-27 20:41:20,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 0: [2022-11-27 20:41:20,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,569] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 0: [2022-11-27 20:41:20,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
5: [2022-11-27 20:41:20,569] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 7: [2022-11-27 20:41:20,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 5: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 2: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 7: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 3: [2022-11-27 20:41:20,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 3: [2022-11-27 20:41:20,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
3: [2022-11-27 20:41:20,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 3: [2022-11-27 20:41:20,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,573] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 5: [2022-11-27 20:41:20,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 6: [2022-11-27 20:41:20,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 2: [2022-11-27 20:41:20,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt... 1: [2022-11-27 20:41:20,578] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,578] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,578] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 1: [2022-11-27 20:41:20,578] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt. 
6: [2022-11-27 20:41:20,580] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
6: [2022-11-27 20:41:20,580] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
6: [2022-11-27 20:41:20,580] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
3: [2022-11-27 20:41:20,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,584] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
2: [2022-11-27 20:41:20,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,585] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,585] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,586] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,587] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,587] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,587] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,589] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,589] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,590] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,592] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
3: [2022-11-27 20:41:20,592] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,592] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,593] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,594] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,595] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,595] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
6: [2022-11-27 20:41:20,596] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,596] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,599] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,600] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,600] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,600] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
0: [2022-11-27 20:41:20,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
0: [2022-11-27 20:41:20,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
0: [2022-11-27 20:41:20,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_14-model_00-model_states.pt.
6: [2022-11-27 20:41:20,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,602] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,603] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,605] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,606] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,606] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,616] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,619] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,625] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,627] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,628] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
2: [2022-11-27 20:41:20,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
3: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,876] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,876] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,878] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,878] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
7: [2022-11-27 20:41:20,878] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
5: [2022-11-27 20:41:20,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,879] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
6: [2022-11-27 20:41:20,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
6: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
5: [2022-11-27 20:41:20,881] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
6: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
5: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
5: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
5: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
4: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
4: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
1: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
0: [2022-11-27 20:41:20,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,883] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,884] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
6: [2022-11-27 20:41:20,885] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,885] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,885] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
5: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
0: [2022-11-27 20:41:20,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt...
7: [2022-11-27 20:41:20,891] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
3: [2022-11-27 20:41:20,891] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,891] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
2: [2022-11-27 20:41:20,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,897] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
7: [2022-11-27 20:41:20,897] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
7: [2022-11-27 20:41:20,898] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
1: [2022-11-27 20:41:20,900] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
7: [2022-11-27 20:41:20,901] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
2: [2022-11-27 20:41:20,902] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
3: [2022-11-27 20:41:20,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
3: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
3: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
1: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
1: [2022-11-27 20:41:20,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
7: [2022-11-27 20:41:20,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
3: [2022-11-27 20:41:20,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
4: [2022-11-27 20:41:20,912] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
4: [2022-11-27 20:41:20,912] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,915] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
2: [2022-11-27 20:41:20,916] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
3: [2022-11-27 20:41:20,917] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt...
1: [2022-11-27 20:41:20,917] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt.
2: [2022-11-27 20:41:20,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:20,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:20,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,925] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 5: [2022-11-27 20:41:20,925] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 5: [2022-11-27 20:41:20,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 3: [2022-11-27 20:41:20,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:20,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
4: [2022-11-27 20:41:20,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:20,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:20,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:20,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:20,930] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,930] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 1: [2022-11-27 20:41:20,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:20,934] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:20,936] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:20,936] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,937] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 
6: [2022-11-27 20:41:20,937] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,937] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 1: [2022-11-27 20:41:20,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 5: [2022-11-27 20:41:20,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 4: [2022-11-27 20:41:20,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 4: [2022-11-27 20:41:20,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 4: [2022-11-27 20:41:20,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 0: [2022-11-27 20:41:20,939] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:20,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 4: [2022-11-27 20:41:20,940] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 
1: [2022-11-27 20:41:20,940] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,943] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,944] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 5: [2022-11-27 20:41:20,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 5: [2022-11-27 20:41:20,944] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,945] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:20,946] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,950] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,950] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,950] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 
0: [2022-11-27 20:41:20,951] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 0: [2022-11-27 20:41:20,951] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 0: [2022-11-27 20:41:20,951] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 0: [2022-11-27 20:41:20,951] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,956] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,958] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 6: [2022-11-27 20:41:20,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:20,962] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
4: [2022-11-27 20:41:20,962] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,963] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 4: [2022-11-27 20:41:20,965] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,965] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:20,966] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,966] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 0: [2022-11-27 20:41:20,966] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_15-model_00-model_states.pt. 4: [2022-11-27 20:41:20,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:20,968] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:20,969] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,971] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
0: [2022-11-27 20:41:20,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,974] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,978] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,979] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,979] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,982] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:20,983] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,988] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:20,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
1: [2022-11-27 20:41:21,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
1: [2022-11-27 20:41:21,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
2: [2022-11-27 20:41:21,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
6: [2022-11-27 20:41:21,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:21,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:21,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:21,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:21,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:21,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
3: [2022-11-27 20:41:21,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:21,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:21,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:21,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
3: [2022-11-27 20:41:21,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:21,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:21,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:21,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
4: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
0: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,223] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 7: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 1: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 7: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
1: [2022-11-27 20:41:21,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:21,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:21,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 7: [2022-11-27 20:41:21,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:21,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 4: [2022-11-27 20:41:21,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 7: [2022-11-27 20:41:21,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 7: [2022-11-27 20:41:21,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 7: [2022-11-27 20:41:21,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
2: [2022-11-27 20:41:21,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 2: [2022-11-27 20:41:21,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
1: [2022-11-27 20:41:21,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,241] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,243] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,243] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,243] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 1: [2022-11-27 20:41:21,244] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,244] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
3: [2022-11-27 20:41:21,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,251] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 2: [2022-11-27 20:41:21,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,257] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
4: [2022-11-27 20:41:21,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,260] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,260] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:21,261] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 6: [2022-11-27 20:41:21,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 0: [2022-11-27 20:41:21,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 
3: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 5: [2022-11-27 20:41:21,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt... 3: [2022-11-27 20:41:21,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,271] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,276] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,277] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 6: [2022-11-27 20:41:21,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
6: [2022-11-27 20:41:21,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,281] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,286] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,286] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 3: [2022-11-27 20:41:21,287] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
4: [2022-11-27 20:41:21,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,287] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,288] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,289] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 4: [2022-11-27 20:41:21,290] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,292] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,293] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,295] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,295] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,298] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
0: [2022-11-27 20:41:21,302] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,302] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,302] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,303] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 7: [2022-11-27 20:41:21,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,304] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,304] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,305] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,308] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,309] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
4: [2022-11-27 20:41:21,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 0: [2022-11-27 20:41:21,312] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,317] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,318] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,323] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 
0: [2022-11-27 20:41:21,326] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,328] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,328] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,330] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,331] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,336] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_16-model_00-model_states.pt. 5: [2022-11-27 20:41:21,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,347] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
5: [2022-11-27 20:41:21,348] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,356] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
1: [2022-11-27 20:41:21,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,486] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,490] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 6: [2022-11-27 20:41:21,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
6: [2022-11-27 20:41:21,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,499] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
7: [2022-11-27 20:41:21,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
7: [2022-11-27 20:41:21,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,508] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
3: [2022-11-27 20:41:21,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
2: [2022-11-27 20:41:21,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,520] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,533] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
1: [2022-11-27 20:41:21,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,551] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,552] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
7: [2022-11-27 20:41:21,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 3: [2022-11-27 20:41:21,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
2: [2022-11-27 20:41:21,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,563] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 6: [2022-11-27 20:41:21,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,564] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 7: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,565] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,566] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,567] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,568] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,568] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 1: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
5: [2022-11-27 20:41:21,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 2: [2022-11-27 20:41:21,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
4: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 1: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 4: [2022-11-27 20:41:21,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 2: [2022-11-27 20:41:21,574] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 7: [2022-11-27 20:41:21,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
0: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 5: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
7: [2022-11-27 20:41:21,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,581] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,581] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 
0: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 0: [2022-11-27 20:41:21,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt... 3: [2022-11-27 20:41:21,585] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,586] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,586] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,587] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,588] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,592] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,592] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,593] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
2: [2022-11-27 20:41:21,593] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,602] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,603] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,603] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,617] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,618] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,621] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,630] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
4: [2022-11-27 20:41:21,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,630] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,631] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,631] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,631] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,632] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,638] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 
0: [2022-11-27 20:41:21,639] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,646] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,648] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,648] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,648] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,648] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 4: [2022-11-27 20:41:21,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,656] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,657] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
5: [2022-11-27 20:41:21,657] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,657] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,658] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,660] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,663] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 0: [2022-11-27 20:41:21,663] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_17-model_00-model_states.pt. 5: [2022-11-27 20:41:21,667] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
0: [2022-11-27 20:41:21,674] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,675] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,686] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,687] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,761] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
1: [2022-11-27 20:41:21,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,762] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,763] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,766] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,769] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,769] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,769] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,769] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
7: [2022-11-27 20:41:21,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
7: [2022-11-27 20:41:21,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,795] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,809] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
1: [2022-11-27 20:41:21,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 2: [2022-11-27 20:41:21,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:21,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
3: [2022-11-27 20:41:21,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
3: [2022-11-27 20:41:21,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 3: [2022-11-27 20:41:21,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
5: [2022-11-27 20:41:21,828] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,830] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 6: [2022-11-27 20:41:21,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 5: [2022-11-27 20:41:21,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
5: [2022-11-27 20:41:21,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 1: [2022-11-27 20:41:21,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,835] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:21,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,842] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
6: [2022-11-27 20:41:21,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 4: [2022-11-27 20:41:21,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 0: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 
0: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt... 7: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:21,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
6: [2022-11-27 20:41:21,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:21,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:21,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:21,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 1: [2022-11-27 20:41:21,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,862] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
3: [2022-11-27 20:41:21,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 7: [2022-11-27 20:41:21,867] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
2: [2022-11-27 20:41:21,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:21,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,875] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:21,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
4: [2022-11-27 20:41:21,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:21,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:21,881] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 6: [2022-11-27 20:41:21,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,882] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
3: [2022-11-27 20:41:21,883] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,884] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 2: [2022-11-27 20:41:21,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:21,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,892] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,893] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 
5: [2022-11-27 20:41:21,894] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,894] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,900] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,901] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,903] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,904] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,904] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
4: [2022-11-27 20:41:21,904] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,904] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 3: [2022-11-27 20:41:21,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:21,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,912] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,914] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,917] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 5: [2022-11-27 20:41:21,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
5: [2022-11-27 20:41:21,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,920] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 0: [2022-11-27 20:41:21,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:21,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,930] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_18-model_00-model_states.pt. 4: [2022-11-27 20:41:21,930] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
4: [2022-11-27 20:41:21,930] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:21,934] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,937] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,950] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,953] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,955] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,955] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:21,956] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
6: [2022-11-27 20:41:22,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
6: [2022-11-27 20:41:22,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,075] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,076] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
3: [2022-11-27 20:41:22,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,081] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 3: [2022-11-27 20:41:22,082] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,086] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,087] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,088] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,089] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:22,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,090] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:22,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
4: [2022-11-27 20:41:22,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,091] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:22,092] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 5: [2022-11-27 20:41:22,094] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
4: [2022-11-27 20:41:22,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 4: [2022-11-27 20:41:22,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
4: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,098] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,100] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,100] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,102] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,103] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
0: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
2: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 2: [2022-11-27 20:41:22,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,107] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 
7: [2022-11-27 20:41:22,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 7: [2022-11-27 20:41:22,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 0: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
1: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 1: [2022-11-27 20:41:22,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt... 6: [2022-11-27 20:41:22,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
3: [2022-11-27 20:41:22,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,130] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,130] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
3: [2022-11-27 20:41:22,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
5: [2022-11-27 20:41:22,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 6: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,152] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
4: [2022-11-27 20:41:22,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,157] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
2: [2022-11-27 20:41:22,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 3: [2022-11-27 20:41:22,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,164] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,170] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,172] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 
1: [2022-11-27 20:41:22,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 7: [2022-11-27 20:41:22,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 4: [2022-11-27 20:41:22,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 1: [2022-11-27 20:41:22,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,178] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 5: [2022-11-27 20:41:22,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
2: [2022-11-27 20:41:22,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 0: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_19-model_00-model_states.pt. 2: [2022-11-27 20:41:22,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
4: [2022-11-27 20:41:22,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
7: [2022-11-27 20:41:22,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
6: [2022-11-27 20:41:22,381] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,387] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,388] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
6: [2022-11-27 20:41:22,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,392] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,393] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,405] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
3: [2022-11-27 20:41:22,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,409] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,412] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
7: [2022-11-27 20:41:22,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,418] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
7: [2022-11-27 20:41:22,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,432] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,434] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,437] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,446] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,447] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,447] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
5: [2022-11-27 20:41:22,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,453] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
0: [2022-11-27 20:41:22,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,458] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,459] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
6: [2022-11-27 20:41:22,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 0: [2022-11-27 20:41:22,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
4: [2022-11-27 20:41:22,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,469] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
4: [2022-11-27 20:41:22,470] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,470] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 6: [2022-11-27 20:41:22,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 6: [2022-11-27 20:41:22,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
4: [2022-11-27 20:41:22,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 4: [2022-11-27 20:41:22,477] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,481] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,482] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,485] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 2: [2022-11-27 20:41:22,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 7: [2022-11-27 20:41:22,489] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
1: [2022-11-27 20:41:22,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
1: [2022-11-27 20:41:22,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 5: [2022-11-27 20:41:22,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,495] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 3: [2022-11-27 20:41:22,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 3: [2022-11-27 20:41:22,497] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 1: [2022-11-27 20:41:22,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt... 
5: [2022-11-27 20:41:22,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 7: [2022-11-27 20:41:22,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,508] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,513] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,534] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,535] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,536] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 2: [2022-11-27 20:41:22,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,540] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 
0: [2022-11-27 20:41:22,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 5: [2022-11-27 20:41:22,551] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,552] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,555] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,557] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
4: [2022-11-27 20:41:22,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,560] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,561] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 1: [2022-11-27 20:41:22,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
2: [2022-11-27 20:41:22,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,568] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 4: [2022-11-27 20:41:22,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,573] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_20-model_00-model_states.pt. 0: [2022-11-27 20:41:22,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
1: [2022-11-27 20:41:22,586] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,589] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,594] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,690] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
3: [2022-11-27 20:41:22,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,691] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,695] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,696] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,724] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
4: [2022-11-27 20:41:22,733] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,734] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,736] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,737] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,738] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,739] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,740] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
1: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,740] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
6: [2022-11-27 20:41:22,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
1: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 7: [2022-11-27 20:41:22,742] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 2: [2022-11-27 20:41:22,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
2: [2022-11-27 20:41:22,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,743] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 4: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 5: [2022-11-27 20:41:22,744] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
6: [2022-11-27 20:41:22,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 6: [2022-11-27 20:41:22,746] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
0: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,748] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 1: [2022-11-27 20:41:22,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 
1: [2022-11-27 20:41:22,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 0: [2022-11-27 20:41:22,756] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt... 3: [2022-11-27 20:41:22,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:22,768] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
5: [2022-11-27 20:41:22,770] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,771] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,773] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,773] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 3: [2022-11-27 20:41:22,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:22,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:22,778] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:22,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,781] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,786] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,788] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
4: [2022-11-27 20:41:22,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,792] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
7: [2022-11-27 20:41:22,794] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,797] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,798] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:22,799] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:22,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
0: [2022-11-27 20:41:22,801] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,801] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,802] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,803] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
6: [2022-11-27 20:41:22,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,806] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 5: [2022-11-27 20:41:22,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
5: [2022-11-27 20:41:22,812] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 2: [2022-11-27 20:41:22,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:22,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,816] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 7: [2022-11-27 20:41:22,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,817] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
7: [2022-11-27 20:41:22,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:22,818] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:22,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:22,819] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 
0: [2022-11-27 20:41:22,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 6: [2022-11-27 20:41:22,823] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,824] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,825] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 4: [2022-11-27 20:41:22,825] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:22,826] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,828] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,829] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,830] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,830] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
5: [2022-11-27 20:41:22,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:22,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:22,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 0: [2022-11-27 20:41:22,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_21-model_00-model_states.pt. 1: [2022-11-27 20:41:22,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:22,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
0: [2022-11-27 20:41:22,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,850] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,851] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,860] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:22,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,050] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
1: [2022-11-27 20:41:23,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,051] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,058] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
3: [2022-11-27 20:41:23,060] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,061] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:23,062] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,064] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
5: [2022-11-27 20:41:23,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,065] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:23,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
5: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,068] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 5: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,070] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
4: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,072] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
3: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 3: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:23,073] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,074] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,075] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
6: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 6: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 4: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:23,077] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
2: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 2: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 7: [2022-11-27 20:41:23,078] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,079] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 0: [2022-11-27 20:41:23,080] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt... 1: [2022-11-27 20:41:23,092] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,096] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,098] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,101] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,103] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,110] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,112] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,113] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,114] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,116] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
1: [2022-11-27 20:41:23,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,118] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,118] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,122] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
1: [2022-11-27 20:41:23,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,123] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,125] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,126] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,131] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,131] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,133] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 5: [2022-11-27 20:41:23,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
5: [2022-11-27 20:41:23,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 6: [2022-11-27 20:41:23,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 2: [2022-11-27 20:41:23,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 1: [2022-11-27 20:41:23,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
6: [2022-11-27 20:41:23,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 3: [2022-11-27 20:41:23,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,144] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 
0: [2022-11-27 20:41:23,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 0: [2022-11-27 20:41:23,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 4: [2022-11-27 20:41:23,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
3: [2022-11-27 20:41:23,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_22-model_00-model_states.pt. 7: [2022-11-27 20:41:23,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
7: [2022-11-27 20:41:23,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,161] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,169] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,178] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,317] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,319] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
6: [2022-11-27 20:41:23,319] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,319] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,320] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,320] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,323] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
6: [2022-11-27 20:41:23,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,325] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,340] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,341] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,342] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,344] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,346] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 6: [2022-11-27 20:41:23,363] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,366] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,368] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
6: [2022-11-27 20:41:23,368] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,378] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,381] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,382] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,383] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,383] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
3: [2022-11-27 20:41:23,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,384] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,385] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,386] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,387] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
3: [2022-11-27 20:41:23,388] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,389] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 6: [2022-11-27 20:41:23,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,395] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,396] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,400] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,400] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,406] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,408] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,409] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,414] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,414] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,417] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,417] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
0: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,418] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,419] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
4: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
1: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 7: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,421] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
5: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,422] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,423] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 7: [2022-11-27 20:41:23,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,423] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,424] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 0: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 2: [2022-11-27 20:41:23,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,426] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 4: [2022-11-27 20:41:23,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,427] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 1: [2022-11-27 20:41:23,428] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 
5: [2022-11-27 20:41:23,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 5: [2022-11-27 20:41:23,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt... 3: [2022-11-27 20:41:23,432] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,436] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,437] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
3: [2022-11-27 20:41:23,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,444] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,450] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,454] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,457] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,464] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
3: [2022-11-27 20:41:23,465] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,465] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,467] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,467] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,468] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 3: [2022-11-27 20:41:23,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,473] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,474] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,474] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,478] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,481] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
2: [2022-11-27 20:41:23,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,482] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,483] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,485] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,488] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 0: [2022-11-27 20:41:23,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 
1: [2022-11-27 20:41:23,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 5: [2022-11-27 20:41:23,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 1: [2022-11-27 20:41:23,497] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 4: [2022-11-27 20:41:23,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,501] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,502] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,506] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_23-model_00-model_states.pt. 2: [2022-11-27 20:41:23,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,507] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,509] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,512] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,513] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,519] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,528] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,699] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,700] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,707] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
6: [2022-11-27 20:41:23,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,709] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,712] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
3: [2022-11-27 20:41:23,734] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,735] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,735] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,737] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,738] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,739] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
6: [2022-11-27 20:41:23,741] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,745] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,746] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,747] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,749] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,749] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,750] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,751] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
7: [2022-11-27 20:41:23,752] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,753] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,754] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,755] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
1: [2022-11-27 20:41:23,757] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,758] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 6: [2022-11-27 20:41:23,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,760] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:23,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:23,764] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,768] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
6: [2022-11-27 20:41:23,779] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,785] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,785] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,786] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,787] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 6: [2022-11-27 20:41:23,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:23,789] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,790] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
6: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,791] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,793] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,795] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,796] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,796] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
0: [2022-11-27 20:41:23,798] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 3: [2022-11-27 20:41:23,800] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,802] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,807] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:23,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:23,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
7: [2022-11-27 20:41:23,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,819] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 3: [2022-11-27 20:41:23,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:23,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
7: [2022-11-27 20:41:23,822] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 1: [2022-11-27 20:41:23,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,827] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 7: [2022-11-27 20:41:23,831] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
2: [2022-11-27 20:41:23,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 2: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
5: [2022-11-27 20:41:23,839] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 7: [2022-11-27 20:41:23,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,843] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 5: [2022-11-27 20:41:23,845] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 1: [2022-11-27 20:41:23,846] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
1: [2022-11-27 20:41:23,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:23,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
2: [2022-11-27 20:41:23,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 0: [2022-11-27 20:41:23,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,874] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 4: [2022-11-27 20:41:23,876] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt... 
5: [2022-11-27 20:41:23,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,886] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,886] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,893] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,893] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,895] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,895] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
2: [2022-11-27 20:41:23,895] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 0: [2022-11-27 20:41:23,898] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:23,902] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,904] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,906] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 
5: [2022-11-27 20:41:23,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,914] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 2: [2022-11-27 20:41:23,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:23,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,923] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,936] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 5: [2022-11-27 20:41:23,936] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_24-model_00-model_states.pt. 4: [2022-11-27 20:41:23,941] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:23,943] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,958] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
4: [2022-11-27 20:41:23,959] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,961] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:23,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,095] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,104] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
5: [2022-11-27 20:41:24,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,105] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,110] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,111] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,120] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,122] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,123] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,124] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,125] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
6: [2022-11-27 20:41:24,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,126] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,127] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
2: [2022-11-27 20:41:24,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 2: [2022-11-27 20:41:24,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
1: [2022-11-27 20:41:24,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,158] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,159] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,164] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
3: [2022-11-27 20:41:24,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,165] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,166] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:24,166] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:24,167] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,168] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,168] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:24,170] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
5: [2022-11-27 20:41:24,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,170] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,170] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 3: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 1: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,173] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,174] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,178] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
7: [2022-11-27 20:41:24,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 5: [2022-11-27 20:41:24,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:24,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,183] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 6: [2022-11-27 20:41:24,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:24,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:24,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:24,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 5: [2022-11-27 20:41:24,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,187] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
2: [2022-11-27 20:41:24,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,191] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,192] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,193] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
5: [2022-11-27 20:41:24,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 6: [2022-11-27 20:41:24,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
2: [2022-11-27 20:41:24,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,216] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
4: [2022-11-27 20:41:24,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 2: [2022-11-27 20:41:24,217] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:24,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:24,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:24,223] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 0: [2022-11-27 20:41:24,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,223] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 
4: [2022-11-27 20:41:24,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:24,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 4: [2022-11-27 20:41:24,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt... 7: [2022-11-27 20:41:24,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
1: [2022-11-27 20:41:24,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
1: [2022-11-27 20:41:24,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,233] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,234] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,237] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,243] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
3: [2022-11-27 20:41:24,246] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,247] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,249] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 0: [2022-11-27 20:41:24,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 3: [2022-11-27 20:41:24,253] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,253] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,253] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
4: [2022-11-27 20:41:24,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 1: [2022-11-27 20:41:24,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,261] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,269] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 7: [2022-11-27 20:41:24,273] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,282] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,283] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,284] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 
4: [2022-11-27 20:41:24,288] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,300] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_25-model_00-model_states.pt. 4: [2022-11-27 20:41:24,303] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,306] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,308] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,312] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,320] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,425] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,430] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
5: [2022-11-27 20:41:24,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,433] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,438] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,439] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,439] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,440] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,441] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
7: [2022-11-27 20:41:24,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,442] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,443] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,443] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,444] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,454] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,455] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,456] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
2: [2022-11-27 20:41:24,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,459] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,460] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,460] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,461] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,462] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,463] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,463] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,466] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,466] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,468] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,469] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,472] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,473] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,475] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
1: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,478] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 5: [2022-11-27 20:41:24,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,480] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,483] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 1: [2022-11-27 20:41:24,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 7: [2022-11-27 20:41:24,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
7: [2022-11-27 20:41:24,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,486] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,488] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,489] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,490] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,491] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,492] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,492] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,493] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 2: [2022-11-27 20:41:24,494] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
3: [2022-11-27 20:41:24,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 3: [2022-11-27 20:41:24,496] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 0: [2022-11-27 20:41:24,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,500] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
5: [2022-11-27 20:41:24,501] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,504] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,504] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,505] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,505] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,506] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,510] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
6: [2022-11-27 20:41:24,511] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,516] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,517] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,517] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,518] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 6: [2022-11-27 20:41:24,520] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
4: [2022-11-27 20:41:24,521] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 6: [2022-11-27 20:41:24,521] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 7: [2022-11-27 20:41:24,522] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,523] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 4: [2022-11-27 20:41:24,524] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt... 
2: [2022-11-27 20:41:24,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,525] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 5: [2022-11-27 20:41:24,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,525] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,526] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
1: [2022-11-27 20:41:24,532] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,536] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 2: [2022-11-27 20:41:24,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
2: [2022-11-27 20:41:24,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,539] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,540] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,545] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,548] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,553] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 1: [2022-11-27 20:41:24,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,559] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 3: [2022-11-27 20:41:24,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
1: [2022-11-27 20:41:24,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,559] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,559] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 0: [2022-11-27 20:41:24,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,566] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,567] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,568] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,571] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 
3: [2022-11-27 20:41:24,571] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
4: [2022-11-27 20:41:24,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,585] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,585] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,586] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_26-model_00-model_states.pt. 4: [2022-11-27 20:41:24,607] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,607] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,613] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,618] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,758] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,760] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,761] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,767] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
2: [2022-11-27 20:41:24,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,781] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,804] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,804] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,805] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,805] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,807] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,808] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,808] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,809] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
3: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,810] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
7: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,811] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
6: [2022-11-27 20:41:24,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,812] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,813] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
7: [2022-11-27 20:41:24,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,814] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,814] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 3: [2022-11-27 20:41:24,815] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
7: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 5: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,816] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 2: [2022-11-27 20:41:24,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,818] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,820] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 
1: [2022-11-27 20:41:24,821] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 1: [2022-11-27 20:41:24,822] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,823] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,826] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,834] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:24,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,835] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:24,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,846] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,852] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,854] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,855] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
2: [2022-11-27 20:41:24,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:24,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
0: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 6: [2022-11-27 20:41:24,858] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:24,859] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,860] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 2: [2022-11-27 20:41:24,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,861] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
0: [2022-11-27 20:41:24,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:24,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 0: [2022-11-27 20:41:24,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,863] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 4: [2022-11-27 20:41:24,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt... 7: [2022-11-27 20:41:24,865] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,869] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
7: [2022-11-27 20:41:24,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,870] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 0: [2022-11-27 20:41:24,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,873] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 6: [2022-11-27 20:41:24,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
7: [2022-11-27 20:41:24,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,876] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,876] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,876] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 3: [2022-11-27 20:41:24,879] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,879] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 
6: [2022-11-27 20:41:24,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:24,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 1: [2022-11-27 20:41:24,880] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:24,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:24,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:24,889] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,890] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,890] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
6: [2022-11-27 20:41:24,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,892] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,893] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,893] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,893] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:24,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,895] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:24,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 7: [2022-11-27 20:41:24,896] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
4: [2022-11-27 20:41:24,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,897] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 5: [2022-11-27 20:41:24,900] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,900] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,904] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:24,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,911] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:24,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
4: [2022-11-27 20:41:24,915] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,916] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,921] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,926] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,938] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_27-model_00-model_states.pt. 4: [2022-11-27 20:41:24,941] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,942] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,944] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
4: [2022-11-27 20:41:24,949] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:24,957] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:25,134] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,135] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
7: [2022-11-27 20:41:25,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
1: [2022-11-27 20:41:25,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:25,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,139] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
0: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 7: [2022-11-27 20:41:25,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
4: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
6: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:25,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:25,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,143] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:25,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,144] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
6: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 6: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,145] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
2: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,146] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 5: [2022-11-27 20:41:25,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
2: [2022-11-27 20:41:25,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 4: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,148] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
0: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 0: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,149] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,150] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,151] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 
2: [2022-11-27 20:41:25,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 2: [2022-11-27 20:41:25,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,154] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 3: [2022-11-27 20:41:25,155] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt... 1: [2022-11-27 20:41:25,169] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
5: [2022-11-27 20:41:25,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,185] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
5: [2022-11-27 20:41:25,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,187] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,190] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,190] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
0: [2022-11-27 20:41:25,194] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,196] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,196] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,196] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
6: [2022-11-27 20:41:25,198] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
6: [2022-11-27 20:41:25,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,204] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 1: [2022-11-27 20:41:25,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
5: [2022-11-27 20:41:25,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 2: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,210] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 
4: [2022-11-27 20:41:25,210] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 4: [2022-11-27 20:41:25,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 7: [2022-11-27 20:41:25,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,216] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
1: [2022-11-27 20:41:25,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 5: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 0: [2022-11-27 20:41:25,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 3: [2022-11-27 20:41:25,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
7: [2022-11-27 20:41:25,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_28-model_00-model_states.pt. 6: [2022-11-27 20:41:25,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
2: [2022-11-27 20:41:25,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,233] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
4: [2022-11-27 20:41:25,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,238] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,244] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
0: [2022-11-27 20:41:25,245] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,247] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,247] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,249] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,503] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,503] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,510] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
3: [2022-11-27 20:41:25,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,516] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
5: [2022-11-27 20:41:25,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,530] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,538] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
4: [2022-11-27 20:41:25,539] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,540] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
4: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,541] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,542] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
6: [2022-11-27 20:41:25,543] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,544] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,545] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,546] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,547] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
4: [2022-11-27 20:41:25,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
0: [2022-11-27 20:41:25,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 0: [2022-11-27 20:41:25,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 
3: [2022-11-27 20:41:25,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,568] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
2: [2022-11-27 20:41:25,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,569] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
2: [2022-11-27 20:41:25,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,572] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 5: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
7: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,575] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 7: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 2: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
1: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 4: [2022-11-27 20:41:25,579] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,581] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 3: [2022-11-27 20:41:25,581] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,582] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,583] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
1: [2022-11-27 20:41:25,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,583] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 1: [2022-11-27 20:41:25,584] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt... 6: [2022-11-27 20:41:25,586] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,589] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,589] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,590] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,592] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,593] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
3: [2022-11-27 20:41:25,594] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,594] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,595] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,596] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 3: [2022-11-27 20:41:25,597] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,598] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,600] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,600] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,601] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
6: [2022-11-27 20:41:25,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,601] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,604] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,605] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,606] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
5: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 4: [2022-11-27 20:41:25,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,610] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,610] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,611] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
7: [2022-11-27 20:41:25,615] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,617] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,617] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,617] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 5: [2022-11-27 20:41:25,619] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,621] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,622] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,624] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,626] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 6: [2022-11-27 20:41:25,626] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,627] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
5: [2022-11-27 20:41:25,628] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,628] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,628] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,628] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 2: [2022-11-27 20:41:25,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
6: [2022-11-27 20:41:25,630] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,630] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,630] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,631] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,632] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
7: [2022-11-27 20:41:25,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,633] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,635] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,635] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,635] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,635] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:25,638] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 
1: [2022-11-27 20:41:25,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 1: [2022-11-27 20:41:25,642] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,646] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,647] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,648] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,648] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 0: [2022-11-27 20:41:25,648] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_29-model_00-model_states.pt. 7: [2022-11-27 20:41:25,649] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,649] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
0: [2022-11-27 20:41:25,649] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,651] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,651] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,652] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,653] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,657] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,659] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:25,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
1: [2022-11-27 20:41:25,667] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,667] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,669] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,784] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,784] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
3: [2022-11-27 20:41:25,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,788] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,789] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,791] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,792] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
3: [2022-11-27 20:41:25,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,793] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 3: [2022-11-27 20:41:25,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,834] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,836] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,845] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,853] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:25,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
3: [2022-11-27 20:41:25,855] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:25,856] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 3: [2022-11-27 20:41:25,859] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:25,869] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:25,872] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:25,873] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:25,877] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:25,913] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,922] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,925] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,926] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,927] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
6: [2022-11-27 20:41:25,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,928] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,929] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,932] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
0: [2022-11-27 20:41:25,935] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,935] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,935] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,935] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:25,935] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,935] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,936] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,952] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:25,954] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,956] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,958] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
1: [2022-11-27 20:41:25,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,959] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:25,960] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,964] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,964] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,965] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
1: [2022-11-27 20:41:25,966] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,967] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,970] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,970] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,971] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,971] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
2: [2022-11-27 20:41:25,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,972] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,972] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,973] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,973] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,975] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,977] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,978] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,978] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,978] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
2: [2022-11-27 20:41:25,978] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:25,978] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 0: [2022-11-27 20:41:25,980] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:25,981] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,982] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,982] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,983] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
6: [2022-11-27 20:41:25,985] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,985] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,985] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,985] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,986] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 6: [2022-11-27 20:41:25,987] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:25,988] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:25,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:25,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
5: [2022-11-27 20:41:25,989] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:25,990] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:25,990] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:25,997] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:25,998] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:25,999] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,004] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
0: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:26,006] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,007] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,007] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 6: [2022-11-27 20:41:26,009] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,009] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,010] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:26,010] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:26,011] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,014] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,015] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:26,015] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:26,015] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
1: [2022-11-27 20:41:26,016] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,016] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,016] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:26,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 4: [2022-11-27 20:41:26,017] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 2: [2022-11-27 20:41:26,019] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:26,019] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,021] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,022] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,024] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:26,025] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:26,025] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:26,025] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 5: [2022-11-27 20:41:26,026] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,026] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:26,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 
7: [2022-11-27 20:41:26,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 7: [2022-11-27 20:41:26,027] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt... 1: [2022-11-27 20:41:26,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,027] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,030] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,031] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:26,031] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,033] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,033] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
2: [2022-11-27 20:41:26,034] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:26,034] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 2: [2022-11-27 20:41:26,034] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 0: [2022-11-27 20:41:26,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,035] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,036] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,036] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,037] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,037] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,038] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,042] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,042] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,044] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,045] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,046] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,046] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,046] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,048] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,050] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,051] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,052] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 1: [2022-11-27 20:41:26,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,053] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,055] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,056] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,057] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,060] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,060] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,065] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,066] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,066] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,067] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,067] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,068] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,069] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 5: [2022-11-27 20:41:26,069] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,071] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,072] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,076] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
7: [2022-11-27 20:41:26,077] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,078] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,082] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,083] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,084] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,085] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 7: [2022-11-27 20:41:26,085] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_30-model_00-model_states.pt. 4: [2022-11-27 20:41:26,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,087] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,090] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,093] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,095] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,099] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,102] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,104] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,105] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,106] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,109] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
3: [2022-11-27 20:41:26,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,128] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,132] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
3: [2022-11-27 20:41:26,135] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,138] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
3: [2022-11-27 20:41:26,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,192] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 3: [2022-11-27 20:41:26,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,205] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,206] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 3: [2022-11-27 20:41:26,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 
3: [2022-11-27 20:41:26,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
2: [2022-11-27 20:41:26,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,253] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
2: [2022-11-27 20:41:26,253] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,254] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 1: [2022-11-27 20:41:26,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,269] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,273] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,275] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,277] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,277] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,279] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,280] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
0: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 0: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,281] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 6: [2022-11-27 20:41:26,286] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,286] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,286] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 
1: [2022-11-27 20:41:26,286] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,287] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,288] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,288] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 6: [2022-11-27 20:41:26,292] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,294] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,298] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,300] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,307] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 
2: [2022-11-27 20:41:26,309] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,310] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,310] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,310] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,311] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,312] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,312] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,313] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,314] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,315] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
2: [2022-11-27 20:41:26,316] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,318] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,318] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 1: [2022-11-27 20:41:26,321] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,330] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,331] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,331] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,332] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 2: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,333] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,334] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,335] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,335] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
7: [2022-11-27 20:41:26,336] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 2: [2022-11-27 20:41:26,337] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,338] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 7: [2022-11-27 20:41:26,339] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 4: [2022-11-27 20:41:26,343] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt... 5: [2022-11-27 20:41:26,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,344] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,347] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
0: [2022-11-27 20:41:26,354] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,357] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,358] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,358] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,359] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 5: [2022-11-27 20:41:26,367] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,369] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,371] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,372] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,376] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,376] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
0: [2022-11-27 20:41:26,377] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,378] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,379] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,379] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 0: [2022-11-27 20:41:26,380] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,380] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,390] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,391] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 
7: [2022-11-27 20:41:26,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,392] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,396] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,397] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,397] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,398] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,398] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,399] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,404] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,407] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 4: [2022-11-27 20:41:26,408] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,411] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,411] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_31-model_00-model_states.pt. 7: [2022-11-27 20:41:26,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,413] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,419] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,420] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,425] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,429] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,430] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,431] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,451] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,452] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,453] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
3: [2022-11-27 20:41:26,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,456] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,457] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,458] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,484] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,484] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,494] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,495] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,498] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,499] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,500] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,507] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,511] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,514] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
6: [2022-11-27 20:41:26,514] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,515] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,527] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 3: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,529] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,530] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,531] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,532] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,533] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,535] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,537] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
3: [2022-11-27 20:41:26,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,537] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 3: [2022-11-27 20:41:26,538] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,541] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,542] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,544] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,546] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
2: [2022-11-27 20:41:26,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,547] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,548] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,549] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,549] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,550] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 
2: [2022-11-27 20:41:26,553] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,554] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,555] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,556] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,556] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,557] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,558] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,559] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,560] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,562] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 1: [2022-11-27 20:41:26,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,563] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
6: [2022-11-27 20:41:26,564] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,568] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,570] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,570] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,572] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,573] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 6: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,574] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,575] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
0: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,576] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,577] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,578] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 6: [2022-11-27 20:41:26,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
0: [2022-11-27 20:41:26,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,579] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 4: [2022-11-27 20:41:26,580] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 5: [2022-11-27 20:41:26,580] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 0: [2022-11-27 20:41:26,582] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,586] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,587] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,591] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,591] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,593] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,597] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 5: [2022-11-27 20:41:26,599] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,600] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
1: [2022-11-27 20:41:26,605] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,604] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,605] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,606] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,607] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
7: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,608] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,609] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,611] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,611] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,611] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,612] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,613] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
2: [2022-11-27 20:41:26,613] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,613] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,614] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 7: [2022-11-27 20:41:26,614] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt... 2: [2022-11-27 20:41:26,614] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,614] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 2: [2022-11-27 20:41:26,615] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,616] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,617] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,617] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,617] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,618] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,621] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,623] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,624] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,629] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,630] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,633] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
1: [2022-11-27 20:41:26,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,634] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,634] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,636] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,637] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,642] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,644] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,650] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 1: [2022-11-27 20:41:26,653] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,653] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,655] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
1: [2022-11-27 20:41:26,654] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,655] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,656] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,656] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,656] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 0: [2022-11-27 20:41:26,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,664] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 4: [2022-11-27 20:41:26,665] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,669] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,669] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 7: [2022-11-27 20:41:26,669] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_32-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,671] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,672] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,673] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,676] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,677] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,677] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,679] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
7: [2022-11-27 20:41:26,680] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,688] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,689] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,690] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,693] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,772] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,774] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,775] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,775] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,776] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,777] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,780] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
6: [2022-11-27 20:41:26,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,820] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,821] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,829] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,830] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,831] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,832] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,833] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,833] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
2: [2022-11-27 20:41:26,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,836] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,837] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:26,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,837] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,838] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,839] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,840] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:26,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,840] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
6: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,841] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
0: [2022-11-27 20:41:26,842] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,842] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,842] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,843] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
3: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,844] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
5: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 5: [2022-11-27 20:41:26,847] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,848] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 0: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
1: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,849] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,853] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:26,854] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:26,856] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:26,857] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 6: [2022-11-27 20:41:26,858] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,861] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,862] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,863] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 6: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,864] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,868] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,870] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 1: [2022-11-27 20:41:26,871] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
4: [2022-11-27 20:41:26,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,871] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 2: [2022-11-27 20:41:26,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,872] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,874] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,877] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,880] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,882] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,887] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
5: [2022-11-27 20:41:26,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,887] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,888] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:26,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,888] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,889] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,890] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,891] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,891] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,894] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,896] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,898] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,899] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,904] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:26,905] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,905] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,906] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,907] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,908] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:26,908] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,909] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,909] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,910] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 2: [2022-11-27 20:41:26,910] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:26,911] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 1: [2022-11-27 20:41:26,911] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,912] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 5: [2022-11-27 20:41:26,913] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,914] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,915] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,916] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,917] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,918] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,919] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,922] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 3: [2022-11-27 20:41:26,924] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:26,924] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
7: [2022-11-27 20:41:26,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 7: [2022-11-27 20:41:26,925] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt... 4: [2022-11-27 20:41:26,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,927] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,928] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 3: [2022-11-27 20:41:26,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,928] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,928] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
5: [2022-11-27 20:41:26,929] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,931] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:26,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:26,933] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,934] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:26,935] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,937] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,937] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:26,938] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,939] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
0: [2022-11-27 20:41:26,939] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:26,940] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,941] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,944] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,945] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,946] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 0: [2022-11-27 20:41:26,947] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,948] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:26,948] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,949] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,949] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
4: [2022-11-27 20:41:26,951] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,956] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:26,957] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,963] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,964] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,964] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 4: [2022-11-27 20:41:26,965] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,975] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,979] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 
7: [2022-11-27 20:41:26,981] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,982] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,983] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:26,991] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_33-model_00-model_states.pt. 7: [2022-11-27 20:41:26,997] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,000] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,002] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,014] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,097] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,097] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
1: [2022-11-27 20:41:27,106] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,108] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,112] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
1: [2022-11-27 20:41:27,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,115] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,115] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,119] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,119] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,128] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
6: [2022-11-27 20:41:27,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,129] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,132] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,133] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,134] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
1: [2022-11-27 20:41:27,136] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,136] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,137] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,137] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
3: [2022-11-27 20:41:27,138] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,139] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,140] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,140] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
2: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,141] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,142] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,143] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,145] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,147] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
2: [2022-11-27 20:41:27,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,147] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,150] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,152] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,153] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,153] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
4: [2022-11-27 20:41:27,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,154] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,155] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,156] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
6: [2022-11-27 20:41:27,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,157] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,158] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 4: [2022-11-27 20:41:27,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,159] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,160] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,160] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
0: [2022-11-27 20:41:27,162] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,162] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 0: [2022-11-27 20:41:27,163] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:27,171] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,173] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,175] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,176] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:27,177] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,179] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,180] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:27,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 
5: [2022-11-27 20:41:27,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 5: [2022-11-27 20:41:27,179] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,180] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-27 20:41:27,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,181] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,181] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
2: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,182] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
7: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,183] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,184] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,184] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,185] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,186] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,186] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 2: [2022-11-27 20:41:27,188] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
7: [2022-11-27 20:41:27,188] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 3: [2022-11-27 20:41:27,189] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 7: [2022-11-27 20:41:27,189] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt... 6: [2022-11-27 20:41:27,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,191] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,193] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,194] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,195] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 
4: [2022-11-27 20:41:27,195] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,197] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,197] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,198] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,199] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,199] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,199] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
3: [2022-11-27 20:41:27,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,200] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,200] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
6: [2022-11-27 20:41:27,201] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,202] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
1: [2022-11-27 20:41:27,203] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,204] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,205] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,206] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 1: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
2: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,207] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
1: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 6: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,208] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,209] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,209] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,210] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,210] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
6: [2022-11-27 20:41:27,211] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,211] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,212] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,212] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,213] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 
6: [2022-11-27 20:41:27,213] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 6: [2022-11-27 20:41:27,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,214] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,214] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 1: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 
2: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,215] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,217] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,218] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 6: [2022-11-27 20:41:27,218] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 1: [2022-11-27 20:41:27,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
3: [2022-11-27 20:41:27,219] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,219] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,220] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,220] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 6: [2022-11-27 20:41:27,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
4: [2022-11-27 20:41:27,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,221] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,221] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,222] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 6: [2022-11-27 20:41:27,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,222] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,223] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,223] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,223] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
4: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,224] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 6: [2022-11-27 20:41:27,225] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,225] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 2: [2022-11-27 20:41:27,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
2: [2022-11-27 20:41:27,226] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,226] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,227] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,227] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,228] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,228] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,229] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,229] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
0: [2022-11-27 20:41:27,230] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,230] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,230] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,230] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,231] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,231] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
0: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,232] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 3: [2022-11-27 20:41:27,234] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,234] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,234] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 
5: [2022-11-27 20:41:27,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,235] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 5: [2022-11-27 20:41:27,235] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,236] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 
3: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,237] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 0: [2022-11-27 20:41:27,238] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,238] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 3: [2022-11-27 20:41:27,238] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 
0: [2022-11-27 20:41:27,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,239] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 2: [2022-11-27 20:41:27,239] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 2: [2022-11-27 20:41:27,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,240] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,240] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,241] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
5: [2022-11-27 20:41:27,241] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,241] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 4: [2022-11-27 20:41:27,242] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 2: [2022-11-27 20:41:27,243] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 0: [2022-11-27 20:41:27,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 7: [2022-11-27 20:41:27,244] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_34-model_00-model_states.pt. 0: [2022-11-27 20:41:27,246] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,246] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,246] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
4: [2022-11-27 20:41:27,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,247] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,247] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,247] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,247] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,248] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,248] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,248] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,249] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
7: [2022-11-27 20:41:27,249] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,250] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,250] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,251] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
7: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,252] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,253] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,253] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,253] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,255] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,255] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
7: [2022-11-27 20:41:27,255] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,256] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 4: [2022-11-27 20:41:27,256] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 0: [2022-11-27 20:41:27,257] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 4: [2022-11-27 20:41:27,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 0: [2022-11-27 20:41:27,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,257] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
0: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,258] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,259] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,259] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,260] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,260] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
0: [2022-11-27 20:41:27,260] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,261] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,262] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,262] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,263] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,263] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,263] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
0: [2022-11-27 20:41:27,264] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,264] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,265] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 5: [2022-11-27 20:41:27,265] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 5: [2022-11-27 20:41:27,266] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,266] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 
0: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,267] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,268] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,268] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,269] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,269] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 
7: [2022-11-27 20:41:27,269] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 7: [2022-11-27 20:41:27,270] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: [2022-11-27 20:41:27,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,271] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 7: [2022-11-27 20:41:27,271] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt... 0: > using checkpoint value 0.0002 for learning rate 0: > using checkpoint value 2e-05 for minimum learning rate 0: > using checkpoint value 225657 for warmup iterations 0: > using checkpoint value 22565693 for total number of iterations 0: > using checkpoint value cosine for decay style 7: [2022-11-27 20:41:27,272] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/layer_36-model_00-model_states.pt. 0: [2022-11-27 20:41:27,272] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 
0: [2022-11-27 20:41:27,274] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-27 20:41:27,276] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 0: [2022-11-27 20:41:27,278] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 
7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-27 20:41:27,282] [INFO] [torch_checkpoint_engine.py:21:load] [Torch] Loading checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 3: [2022-11-27 20:41:28,475] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 
3: [2022-11-27 20:41:28,475] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 28 1: [2022-11-27 20:41:28,476] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,476] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 8 7: [2022-11-27 20:41:28,477] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,477] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 56 4: [2022-11-27 20:41:28,487] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-27 20:41:28,487] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 33 1: [2022-11-27 20:41:28,498] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,499] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 10 0: [2022-11-27 20:41:28,515] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,516] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 5 2: [2022-11-27 20:41:28,531] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 
2: [2022-11-27 20:41:28,532] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 22 6: [2022-11-27 20:41:28,540] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-27 20:41:28,540] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 50 3: [2022-11-27 20:41:28,543] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-27 20:41:28,543] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 24 4: [2022-11-27 20:41:28,550] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-27 20:41:28,550] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 37 1: [2022-11-27 20:41:28,561] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,561] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 14 7: [2022-11-27 20:41:28,562] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 
7: [2022-11-27 20:41:28,562] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 63 1: [2022-11-27 20:41:28,568] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 8 7: [2022-11-27 20:41:28,572] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 56 6: [2022-11-27 20:41:28,577] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-27 20:41:28,577] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 54 3: [2022-11-27 20:41:28,585] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 28 7: [2022-11-27 20:41:28,588] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,588] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 58 2: [2022-11-27 20:41:28,589] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-27 20:41:28,589] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 23 0: [2022-11-27 20:41:28,593] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 5 5: [2022-11-27 20:41:28,604] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 
5: [2022-11-27 20:41:28,605] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 40 6: [2022-11-27 20:41:28,605] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-27 20:41:28,605] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-27 20:41:28,606] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 48 6: [2022-11-27 20:41:28,606] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 49 5: [2022-11-27 20:41:28,606] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-27 20:41:28,606] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 42 5: [2022-11-27 20:41:28,619] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-27 20:41:28,620] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 44 7: [2022-11-27 20:41:28,627] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,627] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 57 4: [2022-11-27 20:41:28,631] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 
4: [2022-11-27 20:41:28,631] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 34 2: [2022-11-27 20:41:28,633] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 2: [2022-11-27 20:41:28,633] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 16 2: [2022-11-27 20:41:28,635] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-27 20:41:28,635] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 21 7: [2022-11-27 20:41:28,638] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,638] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 62 0: [2022-11-27 20:41:28,641] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,641] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 3 2: [2022-11-27 20:41:28,642] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 22 5: [2022-11-27 20:41:28,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 
5: [2022-11-27 20:41:28,643] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 47 6: [2022-11-27 20:41:28,643] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-27 20:41:28,643] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 52 3: [2022-11-27 20:41:28,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 5: [2022-11-27 20:41:28,654] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 3: [2022-11-27 20:41:28,654] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 31 5: [2022-11-27 20:41:28,654] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 43 1: [2022-11-27 20:41:28,666] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 10 3: [2022-11-27 20:41:28,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 2: [2022-11-27 20:41:28,667] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 
3: [2022-11-27 20:41:28,667] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 30 2: [2022-11-27 20:41:28,667] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 20 7: [2022-11-27 20:41:28,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,674] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 60 7: [2022-11-27 20:41:28,674] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,675] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 61 3: [2022-11-27 20:41:28,680] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-27 20:41:28,680] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 25 1: [2022-11-27 20:41:28,686] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,686] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 12 2: [2022-11-27 20:41:28,687] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 
2: [2022-11-27 20:41:28,687] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 18 1: [2022-11-27 20:41:28,688] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,688] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 13 6: [2022-11-27 20:41:28,688] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 50 0: [2022-11-27 20:41:28,689] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,689] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 0 5: [2022-11-27 20:41:28,705] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-27 20:41:28,705] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 45 5: [2022-11-27 20:41:28,706] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-27 20:41:28,706] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 41 6: [2022-11-27 20:41:28,708] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 
6: [2022-11-27 20:41:28,708] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 55 4: [2022-11-27 20:41:28,710] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-27 20:41:28,710] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 38 4: [2022-11-27 20:41:28,711] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 37 1: [2022-11-27 20:41:28,713] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,714] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 9 3: [2022-11-27 20:41:28,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-27 20:41:28,715] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 29 3: [2022-11-27 20:41:28,715] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-27 20:41:28,715] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 27 3: [2022-11-27 20:41:28,716] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 
3: [2022-11-27 20:41:28,716] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 26 0: [2022-11-27 20:41:28,723] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,724] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 6 7: [2022-11-27 20:41:28,727] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-27 20:41:28,727] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 59 0: [2022-11-27 20:41:28,742] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,742] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 7 3: [2022-11-27 20:41:28,743] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 24 0: [2022-11-27 20:41:28,752] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,753] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 1 2: [2022-11-27 20:41:28,754] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 
2: [2022-11-27 20:41:28,754] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 17 0: [2022-11-27 20:41:28,755] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,755] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 2 2: [2022-11-27 20:41:28,761] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 23 5: [2022-11-27 20:41:28,767] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-27 20:41:28,767] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 46 1: [2022-11-27 20:41:28,769] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,770] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 15 4: [2022-11-27 20:41:28,779] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 33 4: [2022-11-27 20:41:28,780] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 
4: [2022-11-27 20:41:28,780] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 39 4: [2022-11-27 20:41:28,782] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 34 0: [2022-11-27 20:41:28,813] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 3 5: [2022-11-27 20:41:28,813] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 43 4: [2022-11-27 20:41:28,813] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-27 20:41:28,813] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 36 6: [2022-11-27 20:41:28,817] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-27 20:41:28,817] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 53 1: [2022-11-27 20:41:28,821] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 14 6: [2022-11-27 20:41:28,832] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 
6: [2022-11-27 20:41:28,832] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 51 7: [2022-11-27 20:41:28,843] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 58 3: [2022-11-27 20:41:28,844] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 30 1: [2022-11-27 20:41:28,853] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 13 2: [2022-11-27 20:41:28,860] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 20 0: [2022-11-27 20:41:28,867] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-27 20:41:28,867] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 4 4: [2022-11-27 20:41:28,868] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-27 20:41:28,868] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 35 4: [2022-11-27 20:41:28,869] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 
4: [2022-11-27 20:41:28,869] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 32 1: [2022-11-27 20:41:28,878] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 12 4: [2022-11-27 20:41:28,882] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 38 6: [2022-11-27 20:41:28,885] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 48 2: [2022-11-27 20:41:28,886] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-27 20:41:28,886] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 19 7: [2022-11-27 20:41:28,898] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 59 3: [2022-11-27 20:41:28,906] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 29 5: [2022-11-27 20:41:28,919] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 47 6: [2022-11-27 20:41:28,919] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 49 7: [2022-11-27 20:41:28,920] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 57 0: [2022-11-27 20:41:28,928] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 0 0: checkpoint version 3.0 6: [2022-11-27 20:41:28,939] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 55 6: [2022-11-27 20:41:28,940] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 52 3: [2022-11-27 20:41:28,955] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 31 2: 
[2022-11-27 20:41:28,959] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 18 1: [2022-11-27 20:41:28,961] [INFO] [torch_checkpoint_engine.py:23:load] [Torch] Loaded checkpoint from checkpoints_2b2/global_step37000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-27 20:41:28,961] [INFO] [engine.py:2844:_get_all_zero_checkpoint_state_dicts] successfully read 64 ZeRO state_dicts for rank 11 1: [2022-11-27 20:41:28,991] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 9 0: [2022-11-27 20:41:29,010] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 6 0: [2022-11-27 20:41:29,029] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 7 7: [2022-11-27 20:41:29,031] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 63 5: [2022-11-27 20:41:29,044] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 46 4: [2022-11-27 20:41:29,064] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 39 7: [2022-11-27 20:41:29,071] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 62 7: [2022-11-27 20:41:29,073] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 61 5: [2022-11-27 20:41:29,074] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 40 3: [2022-11-27 20:41:29,076] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 25 6: [2022-11-27 20:41:29,082] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 51 4: [2022-11-27 20:41:29,088] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 32 5: [2022-11-27 20:41:29,090] [INFO] 
[engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 42 0: [2022-11-27 20:41:29,102] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 2 0: [2022-11-27 20:41:29,109] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 4 3: [2022-11-27 20:41:29,112] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 26 0: [2022-11-27 20:41:29,114] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 1 5: [2022-11-27 20:41:29,115] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 44 3: [2022-11-27 20:41:29,115] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 27 2: [2022-11-27 20:41:29,116] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 17 5: [2022-11-27 20:41:29,116] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 45 1: [2022-11-27 20:41:29,129] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 15 4: [2022-11-27 20:41:29,134] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 36 1: [2022-11-27 20:41:29,134] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 11 2: [2022-11-27 20:41:29,134] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 21 6: [2022-11-27 20:41:29,140] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 54 6: [2022-11-27 20:41:29,220] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 53 5: [2022-11-27 20:41:29,243] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 41 7: [2022-11-27 20:41:29,245] [INFO] 
[engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 60
4: [2022-11-27 20:41:29,270] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 35
2: [2022-11-27 20:41:29,277] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 19
2: [2022-11-27 20:41:29,456] [INFO] [engine.py:2784:_load_zero_checkpoint] loading 64 zero partition checkpoints for rank 16
0: successfully loaded checkpoint from checkpoints_2b2 at iteration 37000
7: time (ms) | load-checkpoint: 13681.51
0: estimated model parameters: 2.160013824
0: estimated model parameters without embeddings: 2.039394816
0: [after model, optimizer, and learning rate scheduler are built] datetime: 2022-11-27 20:41:29
0: > building train, validation, and test datasets ...
0: > datasets target sizes (minimum size):
0: train: 22565693
0: validation: 23040
0: test: 512
0: > building train, validation, and test datasets for GPT ...
0: > building dataset index ...
0: reading sizes...
0: reading pointers...
0: reading document index...
0: creating numpy buffer of mmap...
0: creating memory view of numpy buffer...
0: > finished creating indexed dataset in 0.034532 seconds
0: number of documents: 210604984
0: > dataset split:
0: train:
0: document indices in [0, 199864130) total of 199864130 documents
0: validation:
0: document indices in [199864130, 210394379) total of 10530249 documents
0: test:
0: document indices in [210394379, 210604984) total of 210605 documents
0: > loading doc-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_train_indexmap_22565693ns_2048sl_1234s_doc_idx.npy
0: > loading sample-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_train_indexmap_22565693ns_2048sl_1234s_sample_idx.npy
0: > loading shuffle-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_train_indexmap_22565693ns_2048sl_1234s_shuffle_idx.npy
0: loaded indexed file in 0.095 seconds
0: total number of samples: 173377817
0: total number of epochs: 1
0: > loading doc-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_valid_indexmap_23040ns_2048sl_1234s_doc_idx.npy
0: > loading sample-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_valid_indexmap_23040ns_2048sl_1234s_sample_idx.npy
0: > loading shuffle-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_valid_indexmap_23040ns_2048sl_1234s_shuffle_idx.npy
0: loaded indexed file in 0.082 seconds
0: total number of samples: 9118345
0: total number of epochs: 1
0: > loading doc-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_test_indexmap_512ns_2048sl_1234s_doc_idx.npy
0: > loading sample-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_test_indexmap_512ns_2048sl_1234s_sample_idx.npy
0: > loading shuffle-idx mapping from /scratch/project_462000119/data/pile/megatron_data/meg-gpt2_pile_text_document_test_indexmap_512ns_2048sl_1234s_shuffle_idx.npy
0: loaded indexed file in 0.061 seconds
0: total number of samples: 182928
0: total number of epochs: 1
0: > finished creating GPT datasets ...
0: [after dataloaders are built] datetime: 2022-11-27 20:41:50
0: done with setup ...
0: training ...
0: Number of parameters: [tensor rank - pipeline rank] w/ and w/o embeddings:
7: time (ms) | model-and-optimizer-setup: 29590.15 | train/valid/test-data-iterators-setup: 19922.79
0: [000-000] 2.1600B / 2.0394B
0: [before the start of training step] datetime: 2022-11-27 20:41:50
0: [Rank 0] (after 37010 iterations) memory (MB) | allocated: 18449.37548828125 | max allocated: 52514.84375 | reserved: 59010.0 | max reserved: 59010.0
7: iteration 37010/ 44073 | consumed samples: 18949120 | consumed tokens: 38807797760 | elapsed time per iteration (s): 6.01 | learning rate: 3.139E-05 | global batch size: 512 | lm loss: 1.919727E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 85.208 | TFLOPs: 39.71 |
7: iteration 37020/ 44073 | consumed samples: 18954240 | consumed tokens: 38818283520 | elapsed time per iteration (s): 4.23 | learning rate: 3.136E-05 | global batch size: 512 | lm loss: 1.925326E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.005 | TFLOPs: 56.39 |
7: iteration 37030/ 44073 | consumed samples: 18959360 | consumed tokens: 38828769280 | elapsed time per iteration (s): 4.17 | learning rate: 3.133E-05 | global batch size: 512 | lm loss: 1.932592E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.804 | TFLOPs: 57.23 |
7: iteration 37040/ 44073 | consumed samples: 18964480 | consumed tokens: 38839255040 | elapsed time per iteration (s): 4.24 | learning rate:
3.130E-05 | global batch size: 512 | lm loss: 1.918813E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.763 | TFLOPs: 56.28 | 7: iteration 37050/ 44073 | consumed samples: 18969600 | consumed tokens: 38849740800 | elapsed time per iteration (s): 4.23 | learning rate: 3.126E-05 | global batch size: 512 | lm loss: 1.952920E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.113 | TFLOPs: 56.44 | 7: iteration 37060/ 44073 | consumed samples: 18974720 | consumed tokens: 38860226560 | elapsed time per iteration (s): 4.25 | learning rate: 3.123E-05 | global batch size: 512 | lm loss: 1.916362E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.379 | TFLOPs: 56.10 | 7: iteration 37070/ 44073 | consumed samples: 18979840 | consumed tokens: 38870712320 | elapsed time per iteration (s): 4.22 | learning rate: 3.120E-05 | global batch size: 512 | lm loss: 1.918458E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.277 | TFLOPs: 56.52 | 7: iteration 37080/ 44073 | consumed samples: 18984960 | consumed tokens: 38881198080 | elapsed time per iteration (s): 4.27 | learning rate: 3.117E-05 | global batch size: 512 | lm loss: 1.935337E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.973 | TFLOPs: 55.91 | 7: iteration 37090/ 44073 | consumed samples: 18990080 | consumed tokens: 38891683840 | elapsed time per iteration (s): 4.29 | learning rate: 3.114E-05 | global batch size: 512 | lm loss: 1.938613E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.399 | TFLOPs: 55.65 | 7: iteration 37100/ 44073 | consumed samples: 
18995200 | consumed tokens: 38902169600 | elapsed time per iteration (s): 4.25 | learning rate: 3.111E-05 | global batch size: 512 | lm loss: 1.930285E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.401 | TFLOPs: 56.11 | 7: iteration 37110/ 44073 | consumed samples: 19000320 | consumed tokens: 38912655360 | elapsed time per iteration (s): 4.20 | learning rate: 3.108E-05 | global batch size: 512 | lm loss: 1.947735E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.777 | TFLOPs: 56.75 | 7: iteration 37120/ 44073 | consumed samples: 19005440 | consumed tokens: 38923141120 | elapsed time per iteration (s): 4.20 | learning rate: 3.105E-05 | global batch size: 512 | lm loss: 1.931773E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.805 | TFLOPs: 56.77 | 7: iteration 37130/ 44073 | consumed samples: 19010560 | consumed tokens: 38933626880 | elapsed time per iteration (s): 4.22 | learning rate: 3.102E-05 | global batch size: 512 | lm loss: 1.927090E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.366 | TFLOPs: 56.56 | 7: iteration 37140/ 44073 | consumed samples: 19015680 | consumed tokens: 38944112640 | elapsed time per iteration (s): 4.21 | learning rate: 3.098E-05 | global batch size: 512 | lm loss: 1.942301E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.585 | TFLOPs: 56.66 | 7: iteration 37150/ 44073 | consumed samples: 19020800 | consumed tokens: 38954598400 | elapsed time per iteration (s): 4.20 | learning rate: 3.095E-05 | global batch size: 512 | lm loss: 1.933177E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| samples per second: 121.778 | TFLOPs: 56.75 | 7: iteration 37160/ 44073 | consumed samples: 19025920 | consumed tokens: 38965084160 | elapsed time per iteration (s): 4.19 | learning rate: 3.092E-05 | global batch size: 512 | lm loss: 1.923754E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.101 | TFLOPs: 56.91 | 7: iteration 37170/ 44073 | consumed samples: 19031040 | consumed tokens: 38975569920 | elapsed time per iteration (s): 4.21 | learning rate: 3.089E-05 | global batch size: 512 | lm loss: 1.919008E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.685 | TFLOPs: 56.71 | 7: iteration 37180/ 44073 | consumed samples: 19036160 | consumed tokens: 38986055680 | elapsed time per iteration (s): 4.18 | learning rate: 3.086E-05 | global batch size: 512 | lm loss: 1.941547E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.386 | TFLOPs: 57.04 | 7: iteration 37190/ 44073 | consumed samples: 19041280 | consumed tokens: 38996541440 | elapsed time per iteration (s): 4.24 | learning rate: 3.083E-05 | global batch size: 512 | lm loss: 1.918682E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.861 | TFLOPs: 56.33 | 7: iteration 37200/ 44073 | consumed samples: 19046400 | consumed tokens: 39007027200 | elapsed time per iteration (s): 4.24 | learning rate: 3.080E-05 | global batch size: 512 | lm loss: 1.937857E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.717 | TFLOPs: 56.26 | 7: iteration 37210/ 44073 | consumed samples: 19051520 | consumed tokens: 39017512960 | elapsed time per iteration (s): 4.24 | learning rate: 3.077E-05 | global batch size: 512 | lm loss: 1.925224E+00 | 
grad norm: 0.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.666 | TFLOPs: 56.24 | 7: iteration 37220/ 44073 | consumed samples: 19056640 | consumed tokens: 39027998720 | elapsed time per iteration (s): 4.24 | learning rate: 3.074E-05 | global batch size: 512 | lm loss: 1.935490E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.825 | TFLOPs: 56.31 | 7: iteration 37230/ 44073 | consumed samples: 19061760 | consumed tokens: 39038484480 | elapsed time per iteration (s): 4.19 | learning rate: 3.071E-05 | global batch size: 512 | lm loss: 1.922610E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.148 | TFLOPs: 56.93 | 7: iteration 37240/ 44073 | consumed samples: 19066880 | consumed tokens: 39048970240 | elapsed time per iteration (s): 4.20 | learning rate: 3.068E-05 | global batch size: 512 | lm loss: 1.941894E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.951 | TFLOPs: 56.84 | 7: iteration 37250/ 44073 | consumed samples: 19072000 | consumed tokens: 39059456000 | elapsed time per iteration (s): 4.21 | learning rate: 3.065E-05 | global batch size: 512 | lm loss: 1.940060E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.727 | TFLOPs: 56.73 | 7: iteration 37260/ 44073 | consumed samples: 19077120 | consumed tokens: 39069941760 | elapsed time per iteration (s): 4.23 | learning rate: 3.061E-05 | global batch size: 512 | lm loss: 1.946070E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.081 | TFLOPs: 56.43 | 7: iteration 37270/ 44073 | consumed samples: 19082240 | consumed tokens: 39080427520 | elapsed time per 
iteration (s): 4.18 | learning rate: 3.058E-05 | global batch size: 512 | lm loss: 1.929292E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.535 | TFLOPs: 57.11 | 7: iteration 37280/ 44073 | consumed samples: 19087360 | consumed tokens: 39090913280 | elapsed time per iteration (s): 4.24 | learning rate: 3.055E-05 | global batch size: 512 | lm loss: 1.929730E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.686 | TFLOPs: 56.25 | 7: iteration 37290/ 44073 | consumed samples: 19092480 | consumed tokens: 39101399040 | elapsed time per iteration (s): 4.20 | learning rate: 3.052E-05 | global batch size: 512 | lm loss: 1.923686E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.907 | TFLOPs: 56.81 | 7: iteration 37300/ 44073 | consumed samples: 19097600 | consumed tokens: 39111884800 | elapsed time per iteration (s): 4.21 | learning rate: 3.049E-05 | global batch size: 512 | lm loss: 1.928727E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.725 | TFLOPs: 56.73 | 7: iteration 37310/ 44073 | consumed samples: 19102720 | consumed tokens: 39122370560 | elapsed time per iteration (s): 4.18 | learning rate: 3.046E-05 | global batch size: 512 | lm loss: 1.932208E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.376 | TFLOPs: 57.03 | 7: iteration 37320/ 44073 | consumed samples: 19107840 | consumed tokens: 39132856320 | elapsed time per iteration (s): 4.27 | learning rate: 3.043E-05 | global batch size: 512 | lm loss: 1.908612E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.972 | TFLOPs: 55.91 | 7: 
iteration 37330/ 44073 | consumed samples: 19112960 | consumed tokens: 39143342080 | elapsed time per iteration (s): 4.19 | learning rate: 3.040E-05 | global batch size: 512 | lm loss: 1.924214E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.304 | TFLOPs: 57.00 | 7: iteration 37340/ 44073 | consumed samples: 19118080 | consumed tokens: 39153827840 | elapsed time per iteration (s): 4.22 | learning rate: 3.037E-05 | global batch size: 512 | lm loss: 1.939268E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.406 | TFLOPs: 56.58 | 7: iteration 37350/ 44073 | consumed samples: 19123200 | consumed tokens: 39164313600 | elapsed time per iteration (s): 4.22 | learning rate: 3.034E-05 | global batch size: 512 | lm loss: 1.926481E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.364 | TFLOPs: 56.56 | 7: iteration 37360/ 44073 | consumed samples: 19128320 | consumed tokens: 39174799360 | elapsed time per iteration (s): 4.19 | learning rate: 3.031E-05 | global batch size: 512 | lm loss: 1.948712E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.081 | TFLOPs: 56.90 | 7: iteration 37370/ 44073 | consumed samples: 19133440 | consumed tokens: 39185285120 | elapsed time per iteration (s): 4.24 | learning rate: 3.028E-05 | global batch size: 512 | lm loss: 1.913013E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.811 | TFLOPs: 56.30 | 7: iteration 37380/ 44073 | consumed samples: 19138560 | consumed tokens: 39195770880 | elapsed time per iteration (s): 4.25 | learning rate: 3.025E-05 | global batch size: 512 | lm loss: 1.927390E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | samples per second: 120.445 | TFLOPs: 56.13 | 7: iteration 37390/ 44073 | consumed samples: 19143680 | consumed tokens: 39206256640 | elapsed time per iteration (s): 4.28 | learning rate: 3.022E-05 | global batch size: 512 | lm loss: 1.923930E+00 | grad norm: 0.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.696 | TFLOPs: 55.78 | 7: iteration 37400/ 44073 | consumed samples: 19148800 | consumed tokens: 39216742400 | elapsed time per iteration (s): 4.21 | learning rate: 3.019E-05 | global batch size: 512 | lm loss: 1.923522E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.484 | TFLOPs: 56.62 | 7: iteration 37410/ 44073 | consumed samples: 19153920 | consumed tokens: 39227228160 | elapsed time per iteration (s): 4.25 | learning rate: 3.016E-05 | global batch size: 512 | lm loss: 1.928330E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.555 | TFLOPs: 56.18 | 7: iteration 37420/ 44073 | consumed samples: 19159040 | consumed tokens: 39237713920 | elapsed time per iteration (s): 4.17 | learning rate: 3.013E-05 | global batch size: 512 | lm loss: 1.943831E+00 | grad norm: 0.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.755 | TFLOPs: 57.21 | 7: iteration 37430/ 44073 | consumed samples: 19164160 | consumed tokens: 39248199680 | elapsed time per iteration (s): 4.17 | learning rate: 3.010E-05 | global batch size: 512 | lm loss: 1.914619E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.660 | TFLOPs: 57.17 | 7: iteration 37440/ 44073 | consumed samples: 19169280 | consumed tokens: 39258685440 | elapsed time per iteration (s): 4.26 | learning rate: 3.007E-05 | global 
batch size: 512 | lm loss: 1.923057E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.106 | TFLOPs: 55.98 | 7: iteration 37450/ 44073 | consumed samples: 19174400 | consumed tokens: 39269171200 | elapsed time per iteration (s): 4.19 | learning rate: 3.004E-05 | global batch size: 512 | lm loss: 1.937003E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.073 | TFLOPs: 56.89 | 7: iteration 37460/ 44073 | consumed samples: 19179520 | consumed tokens: 39279656960 | elapsed time per iteration (s): 4.16 | learning rate: 3.001E-05 | global batch size: 512 | lm loss: 1.968548E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.025 | TFLOPs: 57.34 | 7: iteration 37470/ 44073 | consumed samples: 19184640 | consumed tokens: 39290142720 | elapsed time per iteration (s): 4.20 | learning rate: 2.998E-05 | global batch size: 512 | lm loss: 1.942165E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.904 | TFLOPs: 56.81 | 7: iteration 37480/ 44073 | consumed samples: 19189760 | consumed tokens: 39300628480 | elapsed time per iteration (s): 6.21 | learning rate: 2.995E-05 | global batch size: 512 | lm loss: 1.920093E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 82.496 | TFLOPs: 38.45 | 7: iteration 37490/ 44073 | consumed samples: 19194880 | consumed tokens: 39311114240 | elapsed time per iteration (s): 4.18 | learning rate: 2.992E-05 | global batch size: 512 | lm loss: 1.967134E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.438 | TFLOPs: 57.06 | 7: iteration 37500/ 44073 | consumed samples: 19200000 | consumed 
tokens: 39321600000 | elapsed time per iteration (s): 4.23 | learning rate: 2.989E-05 | global batch size: 512 | lm loss: 1.953555E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.032 | TFLOPs: 56.41 | 7: iteration 37510/ 44073 | consumed samples: 19205120 | consumed tokens: 39332085760 | elapsed time per iteration (s): 4.22 | learning rate: 2.986E-05 | global batch size: 512 | lm loss: 1.901050E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.274 | TFLOPs: 56.52 | 7: iteration 37520/ 44073 | consumed samples: 19210240 | consumed tokens: 39342571520 | elapsed time per iteration (s): 4.21 | learning rate: 2.983E-05 | global batch size: 512 | lm loss: 1.929166E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.547 | TFLOPs: 56.65 | 7: iteration 37530/ 44073 | consumed samples: 19215360 | consumed tokens: 39353057280 | elapsed time per iteration (s): 4.23 | learning rate: 2.981E-05 | global batch size: 512 | lm loss: 1.916693E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.087 | TFLOPs: 56.43 | 7: iteration 37540/ 44073 | consumed samples: 19220480 | consumed tokens: 39363543040 | elapsed time per iteration (s): 4.15 | learning rate: 2.978E-05 | global batch size: 512 | lm loss: 1.922630E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.491 | TFLOPs: 57.55 | 7: iteration 37550/ 44073 | consumed samples: 19225600 | consumed tokens: 39374028800 | elapsed time per iteration (s): 4.17 | learning rate: 2.975E-05 | global batch size: 512 | lm loss: 1.912852E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per 
second: 122.860 | TFLOPs: 57.26 | 7: iteration 37560/ 44073 | consumed samples: 19230720 | consumed tokens: 39384514560 | elapsed time per iteration (s): 4.23 | learning rate: 2.972E-05 | global batch size: 512 | lm loss: 1.911393E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.137 | TFLOPs: 56.46 | 7: iteration 37570/ 44073 | consumed samples: 19235840 | consumed tokens: 39395000320 | elapsed time per iteration (s): 4.21 | learning rate: 2.969E-05 | global batch size: 512 | lm loss: 1.932866E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.690 | TFLOPs: 56.71 | 7: iteration 37580/ 44073 | consumed samples: 19240960 | consumed tokens: 39405486080 | elapsed time per iteration (s): 4.21 | learning rate: 2.966E-05 | global batch size: 512 | lm loss: 1.935366E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.747 | TFLOPs: 56.74 | 7: iteration 37590/ 44073 | consumed samples: 19246080 | consumed tokens: 39415971840 | elapsed time per iteration (s): 4.18 | learning rate: 2.963E-05 | global batch size: 512 | lm loss: 1.936556E+00 | grad norm: 0.146 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.343 | TFLOPs: 57.02 | 7: iteration 37600/ 44073 | consumed samples: 19251200 | consumed tokens: 39426457600 | elapsed time per iteration (s): 4.19 | learning rate: 2.960E-05 | global batch size: 512 | lm loss: 1.929664E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.149 | TFLOPs: 56.93 | 7: iteration 37610/ 44073 | consumed samples: 19256320 | consumed tokens: 39436943360 | elapsed time per iteration (s): 4.16 | learning rate: 2.957E-05 | global batch size: 512 | lm loss: 1.958692E+00 | grad norm: 0.125 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.957 | TFLOPs: 57.30 | 7: iteration 37620/ 44073 | consumed samples: 19261440 | consumed tokens: 39447429120 | elapsed time per iteration (s): 4.20 | learning rate: 2.954E-05 | global batch size: 512 | lm loss: 1.948177E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.868 | TFLOPs: 56.80 | 7: iteration 37630/ 44073 | consumed samples: 19266560 | consumed tokens: 39457914880 | elapsed time per iteration (s): 4.19 | learning rate: 2.951E-05 | global batch size: 512 | lm loss: 1.923254E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.289 | TFLOPs: 56.99 | 7: iteration 37640/ 44073 | consumed samples: 19271680 | consumed tokens: 39468400640 | elapsed time per iteration (s): 4.36 | learning rate: 2.948E-05 | global batch size: 512 | lm loss: 1.933539E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 117.337 | TFLOPs: 54.69 | 7: iteration 37650/ 44073 | consumed samples: 19276800 | consumed tokens: 39478886400 | elapsed time per iteration (s): 4.20 | learning rate: 2.946E-05 | global batch size: 512 | lm loss: 1.953507E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.771 | TFLOPs: 56.75 | 7: iteration 37660/ 44073 | consumed samples: 19281920 | consumed tokens: 39489372160 | elapsed time per iteration (s): 4.16 | learning rate: 2.943E-05 | global batch size: 512 | lm loss: 1.902370E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.031 | TFLOPs: 57.34 | 7: iteration 37670/ 44073 | consumed samples: 19287040 | consumed tokens: 39499857920 | elapsed time per iteration (s): 4.24 
| learning rate: 2.940E-05 | global batch size: 512 | lm loss: 1.913028E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.675 | TFLOPs: 56.24 | 7: iteration 37680/ 44073 | consumed samples: 19292160 | consumed tokens: 39510343680 | elapsed time per iteration (s): 4.18 | learning rate: 2.937E-05 | global batch size: 512 | lm loss: 1.912066E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.543 | TFLOPs: 57.11 | 7: iteration 37690/ 44073 | consumed samples: 19297280 | consumed tokens: 39520829440 | elapsed time per iteration (s): 4.19 | learning rate: 2.934E-05 | global batch size: 512 | lm loss: 1.921490E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.177 | TFLOPs: 56.94 | 7: iteration 37700/ 44073 | consumed samples: 19302400 | consumed tokens: 39531315200 | elapsed time per iteration (s): 4.23 | learning rate: 2.931E-05 | global batch size: 512 | lm loss: 1.938074E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.967 | TFLOPs: 56.38 | 7: iteration 37710/ 44073 | consumed samples: 19307520 | consumed tokens: 39541800960 | elapsed time per iteration (s): 4.21 | learning rate: 2.928E-05 | global batch size: 512 | lm loss: 1.934907E+00 | grad norm: 0.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.712 | TFLOPs: 56.72 | 7: iteration 37720/ 44073 | consumed samples: 19312640 | consumed tokens: 39552286720 | elapsed time per iteration (s): 4.15 | learning rate: 2.925E-05 | global batch size: 512 | lm loss: 1.924111E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.256 | TFLOPs: 57.44 | 7: iteration 37730/ 44073 | 
consumed samples: 19317760 | consumed tokens: 39562772480 | elapsed time per iteration (s): 4.24 | learning rate: 2.923E-05 | global batch size: 512 | lm loss: 1.925191E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.720 | TFLOPs: 56.26 | 7: iteration 37740/ 44073 | consumed samples: 19322880 | consumed tokens: 39573258240 | elapsed time per iteration (s): 4.20 | learning rate: 2.920E-05 | global batch size: 512 | lm loss: 1.912551E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.819 | TFLOPs: 56.77 | 7: iteration 37750/ 44073 | consumed samples: 19328000 | consumed tokens: 39583744000 | elapsed time per iteration (s): 4.28 | learning rate: 2.917E-05 | global batch size: 512 | lm loss: 1.935373E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.641 | TFLOPs: 55.76 | 7: iteration 37760/ 44073 | consumed samples: 19333120 | consumed tokens: 39594229760 | elapsed time per iteration (s): 4.19 | learning rate: 2.914E-05 | global batch size: 512 | lm loss: 1.926135E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.220 | TFLOPs: 56.96 | 7: iteration 37770/ 44073 | consumed samples: 19338240 | consumed tokens: 39604715520 | elapsed time per iteration (s): 4.21 | learning rate: 2.911E-05 | global batch size: 512 | lm loss: 1.915734E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.492 | TFLOPs: 56.62 | 7: iteration 37780/ 44073 | consumed samples: 19343360 | consumed tokens: 39615201280 | elapsed time per iteration (s): 4.18 | learning rate: 2.908E-05 | global batch size: 512 | lm loss: 1.926925E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of 
nan iterations: 0 | samples per second: 122.420 | TFLOPs: 57.05 | 7: iteration 37790/ 44073 | consumed samples: 19348480 | consumed tokens: 39625687040 | elapsed time per iteration (s): 4.22 | learning rate: 2.905E-05 | global batch size: 512 | lm loss: 1.942606E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.381 | TFLOPs: 56.57 | 7: iteration 37800/ 44073 | consumed samples: 19353600 | consumed tokens: 39636172800 | elapsed time per iteration (s): 4.21 | learning rate: 2.903E-05 | global batch size: 512 | lm loss: 1.928906E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.752 | TFLOPs: 56.74 | 7: iteration 37810/ 44073 | consumed samples: 19358720 | consumed tokens: 39646658560 | elapsed time per iteration (s): 4.20 | learning rate: 2.900E-05 | global batch size: 512 | lm loss: 1.916573E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.901 | TFLOPs: 56.81 | 7: iteration 37820/ 44073 | consumed samples: 19363840 | consumed tokens: 39657144320 | elapsed time per iteration (s): 4.15 | learning rate: 2.897E-05 | global batch size: 512 | lm loss: 1.944027E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.280 | TFLOPs: 57.45 | 7: iteration 37830/ 44073 | consumed samples: 19368960 | consumed tokens: 39667630080 | elapsed time per iteration (s): 4.18 | learning rate: 2.894E-05 | global batch size: 512 | lm loss: 1.929021E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.634 | TFLOPs: 57.15 | 7: iteration 37840/ 44073 | consumed samples: 19374080 | consumed tokens: 39678115840 | elapsed time per iteration (s): 4.24 | learning rate: 2.891E-05 | global batch size: 512 | lm loss: 
1.922675E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.644 | TFLOPs: 56.23 | 7: iteration 37850/ 44073 | consumed samples: 19379200 | consumed tokens: 39688601600 | elapsed time per iteration (s): 4.17 | learning rate: 2.889E-05 | global batch size: 512 | lm loss: 1.921364E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.777 | TFLOPs: 57.22 | 7: iteration 37860/ 44073 | consumed samples: 19384320 | consumed tokens: 39699087360 | elapsed time per iteration (s): 4.14 | learning rate: 2.886E-05 | global batch size: 512 | lm loss: 1.909801E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.593 | TFLOPs: 57.60 | 7: iteration 37870/ 44073 | consumed samples: 19389440 | consumed tokens: 39709573120 | elapsed time per iteration (s): 4.16 | learning rate: 2.883E-05 | global batch size: 512 | lm loss: 1.917954E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.179 | TFLOPs: 57.41 | 7: iteration 37880/ 44073 | consumed samples: 19394560 | consumed tokens: 39720058880 | elapsed time per iteration (s): 4.18 | learning rate: 2.880E-05 | global batch size: 512 | lm loss: 1.927843E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.603 | TFLOPs: 57.14 | 7: iteration 37890/ 44073 | consumed samples: 19399680 | consumed tokens: 39730544640 | elapsed time per iteration (s): 4.19 | learning rate: 2.877E-05 | global batch size: 512 | lm loss: 1.943332E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.070 | TFLOPs: 56.89 | 7: iteration 37900/ 44073 | consumed samples: 19404800 | consumed tokens: 39741030400 | 
elapsed time per iteration (s): 4.41 | learning rate: 2.875E-05 | global batch size: 512 | lm loss: 1.911366E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 116.092 | TFLOPs: 54.10 | 7: iteration 37910/ 44073 | consumed samples: 19409920 | consumed tokens: 39751516160 | elapsed time per iteration (s): 4.28 | learning rate: 2.872E-05 | global batch size: 512 | lm loss: 1.925274E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.734 | TFLOPs: 55.80 | 7: iteration 37920/ 44073 | consumed samples: 19415040 | consumed tokens: 39762001920 | elapsed time per iteration (s): 4.32 | learning rate: 2.869E-05 | global batch size: 512 | lm loss: 1.938769E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.583 | TFLOPs: 55.27 | 7: iteration 37930/ 44073 | consumed samples: 19420160 | consumed tokens: 39772487680 | elapsed time per iteration (s): 4.21 | learning rate: 2.866E-05 | global batch size: 512 | lm loss: 1.928872E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.667 | TFLOPs: 56.70 | 7: iteration 37940/ 44073 | consumed samples: 19425280 | consumed tokens: 39782973440 | elapsed time per iteration (s): 4.16 | learning rate: 2.863E-05 | global batch size: 512 | lm loss: 1.932419E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.050 | TFLOPs: 57.35 | 7: iteration 37950/ 44073 | consumed samples: 19430400 | consumed tokens: 39793459200 | elapsed time per iteration (s): 4.24 | learning rate: 2.861E-05 | global batch size: 512 | lm loss: 1.923417E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.794 | TFLOPs: 
56.30 | 7: iteration 37960/ 44073 | consumed samples: 19435520 | consumed tokens: 39803944960 | elapsed time per iteration (s): 4.22 | learning rate: 2.858E-05 | global batch size: 512 | lm loss: 1.927261E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.377 | TFLOPs: 56.57 | 7: iteration 37970/ 44073 | consumed samples: 19440640 | consumed tokens: 39814430720 | elapsed time per iteration (s): 4.18 | learning rate: 2.855E-05 | global batch size: 512 | lm loss: 1.904337E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.485 | TFLOPs: 57.08 | 7: iteration 37980/ 44073 | consumed samples: 19445760 | consumed tokens: 39824916480 | elapsed time per iteration (s): 4.19 | learning rate: 2.852E-05 | global batch size: 512 | lm loss: 1.929602E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.340 | TFLOPs: 57.02 | 7: iteration 37990/ 44073 | consumed samples: 19450880 | consumed tokens: 39835402240 | elapsed time per iteration (s): 4.20 | learning rate: 2.850E-05 | global batch size: 512 | lm loss: 1.947832E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.951 | TFLOPs: 56.84 | 0: [2022-11-27 21:52:41,327] [INFO] [logging.py:68:log_dist] [Rank 0] step=38000, skipped=0, lr=[2.8469299105434998e-05, 2.8469299105434998e-05, 2.8469299105434998e-05], mom=[(0.9, 0.999), (0.9, 0.999), (0.9, 0.999)] 0: steps: 38000 loss: 1.8531 iter time (s): 2.124 samples/sec: 241.097 7: iteration 38000/ 44073 | consumed samples: 19456000 | consumed tokens: 39845888000 | elapsed time per iteration (s): 4.20 | learning rate: 2.847E-05 | global batch size: 512 | lm loss: 1.925055E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | 
samples per second: 121.988 | TFLOPs: 56.85 | 7: ------------------------------------------------------------------------------------------- 7: valid loss at iteration 38000 | lm loss value: 1.914729E+00 | lm loss PPL: 6.785099E+00 | 7: ------------------------------------------------------------------------------------------- 0: saving checkpoint at iteration 38000 to checkpoints_2b2 0: [2022-11-27 21:52:43,657] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step38000 is begin to save! 0: [2022-11-27 21:52:43,874] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_01-model_00-model_states.pt... 0: [2022-11-27 21:52:44,252] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_01-model_00-model_states.pt. 0: [2022-11-27 21:52:44,253] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_03-model_00-model_states.pt... 0: [2022-11-27 21:52:44,402] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_03-model_00-model_states.pt. 0: [2022-11-27 21:52:44,403] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_04-model_00-model_states.pt... 0: [2022-11-27 21:52:44,546] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_04-model_00-model_states.pt. 0: [2022-11-27 21:52:44,546] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_05-model_00-model_states.pt... 0: [2022-11-27 21:52:44,687] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_05-model_00-model_states.pt. 0: [2022-11-27 21:52:44,687] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_06-model_00-model_states.pt... 
0: [2022-11-27 21:52:44,829] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_06-model_00-model_states.pt. 0: [2022-11-27 21:52:44,830] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_07-model_00-model_states.pt... 0: [2022-11-27 21:52:44,968] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_07-model_00-model_states.pt. 0: [2022-11-27 21:52:44,968] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_08-model_00-model_states.pt... 0: [2022-11-27 21:52:45,113] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_08-model_00-model_states.pt. 0: [2022-11-27 21:52:45,114] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_09-model_00-model_states.pt... 0: [2022-11-27 21:52:45,259] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_09-model_00-model_states.pt. 0: [2022-11-27 21:52:45,259] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_10-model_00-model_states.pt... 0: [2022-11-27 21:52:45,403] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_10-model_00-model_states.pt. 0: [2022-11-27 21:52:45,404] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_11-model_00-model_states.pt... 0: [2022-11-27 21:52:45,546] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_11-model_00-model_states.pt. 0: [2022-11-27 21:52:45,547] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_12-model_00-model_states.pt... 
0: [2022-11-27 21:52:45,685] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_12-model_00-model_states.pt. 0: [2022-11-27 21:52:45,685] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_13-model_00-model_states.pt... 0: [2022-11-27 21:52:45,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_13-model_00-model_states.pt. 0: [2022-11-27 21:52:45,830] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_14-model_00-model_states.pt... 0: [2022-11-27 21:52:45,967] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_14-model_00-model_states.pt. 0: [2022-11-27 21:52:45,967] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_15-model_00-model_states.pt... 0: [2022-11-27 21:52:46,107] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_15-model_00-model_states.pt. 0: [2022-11-27 21:52:46,107] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_16-model_00-model_states.pt... 0: [2022-11-27 21:52:46,242] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_16-model_00-model_states.pt. 0: [2022-11-27 21:52:46,243] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_17-model_00-model_states.pt... 0: [2022-11-27 21:52:46,382] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_17-model_00-model_states.pt. 0: [2022-11-27 21:52:46,383] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_18-model_00-model_states.pt... 
0: [2022-11-27 21:52:46,520] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_18-model_00-model_states.pt. 0: [2022-11-27 21:52:46,521] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_19-model_00-model_states.pt... 0: [2022-11-27 21:52:46,656] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_19-model_00-model_states.pt. 0: [2022-11-27 21:52:46,657] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_20-model_00-model_states.pt... 0: [2022-11-27 21:52:46,794] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_20-model_00-model_states.pt. 0: [2022-11-27 21:52:46,795] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_21-model_00-model_states.pt... 0: [2022-11-27 21:52:46,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_21-model_00-model_states.pt. 0: [2022-11-27 21:52:46,933] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_22-model_00-model_states.pt... 0: [2022-11-27 21:52:47,073] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_22-model_00-model_states.pt. 0: [2022-11-27 21:52:47,073] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_23-model_00-model_states.pt... 0: [2022-11-27 21:52:47,209] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_23-model_00-model_states.pt. 0: [2022-11-27 21:52:47,210] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_24-model_00-model_states.pt... 
0: [2022-11-27 21:52:47,349] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_24-model_00-model_states.pt. 0: [2022-11-27 21:52:47,349] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_25-model_00-model_states.pt... 0: [2022-11-27 21:52:47,488] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_25-model_00-model_states.pt. 0: [2022-11-27 21:52:47,488] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_26-model_00-model_states.pt... 0: [2022-11-27 21:52:47,623] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_26-model_00-model_states.pt. 0: [2022-11-27 21:52:47,623] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_27-model_00-model_states.pt... 0: [2022-11-27 21:52:47,762] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_27-model_00-model_states.pt. 0: [2022-11-27 21:52:47,763] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_28-model_00-model_states.pt... 0: [2022-11-27 21:52:47,900] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_28-model_00-model_states.pt. 0: [2022-11-27 21:52:47,901] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_29-model_00-model_states.pt... 0: [2022-11-27 21:52:48,037] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_29-model_00-model_states.pt. 0: [2022-11-27 21:52:48,038] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_30-model_00-model_states.pt... 
0: [2022-11-27 21:52:48,177] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_30-model_00-model_states.pt. 0: [2022-11-27 21:52:48,177] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_31-model_00-model_states.pt... 0: [2022-11-27 21:52:48,313] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_31-model_00-model_states.pt. 0: [2022-11-27 21:52:48,314] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_32-model_00-model_states.pt... 0: [2022-11-27 21:52:48,448] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_32-model_00-model_states.pt. 0: [2022-11-27 21:52:48,448] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_33-model_00-model_states.pt... 0: [2022-11-27 21:52:48,588] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_33-model_00-model_states.pt. 0: [2022-11-27 21:52:48,589] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_34-model_00-model_states.pt... 0: [2022-11-27 21:52:48,723] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_34-model_00-model_states.pt. 0: [2022-11-27 21:52:48,724] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/layer_36-model_00-model_states.pt... 0: [2022-11-27 21:52:48,725] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/layer_36-model_00-model_states.pt. 
0: [2022-11-27 21:52:48,726] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step38000/mp_rank_00_model_states.pt 0: [2022-11-27 21:52:48,726] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/mp_rank_00_model_states.pt... 0: [2022-11-27 21:52:48,732] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/mp_rank_00_model_states.pt. 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 
3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 
5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 
6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 
4: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 3: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 
2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 0: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 2: [2022-11-27 21:52:48,774] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step38000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 
0: [2022-11-27 21:52:49,311] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,311] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,311] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: [2022-11-27 21:52:49,312] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,312] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,312] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: [2022-11-27 21:52:49,314] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,336] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,336] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,336] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: [2022-11-27 21:52:49,337] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 
0: [2022-11-27 21:52:49,337] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,337] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: [2022-11-27 21:52:49,349] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,349] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,349] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: [2022-11-27 21:52:49,389] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,389] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,389] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-27 21:52:49,390] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: [2022-11-27 21:52:49,390] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:49,390] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 
2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,543] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,543] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,543] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,543] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,543] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,543] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,543] [INFO] 
[torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,543] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,565] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,565] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,565] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 2: [2022-11-27 21:52:49,565] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-27 21:52:49,565] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-27 21:52:49,565] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 
4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,604] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,604] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,604] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,604] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,604] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,604] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-27 21:52:49,605] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,605] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,605] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-27 21:52:49,605] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,605] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 4: [2022-11-27 21:52:49,605] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 
3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,737] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,737] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,737] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
3: [2022-11-27 21:52:49,737] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,737] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,737] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-27 21:52:49,738] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,738] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-27 21:52:49,738] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 3: [2022-11-27 21:52:49,738] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,806] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,806] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,806] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 1: [2022-11-27 21:52:49,806] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
1: [2022-11-27 21:52:49,806] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 1: [2022-11-27 21:52:49,806] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,807] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,807] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 1: [2022-11-27 21:52:49,825] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,825] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,825] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 1: [2022-11-27 21:52:49,826] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-27 21:52:49,826] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-27 21:52:49,826] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
6: [2022-11-27 21:52:49,858] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,858] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,858] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,858] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,859] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,859] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,859] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,859] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,859] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 6: [2022-11-27 21:52:49,859] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 6: [2022-11-27 21:52:49,859] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 6: [2022-11-27 21:52:49,859] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
6: [2022-11-27 21:52:49,864] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,864] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,864] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 6: [2022-11-27 21:52:49,864] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,864] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,864] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 6: [2022-11-27 21:52:49,865] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,865] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,865] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 6: [2022-11-27 21:52:49,865] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-27 21:52:49,866] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-27 21:52:49,866] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 5: [2022-11-27 21:52:49,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-27 21:52:49,893] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-27 21:52:49,894] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-27 21:52:49,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 7: [2022-11-27 21:52:49,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 7: [2022-11-27 21:52:49,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-27 21:52:49,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 
0: [2022-11-27 21:52:50,006] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step38000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-27 21:52:50,006] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step38000 is ready now! 0: successfully saved checkpoint at iteration 38000 to checkpoints_2b2 7: time (ms) | save-checkpoint: 6453.89 7: iteration 38010/ 44073 | consumed samples: 19461120 | consumed tokens: 39856373760 | elapsed time per iteration (s): 5.06 | learning rate: 2.844E-05 | global batch size: 512 | lm loss: 1.915955E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 101.237 | TFLOPs: 47.18 | 7: iteration 38020/ 44073 | consumed samples: 19466240 | consumed tokens: 39866859520 | elapsed time per iteration (s): 4.22 | learning rate: 2.841E-05 | global batch size: 512 | lm loss: 1.925037E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.248 | TFLOPs: 56.51 | 7: iteration 38030/ 44073 | consumed samples: 19471360 | consumed tokens: 39877345280 | elapsed time per iteration (s): 4.21 | learning rate: 2.839E-05 | global batch size: 512 | lm loss: 1.942294E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.695 | TFLOPs: 56.72 | 7: iteration 38040/ 44073 | consumed samples: 19476480 | consumed tokens: 39887831040 | elapsed time per iteration (s): 4.22 | learning rate: 2.836E-05 | global batch size: 512 | lm loss: 1.928801E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.443 | TFLOPs: 56.60 | 7: iteration 38050/ 44073 | consumed samples: 19481600 | consumed tokens: 39898316800 | elapsed time per iteration (s): 4.19 | learning rate: 2.833E-05 | global batch size: 512 | lm 
loss: 1.934750E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.174 | TFLOPs: 56.94 | 7: iteration 38060/ 44073 | consumed samples: 19486720 | consumed tokens: 39908802560 | elapsed time per iteration (s): 4.22 | learning rate: 2.831E-05 | global batch size: 512 | lm loss: 1.922846E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.214 | TFLOPs: 56.49 | 7: iteration 38070/ 44073 | consumed samples: 19491840 | consumed tokens: 39919288320 | elapsed time per iteration (s): 4.48 | learning rate: 2.828E-05 | global batch size: 512 | lm loss: 1.937464E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 114.167 | TFLOPs: 53.21 | 7: iteration 38080/ 44073 | consumed samples: 19496960 | consumed tokens: 39929774080 | elapsed time per iteration (s): 4.22 | learning rate: 2.825E-05 | global batch size: 512 | lm loss: 1.947312E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.347 | TFLOPs: 56.55 | 7: iteration 38090/ 44073 | consumed samples: 19502080 | consumed tokens: 39940259840 | elapsed time per iteration (s): 4.20 | learning rate: 2.822E-05 | global batch size: 512 | lm loss: 1.930539E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.906 | TFLOPs: 56.81 | 7: iteration 38100/ 44073 | consumed samples: 19507200 | consumed tokens: 39950745600 | elapsed time per iteration (s): 4.18 | learning rate: 2.820E-05 | global batch size: 512 | lm loss: 1.931160E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.369 | TFLOPs: 57.03 | 7: iteration 38110/ 44073 | consumed samples: 19512320 | consumed tokens: 39961231360 | 
elapsed time per iteration (s): 4.19 | learning rate: 2.817E-05 | global batch size: 512 | lm loss: 1.922984E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.202 | TFLOPs: 56.95 | 7: iteration 38120/ 44073 | consumed samples: 19517440 | consumed tokens: 39971717120 | elapsed time per iteration (s): 4.21 | learning rate: 2.814E-05 | global batch size: 512 | lm loss: 1.934668E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.748 | TFLOPs: 56.74 | 7: iteration 38130/ 44073 | consumed samples: 19522560 | consumed tokens: 39982202880 | elapsed time per iteration (s): 4.17 | learning rate: 2.812E-05 | global batch size: 512 | lm loss: 1.918768E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.855 | TFLOPs: 57.26 | 7: iteration 38140/ 44073 | consumed samples: 19527680 | consumed tokens: 39992688640 | elapsed time per iteration (s): 4.17 | learning rate: 2.809E-05 | global batch size: 512 | lm loss: 1.918797E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.911 | TFLOPs: 57.28 | 7: iteration 38150/ 44073 | consumed samples: 19532800 | consumed tokens: 40003174400 | elapsed time per iteration (s): 4.20 | learning rate: 2.806E-05 | global batch size: 512 | lm loss: 1.919645E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.974 | TFLOPs: 56.85 | 7: iteration 38160/ 44073 | consumed samples: 19537920 | consumed tokens: 40013660160 | elapsed time per iteration (s): 4.15 | learning rate: 2.804E-05 | global batch size: 512 | lm loss: 1.918158E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.400 | TFLOPs: 
57.51 | 7: iteration 38170/ 44073 | consumed samples: 19543040 | consumed tokens: 40024145920 | elapsed time per iteration (s): 4.16 | learning rate: 2.801E-05 | global batch size: 512 | lm loss: 1.966207E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.073 | TFLOPs: 57.36 | 7: iteration 38180/ 44073 | consumed samples: 19548160 | consumed tokens: 40034631680 | elapsed time per iteration (s): 4.20 | learning rate: 2.798E-05 | global batch size: 512 | lm loss: 1.934475E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.922 | TFLOPs: 56.82 | 7: iteration 38190/ 44073 | consumed samples: 19553280 | consumed tokens: 40045117440 | elapsed time per iteration (s): 4.17 | learning rate: 2.796E-05 | global batch size: 512 | lm loss: 1.922328E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.812 | TFLOPs: 57.24 | 7: iteration 38200/ 44073 | consumed samples: 19558400 | consumed tokens: 40055603200 | elapsed time per iteration (s): 4.14 | learning rate: 2.793E-05 | global batch size: 512 | lm loss: 1.920555E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.605 | TFLOPs: 57.61 | 7: iteration 38210/ 44073 | consumed samples: 19563520 | consumed tokens: 40066088960 | elapsed time per iteration (s): 4.20 | learning rate: 2.790E-05 | global batch size: 512 | lm loss: 1.942775E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.844 | TFLOPs: 56.79 | 7: iteration 38220/ 44073 | consumed samples: 19568640 | consumed tokens: 40076574720 | elapsed time per iteration (s): 4.20 | learning rate: 2.788E-05 | global batch size: 512 | lm loss: 1.924950E+00 | grad norm: 0.126 | num zeros: 0.0 | number 
of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.952 | TFLOPs: 56.84 | 7: iteration 38230/ 44073 | consumed samples: 19573760 | consumed tokens: 40087060480 | elapsed time per iteration (s): 4.15 | learning rate: 2.785E-05 | global batch size: 512 | lm loss: 1.925434E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.301 | TFLOPs: 57.46 | 7: iteration 38240/ 44073 | consumed samples: 19578880 | consumed tokens: 40097546240 | elapsed time per iteration (s): 4.15 | learning rate: 2.782E-05 | global batch size: 512 | lm loss: 1.929439E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.265 | TFLOPs: 57.45 | 7: iteration 38250/ 44073 | consumed samples: 19584000 | consumed tokens: 40108032000 | elapsed time per iteration (s): 4.17 | learning rate: 2.780E-05 | global batch size: 512 | lm loss: 1.903180E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.755 | TFLOPs: 57.21 | 7: iteration 38260/ 44073 | consumed samples: 19589120 | consumed tokens: 40118517760 | elapsed time per iteration (s): 4.19 | learning rate: 2.777E-05 | global batch size: 512 | lm loss: 1.927915E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.112 | TFLOPs: 56.91 | 7: iteration 38270/ 44073 | consumed samples: 19594240 | consumed tokens: 40129003520 | elapsed time per iteration (s): 4.19 | learning rate: 2.774E-05 | global batch size: 512 | lm loss: 1.922294E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.167 | TFLOPs: 56.94 | 7: iteration 38280/ 44073 | consumed samples: 19599360 | consumed tokens: 40139489280 | elapsed time per iteration (s): 4.15 | learning rate: 2.772E-05 
| global batch size: 512 | lm loss: 1.942444E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.431 | TFLOPs: 57.53 | 7: iteration 38290/ 44073 | consumed samples: 19604480 | consumed tokens: 40149975040 | elapsed time per iteration (s): 4.17 | learning rate: 2.769E-05 | global batch size: 512 | lm loss: 1.938389E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.658 | TFLOPs: 57.16 | 7: iteration 38300/ 44073 | consumed samples: 19609600 | consumed tokens: 40160460800 | elapsed time per iteration (s): 4.16 | learning rate: 2.767E-05 | global batch size: 512 | lm loss: 1.911879E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.001 | TFLOPs: 57.32 | 7: iteration 38310/ 44073 | consumed samples: 19614720 | consumed tokens: 40170946560 | elapsed time per iteration (s): 4.16 | learning rate: 2.764E-05 | global batch size: 512 | lm loss: 1.923204E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.117 | TFLOPs: 57.38 | 7: iteration 38320/ 44073 | consumed samples: 19619840 | consumed tokens: 40181432320 | elapsed time per iteration (s): 4.17 | learning rate: 2.761E-05 | global batch size: 512 | lm loss: 1.946447E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.850 | TFLOPs: 57.25 | 7: iteration 38330/ 44073 | consumed samples: 19624960 | consumed tokens: 40191918080 | elapsed time per iteration (s): 4.21 | learning rate: 2.759E-05 | global batch size: 512 | lm loss: 1.922754E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.649 | TFLOPs: 56.69 | 7: iteration 38340/ 44073 | consumed samples: 19630080 | 
consumed tokens: 40202403840 | elapsed time per iteration (s): 4.19 | learning rate: 2.756E-05 | global batch size: 512 | lm loss: 1.916012E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.279 | TFLOPs: 56.99 | 7: iteration 38350/ 44073 | consumed samples: 19635200 | consumed tokens: 40212889600 | elapsed time per iteration (s): 4.17 | learning rate: 2.753E-05 | global batch size: 512 | lm loss: 1.921148E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.864 | TFLOPs: 57.26 | 7: iteration 38360/ 44073 | consumed samples: 19640320 | consumed tokens: 40223375360 | elapsed time per iteration (s): 4.19 | learning rate: 2.751E-05 | global batch size: 512 | lm loss: 1.906605E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.071 | TFLOPs: 56.89 | 7: iteration 38370/ 44073 | consumed samples: 19645440 | consumed tokens: 40233861120 | elapsed time per iteration (s): 4.15 | learning rate: 2.748E-05 | global batch size: 512 | lm loss: 1.939078E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.322 | TFLOPs: 57.47 | 7: iteration 38380/ 44073 | consumed samples: 19650560 | consumed tokens: 40244346880 | elapsed time per iteration (s): 4.17 | learning rate: 2.746E-05 | global batch size: 512 | lm loss: 1.935259E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.797 | TFLOPs: 57.23 | 7: iteration 38390/ 44073 | consumed samples: 19655680 | consumed tokens: 40254832640 | elapsed time per iteration (s): 4.24 | learning rate: 2.743E-05 | global batch size: 512 | lm loss: 1.918626E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples 
per second: 120.646 | TFLOPs: 56.23 | 7: iteration 38400/ 44073 | consumed samples: 19660800 | consumed tokens: 40265318400 | elapsed time per iteration (s): 4.18 | learning rate: 2.741E-05 | global batch size: 512 | lm loss: 1.916468E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.620 | TFLOPs: 57.15 | 7: iteration 38410/ 44073 | consumed samples: 19665920 | consumed tokens: 40275804160 | elapsed time per iteration (s): 4.20 | learning rate: 2.738E-05 | global batch size: 512 | lm loss: 1.920531E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.012 | TFLOPs: 56.86 | 7: iteration 38420/ 44073 | consumed samples: 19671040 | consumed tokens: 40286289920 | elapsed time per iteration (s): 4.19 | learning rate: 2.735E-05 | global batch size: 512 | lm loss: 1.937711E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.136 | TFLOPs: 56.92 | 7: iteration 38430/ 44073 | consumed samples: 19676160 | consumed tokens: 40296775680 | elapsed time per iteration (s): 4.18 | learning rate: 2.733E-05 | global batch size: 512 | lm loss: 1.931218E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.510 | TFLOPs: 57.10 | 7: iteration 38440/ 44073 | consumed samples: 19681280 | consumed tokens: 40307261440 | elapsed time per iteration (s): 4.15 | learning rate: 2.730E-05 | global batch size: 512 | lm loss: 1.940355E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.264 | TFLOPs: 57.45 | 7: iteration 38450/ 44073 | consumed samples: 19686400 | consumed tokens: 40317747200 | elapsed time per iteration (s): 4.22 | learning rate: 2.728E-05 | global batch size: 512 | lm loss: 1.916402E+00 | grad norm: 
0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.334 | TFLOPs: 56.55 | 7: iteration 38460/ 44073 | consumed samples: 19691520 | consumed tokens: 40328232960 | elapsed time per iteration (s): 4.23 | learning rate: 2.725E-05 | global batch size: 512 | lm loss: 1.945276E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.170 | TFLOPs: 56.47 | 7: iteration 38470/ 44073 | consumed samples: 19696640 | consumed tokens: 40338718720 | elapsed time per iteration (s): 4.19 | learning rate: 2.723E-05 | global batch size: 512 | lm loss: 1.941551E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.166 | TFLOPs: 56.94 | 7: iteration 38480/ 44073 | consumed samples: 19701760 | consumed tokens: 40349204480 | elapsed time per iteration (s): 4.13 | learning rate: 2.720E-05 | global batch size: 512 | lm loss: 1.928037E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.847 | TFLOPs: 57.72 | 7: iteration 38490/ 44073 | consumed samples: 19706880 | consumed tokens: 40359690240 | elapsed time per iteration (s): 4.17 | learning rate: 2.718E-05 | global batch size: 512 | lm loss: 1.928116E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.802 | TFLOPs: 57.23 | 7: iteration 38500/ 44073 | consumed samples: 19712000 | consumed tokens: 40370176000 | elapsed time per iteration (s): 4.19 | learning rate: 2.715E-05 | global batch size: 512 | lm loss: 1.933409E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.146 | TFLOPs: 56.93 | 7: iteration 38510/ 44073 | consumed samples: 19717120 | consumed tokens: 40380661760 | elapsed time per iteration (s): 
4.23 | learning rate: 2.712E-05 | global batch size: 512 | lm loss: 1.918883E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.034 | TFLOPs: 56.41 | 7: iteration 38520/ 44073 | consumed samples: 19722240 | consumed tokens: 40391147520 | elapsed time per iteration (s): 4.15 | learning rate: 2.710E-05 | global batch size: 512 | lm loss: 1.927224E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.339 | TFLOPs: 57.48 | 7: iteration 38530/ 44073 | consumed samples: 19727360 | consumed tokens: 40401633280 | elapsed time per iteration (s): 4.16 | learning rate: 2.707E-05 | global batch size: 512 | lm loss: 1.927451E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.028 | TFLOPs: 57.34 | 7: iteration 38540/ 44073 | consumed samples: 19732480 | consumed tokens: 40412119040 | elapsed time per iteration (s): 4.13 | learning rate: 2.705E-05 | global batch size: 512 | lm loss: 1.926416E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.898 | TFLOPs: 57.74 | 7: iteration 38550/ 44073 | consumed samples: 19737600 | consumed tokens: 40422604800 | elapsed time per iteration (s): 4.19 | learning rate: 2.702E-05 | global batch size: 512 | lm loss: 1.927271E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.134 | TFLOPs: 56.92 | 7: iteration 38560/ 44073 | consumed samples: 19742720 | consumed tokens: 40433090560 | elapsed time per iteration (s): 4.19 | learning rate: 2.700E-05 | global batch size: 512 | lm loss: 1.935426E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.089 | TFLOPs: 56.90 | 7: iteration 38570/ 44073 
| consumed samples: 19747840 | consumed tokens: 40443576320 | elapsed time per iteration (s): 4.15 | learning rate: 2.697E-05 | global batch size: 512 | lm loss: 1.925167E+00 | grad norm: 0.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.298 | TFLOPs: 57.46 | 7: iteration 38580/ 44073 | consumed samples: 19752960 | consumed tokens: 40454062080 | elapsed time per iteration (s): 4.15 | learning rate: 2.695E-05 | global batch size: 512 | lm loss: 1.921805E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.354 | TFLOPs: 57.49 | 7: iteration 38590/ 44073 | consumed samples: 19758080 | consumed tokens: 40464547840 | elapsed time per iteration (s): 4.16 | learning rate: 2.692E-05 | global batch size: 512 | lm loss: 1.924649E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.213 | TFLOPs: 57.42 | 7: iteration 38600/ 44073 | consumed samples: 19763200 | consumed tokens: 40475033600 | elapsed time per iteration (s): 4.16 | learning rate: 2.690E-05 | global batch size: 512 | lm loss: 1.908679E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.984 | TFLOPs: 57.32 | 7: iteration 38610/ 44073 | consumed samples: 19768320 | consumed tokens: 40485519360 | elapsed time per iteration (s): 4.18 | learning rate: 2.687E-05 | global batch size: 512 | lm loss: 1.923865E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.590 | TFLOPs: 57.13 | 7: iteration 38620/ 44073 | consumed samples: 19773440 | consumed tokens: 40496005120 | elapsed time per iteration (s): 4.21 | learning rate: 2.685E-05 | global batch size: 512 | lm loss: 1.935399E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number 
of nan iterations: 0 | samples per second: 121.729 | TFLOPs: 56.73 | 7: iteration 38630/ 44073 | consumed samples: 19778560 | consumed tokens: 40506490880 | elapsed time per iteration (s): 4.16 | learning rate: 2.682E-05 | global batch size: 512 | lm loss: 1.927016E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.934 | TFLOPs: 57.29 | 7: iteration 38640/ 44073 | consumed samples: 19783680 | consumed tokens: 40516976640 | elapsed time per iteration (s): 4.20 | learning rate: 2.680E-05 | global batch size: 512 | lm loss: 1.927870E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.954 | TFLOPs: 56.84 | 7: iteration 38650/ 44073 | consumed samples: 19788800 | consumed tokens: 40527462400 | elapsed time per iteration (s): 4.22 | learning rate: 2.678E-05 | global batch size: 512 | lm loss: 1.947433E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.196 | TFLOPs: 56.48 | 7: iteration 38660/ 44073 | consumed samples: 19793920 | consumed tokens: 40537948160 | elapsed time per iteration (s): 4.19 | learning rate: 2.675E-05 | global batch size: 512 | lm loss: 1.919592E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.161 | TFLOPs: 56.93 | 7: iteration 38670/ 44073 | consumed samples: 19799040 | consumed tokens: 40548433920 | elapsed time per iteration (s): 4.17 | learning rate: 2.673E-05 | global batch size: 512 | lm loss: 1.942091E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.792 | TFLOPs: 57.23 | 7: iteration 38680/ 44073 | consumed samples: 19804160 | consumed tokens: 40558919680 | elapsed time per iteration (s): 4.14 | learning rate: 2.670E-05 | global batch size: 512 | lm 
loss: 1.924893E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.599 | TFLOPs: 57.60 | 7: iteration 38690/ 44073 | consumed samples: 19809280 | consumed tokens: 40569405440 | elapsed time per iteration (s): 4.19 | learning rate: 2.668E-05 | global batch size: 512 | lm loss: 1.937934E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.276 | TFLOPs: 56.99 | 7: iteration 38700/ 44073 | consumed samples: 19814400 | consumed tokens: 40579891200 | elapsed time per iteration (s): 4.14 | learning rate: 2.665E-05 | global batch size: 512 | lm loss: 1.921766E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.763 | TFLOPs: 57.68 | 7: iteration 38710/ 44073 | consumed samples: 19819520 | consumed tokens: 40590376960 | elapsed time per iteration (s): 4.18 | learning rate: 2.663E-05 | global batch size: 512 | lm loss: 1.913663E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.378 | TFLOPs: 57.03 | 7: iteration 38720/ 44073 | consumed samples: 19824640 | consumed tokens: 40600862720 | elapsed time per iteration (s): 4.16 | learning rate: 2.660E-05 | global batch size: 512 | lm loss: 1.927184E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.030 | TFLOPs: 57.34 | 7: iteration 38730/ 44073 | consumed samples: 19829760 | consumed tokens: 40611348480 | elapsed time per iteration (s): 4.16 | learning rate: 2.658E-05 | global batch size: 512 | lm loss: 1.903275E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.015 | TFLOPs: 57.33 | 7: iteration 38740/ 44073 | consumed samples: 19834880 | consumed tokens: 40621834240 | 
elapsed time per iteration (s): 4.19 | learning rate: 2.656E-05 | global batch size: 512 | lm loss: 1.930011E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.128 | TFLOPs: 56.92 | 7: iteration 38750/ 44073 | consumed samples: 19840000 | consumed tokens: 40632320000 | elapsed time per iteration (s): 4.17 | learning rate: 2.653E-05 | global batch size: 512 | lm loss: 1.923574E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.805 | TFLOPs: 57.23 | 7: iteration 38760/ 44073 | consumed samples: 19845120 | consumed tokens: 40642805760 | elapsed time per iteration (s): 4.14 | learning rate: 2.651E-05 | global batch size: 512 | lm loss: 1.937728E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.605 | TFLOPs: 57.61 | 7: iteration 38770/ 44073 | consumed samples: 19850240 | consumed tokens: 40653291520 | elapsed time per iteration (s): 4.18 | learning rate: 2.648E-05 | global batch size: 512 | lm loss: 1.921254E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.589 | TFLOPs: 57.13 | 7: iteration 38780/ 44073 | consumed samples: 19855360 | consumed tokens: 40663777280 | elapsed time per iteration (s): 4.21 | learning rate: 2.646E-05 | global batch size: 512 | lm loss: 1.922249E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.729 | TFLOPs: 56.73 | 7: iteration 38790/ 44073 | consumed samples: 19860480 | consumed tokens: 40674263040 | elapsed time per iteration (s): 4.16 | learning rate: 2.643E-05 | global batch size: 512 | lm loss: 1.945758E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.056 | TFLOPs: 
57.35 | 7: iteration 38800/ 44073 | consumed samples: 19865600 | consumed tokens: 40684748800 | elapsed time per iteration (s): 4.17 | learning rate: 2.641E-05 | global batch size: 512 | lm loss: 1.918064E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.846 | TFLOPs: 57.25 | 7: iteration 38810/ 44073 | consumed samples: 19870720 | consumed tokens: 40695234560 | elapsed time per iteration (s): 4.19 | learning rate: 2.639E-05 | global batch size: 512 | lm loss: 1.930983E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.175 | TFLOPs: 56.94 | 7: iteration 38820/ 44073 | consumed samples: 19875840 | consumed tokens: 40705720320 | elapsed time per iteration (s): 4.18 | learning rate: 2.636E-05 | global batch size: 512 | lm loss: 1.929119E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.486 | TFLOPs: 57.08 | 7: iteration 38830/ 44073 | consumed samples: 19880960 | consumed tokens: 40716206080 | elapsed time per iteration (s): 4.21 | learning rate: 2.634E-05 | global batch size: 512 | lm loss: 1.935892E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.576 | TFLOPs: 56.66 | 7: iteration 38840/ 44073 | consumed samples: 19886080 | consumed tokens: 40726691840 | elapsed time per iteration (s): 4.24 | learning rate: 2.631E-05 | global batch size: 512 | lm loss: 1.918581E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.841 | TFLOPs: 56.32 | 7: iteration 38850/ 44073 | consumed samples: 19891200 | consumed tokens: 40737177600 | elapsed time per iteration (s): 4.28 | learning rate: 2.629E-05 | global batch size: 512 | lm loss: 1.918236E+00 | grad norm: 0.123 | num zeros: 0.0 | number 
of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.670 | TFLOPs: 55.77 | 7: iteration 38860/ 44073 | consumed samples: 19896320 | consumed tokens: 40747663360 | elapsed time per iteration (s): 4.19 | learning rate: 2.627E-05 | global batch size: 512 | lm loss: 1.936051E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.119 | TFLOPs: 56.91 | 7: iteration 38870/ 44073 | consumed samples: 19901440 | consumed tokens: 40758149120 | elapsed time per iteration (s): 4.24 | learning rate: 2.624E-05 | global batch size: 512 | lm loss: 1.902833E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.626 | TFLOPs: 56.22 | 7: iteration 38880/ 44073 | consumed samples: 19906560 | consumed tokens: 40768634880 | elapsed time per iteration (s): 4.26 | learning rate: 2.622E-05 | global batch size: 512 | lm loss: 1.918412E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.059 | TFLOPs: 55.95 | 7: iteration 38890/ 44073 | consumed samples: 19911680 | consumed tokens: 40779120640 | elapsed time per iteration (s): 4.31 | learning rate: 2.620E-05 | global batch size: 512 | lm loss: 1.925860E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.735 | TFLOPs: 55.34 | 7: iteration 38900/ 44073 | consumed samples: 19916800 | consumed tokens: 40789606400 | elapsed time per iteration (s): 4.22 | learning rate: 2.617E-05 | global batch size: 512 | lm loss: 1.925715E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.421 | TFLOPs: 56.59 | 7: iteration 38910/ 44073 | consumed samples: 19921920 | consumed tokens: 40800092160 | elapsed time per iteration (s): 4.21 | learning rate: 2.615E-05 
| global batch size: 512 | lm loss: 1.902934E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.527 | TFLOPs: 56.64 | 7: iteration 38920/ 44073 | consumed samples: 19927040 | consumed tokens: 40810577920 | elapsed time per iteration (s): 4.20 | learning rate: 2.613E-05 | global batch size: 512 | lm loss: 1.932163E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.936 | TFLOPs: 56.83 | 7: iteration 38930/ 44073 | consumed samples: 19932160 | consumed tokens: 40821063680 | elapsed time per iteration (s): 4.30 | learning rate: 2.610E-05 | global batch size: 512 | lm loss: 1.920609E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.062 | TFLOPs: 55.49 | 7: iteration 38940/ 44073 | consumed samples: 19937280 | consumed tokens: 40831549440 | elapsed time per iteration (s): 4.29 | learning rate: 2.608E-05 | global batch size: 512 | lm loss: 1.933862E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.267 | TFLOPs: 55.58 | 7: iteration 38950/ 44073 | consumed samples: 19942400 | consumed tokens: 40842035200 | elapsed time per iteration (s): 4.24 | learning rate: 2.605E-05 | global batch size: 512 | lm loss: 1.925715E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.727 | TFLOPs: 56.26 | 7: iteration 38960/ 44073 | consumed samples: 19947520 | consumed tokens: 40852520960 | elapsed time per iteration (s): 4.21 | learning rate: 2.603E-05 | global batch size: 512 | lm loss: 1.920710E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.528 | TFLOPs: 56.64 | 7: iteration 38970/ 44073 | consumed samples: 19952640 | 
consumed tokens: 40863006720 | elapsed time per iteration (s): 4.21 | learning rate: 2.601E-05 | global batch size: 512 | lm loss: 1.909791E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.624 | TFLOPs: 56.68 |
7: iteration 38980/ 44073 | consumed samples: 19957760 | consumed tokens: 40873492480 | elapsed time per iteration (s): 4.20 | learning rate: 2.599E-05 | global batch size: 512 | lm loss: 1.937242E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.827 | TFLOPs: 56.78 |
7: iteration 38990/ 44073 | consumed samples: 19962880 | consumed tokens: 40883978240 | elapsed time per iteration (s): 4.18 | learning rate: 2.596E-05 | global batch size: 512 | lm loss: 1.924417E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.598 | TFLOPs: 57.14 |
7: iteration 39000/ 44073 | consumed samples: 19968000 | consumed tokens: 40894464000 | elapsed time per iteration (s): 4.22 | learning rate: 2.594E-05 | global batch size: 512 | lm loss: 1.938122E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.443 | TFLOPs: 56.60 |
7: -------------------------------------------------------------------------------------------
7: valid loss at iteration 39000 | lm loss value: 1.886089E+00 | lm loss PPL: 6.593531E+00 |
7: -------------------------------------------------------------------------------------------
0: saving checkpoint at iteration 39000 to checkpoints_2b2
0: [2022-11-27 23:02:44,372] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step39000 is begin to save!
0: [2022-11-27 23:02:44,379] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_01-model_00-model_states.pt...
0: [2022-11-27 23:02:44,732] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_01-model_00-model_states.pt. 0: [2022-11-27 23:02:44,733] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_03-model_00-model_states.pt... 0: [2022-11-27 23:02:44,873] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_03-model_00-model_states.pt. 0: [2022-11-27 23:02:44,874] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_04-model_00-model_states.pt... 0: [2022-11-27 23:02:45,020] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_04-model_00-model_states.pt. 0: [2022-11-27 23:02:45,020] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_05-model_00-model_states.pt... 0: [2022-11-27 23:02:45,161] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_05-model_00-model_states.pt. 0: [2022-11-27 23:02:45,162] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_06-model_00-model_states.pt... 0: [2022-11-27 23:02:45,306] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_06-model_00-model_states.pt. 0: [2022-11-27 23:02:45,306] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_07-model_00-model_states.pt... 0: [2022-11-27 23:02:45,448] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_07-model_00-model_states.pt. 0: [2022-11-27 23:02:45,448] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_08-model_00-model_states.pt... 
0: [2022-11-27 23:02:45,593] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_08-model_00-model_states.pt. 0: [2022-11-27 23:02:45,593] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_09-model_00-model_states.pt... 0: [2022-11-27 23:02:45,737] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_09-model_00-model_states.pt. 0: [2022-11-27 23:02:45,737] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_10-model_00-model_states.pt... 0: [2022-11-27 23:02:45,876] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_10-model_00-model_states.pt. 0: [2022-11-27 23:02:45,877] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_11-model_00-model_states.pt... 0: [2022-11-27 23:02:46,022] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_11-model_00-model_states.pt. 0: [2022-11-27 23:02:46,023] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_12-model_00-model_states.pt... 0: [2022-11-27 23:02:46,162] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_12-model_00-model_states.pt. 0: [2022-11-27 23:02:46,163] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_13-model_00-model_states.pt... 0: [2022-11-27 23:02:46,300] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_13-model_00-model_states.pt. 0: [2022-11-27 23:02:46,301] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_14-model_00-model_states.pt... 
0: [2022-11-27 23:02:46,438] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_14-model_00-model_states.pt. 0: [2022-11-27 23:02:46,439] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_15-model_00-model_states.pt... 0: [2022-11-27 23:02:46,580] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_15-model_00-model_states.pt. 0: [2022-11-27 23:02:46,580] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_16-model_00-model_states.pt... 0: [2022-11-27 23:02:46,717] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_16-model_00-model_states.pt. 0: [2022-11-27 23:02:46,717] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_17-model_00-model_states.pt... 0: [2022-11-27 23:02:46,855] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_17-model_00-model_states.pt. 0: [2022-11-27 23:02:46,855] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_18-model_00-model_states.pt... 0: [2022-11-27 23:02:46,992] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_18-model_00-model_states.pt. 0: [2022-11-27 23:02:46,993] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_19-model_00-model_states.pt... 0: [2022-11-27 23:02:47,133] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_19-model_00-model_states.pt. 0: [2022-11-27 23:02:47,133] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_20-model_00-model_states.pt... 
0: [2022-11-27 23:02:47,267] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_20-model_00-model_states.pt. 0: [2022-11-27 23:02:47,268] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_21-model_00-model_states.pt... 0: [2022-11-27 23:02:47,405] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_21-model_00-model_states.pt. 0: [2022-11-27 23:02:47,406] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_22-model_00-model_states.pt... 0: [2022-11-27 23:02:47,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_22-model_00-model_states.pt. 0: [2022-11-27 23:02:47,543] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_23-model_00-model_states.pt... 0: [2022-11-27 23:02:47,679] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_23-model_00-model_states.pt. 0: [2022-11-27 23:02:47,679] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_24-model_00-model_states.pt... 0: [2022-11-27 23:02:47,816] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_24-model_00-model_states.pt. 0: [2022-11-27 23:02:47,816] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_25-model_00-model_states.pt... 0: [2022-11-27 23:02:48,082] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_25-model_00-model_states.pt. 0: [2022-11-27 23:02:48,083] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_26-model_00-model_states.pt... 
0: [2022-11-27 23:02:48,217] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_26-model_00-model_states.pt. 0: [2022-11-27 23:02:48,218] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_27-model_00-model_states.pt... 0: [2022-11-27 23:02:48,357] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_27-model_00-model_states.pt. 0: [2022-11-27 23:02:48,357] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_28-model_00-model_states.pt... 0: [2022-11-27 23:02:48,488] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_28-model_00-model_states.pt. 0: [2022-11-27 23:02:48,488] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_29-model_00-model_states.pt... 0: [2022-11-27 23:02:48,623] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_29-model_00-model_states.pt. 0: [2022-11-27 23:02:48,624] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_30-model_00-model_states.pt... 0: [2022-11-27 23:02:48,761] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_30-model_00-model_states.pt. 0: [2022-11-27 23:02:48,761] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_31-model_00-model_states.pt... 0: [2022-11-27 23:02:48,893] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_31-model_00-model_states.pt. 0: [2022-11-27 23:02:48,893] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_32-model_00-model_states.pt... 
0: [2022-11-27 23:02:49,028] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_32-model_00-model_states.pt. 0: [2022-11-27 23:02:49,029] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_33-model_00-model_states.pt... 0: [2022-11-27 23:02:49,162] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_33-model_00-model_states.pt. 0: [2022-11-27 23:02:49,162] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_34-model_00-model_states.pt... 0: [2022-11-27 23:02:49,297] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_34-model_00-model_states.pt. 0: [2022-11-27 23:02:49,298] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/layer_36-model_00-model_states.pt... 0: [2022-11-27 23:02:49,304] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/layer_36-model_00-model_states.pt. 0: [2022-11-27 23:02:49,305] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step39000/mp_rank_00_model_states.pt 0: [2022-11-27 23:02:49,305] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/mp_rank_00_model_states.pt... 0: [2022-11-27 23:02:49,311] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/mp_rank_00_model_states.pt. 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 
0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 
6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 
1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 
2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 
6: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 
7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 3: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 1: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 7: [2022-11-27 23:02:49,333] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step39000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 0: [2022-11-27 23:02:49,862] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,863] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,863] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 0: [2022-11-27 23:02:49,866] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 
0: [2022-11-27 23:02:49,874] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,874] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,874] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 0: [2022-11-27 23:02:49,884] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,884] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,884] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 0: [2022-11-27 23:02:49,884] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,884] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,884] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 0: [2022-11-27 23:02:49,890] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,890] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,890] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
4: [2022-11-27 23:02:49,894] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,894] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,894] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 4: [2022-11-27 23:02:49,899] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,899] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,899] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 4: [2022-11-27 23:02:49,900] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,900] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,900] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 4: [2022-11-27 23:02:49,918] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,919] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,919] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
0: [2022-11-27 23:02:49,923] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,923] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,923] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 0: [2022-11-27 23:02:49,925] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:49,925] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:49,925] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 4: [2022-11-27 23:02:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 4: [2022-11-27 23:02:49,932] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
4: [2022-11-27 23:02:49,932] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,932] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 4: [2022-11-27 23:02:49,933] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-27 23:02:49,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-27 23:02:49,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 
1: [2022-11-27 23:02:50,028] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,028] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,028] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-27 23:02:50,028] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,028] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 
1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-27 23:02:50,029] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,029] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,029] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,029] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 1: [2022-11-27 23:02:50,029] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 2: [2022-11-27 23:02:50,257] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,257] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,257] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
2: [2022-11-27 23:02:50,258] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,258] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,258] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 2: [2022-11-27 23:02:50,259] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,259] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,259] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 2: [2022-11-27 23:02:50,259] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,259] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,259] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,260] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,260] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,260] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 2: [2022-11-27 23:02:50,260] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-27 23:02:50,260] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,266] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,267] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,267] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,267] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,267] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,267] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,267] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
5: [2022-11-27 23:02:50,277] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,277] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,278] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,278] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 0: [2022-11-27 23:02:50,281] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-27 23:02:50,281] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,278] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,278] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 5: [2022-11-27 23:02:50,278] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-27 23:02:50,278] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-27 23:02:50,278] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 
7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-27 23:02:50,325] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,325] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,325] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,325] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,325] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 
3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,343] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 3: [2022-11-27 23:02:50,343] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,351] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 
7: [2022-11-27 23:02:50,351] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,351] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,351] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-27 23:02:50,352] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,352] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,352] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-27 23:02:50,352] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,352] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 7: [2022-11-27 23:02:50,353] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-27 23:02:50,353] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-27 23:02:50,353] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 
6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 
6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,598] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 6: [2022-11-27 23:02:50,601] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-27 23:02:50,601] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step39000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-27 23:02:50,601] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step39000 is ready now! 
0: successfully saved checkpoint at iteration 39000 to checkpoints_2b2 7: time (ms) | save-checkpoint: 6254.65 7: iteration 39010/ 44073 | consumed samples: 19973120 | consumed tokens: 40904949760 | elapsed time per iteration (s): 4.95 | learning rate: 2.592E-05 | global batch size: 512 | lm loss: 1.916953E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 103.460 | TFLOPs: 48.22 | 7: iteration 39020/ 44073 | consumed samples: 19978240 | consumed tokens: 40915435520 | elapsed time per iteration (s): 4.18 | learning rate: 2.589E-05 | global batch size: 512 | lm loss: 1.926159E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.348 | TFLOPs: 57.02 | 7: iteration 39030/ 44073 | consumed samples: 19983360 | consumed tokens: 40925921280 | elapsed time per iteration (s): 4.22 | learning rate: 2.587E-05 | global batch size: 512 | lm loss: 1.921272E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.274 | TFLOPs: 56.52 | 7: iteration 39040/ 44073 | consumed samples: 19988480 | consumed tokens: 40936407040 | elapsed time per iteration (s): 4.22 | learning rate: 2.585E-05 | global batch size: 512 | lm loss: 1.939381E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.296 | TFLOPs: 56.53 | 7: iteration 39050/ 44073 | consumed samples: 19993600 | consumed tokens: 40946892800 | elapsed time per iteration (s): 4.18 | learning rate: 2.582E-05 | global batch size: 512 | lm loss: 1.924385E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.624 | TFLOPs: 57.15 | 7: iteration 39060/ 44073 | consumed samples: 19998720 | consumed tokens: 40957378560 | elapsed time per iteration (s): 4.16 | learning rate: 
2.580E-05 | global batch size: 512 | lm loss: 1.930425E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.943 | TFLOPs: 57.30 | 7: iteration 39070/ 44073 | consumed samples: 20003840 | consumed tokens: 40967864320 | elapsed time per iteration (s): 4.19 | learning rate: 2.578E-05 | global batch size: 512 | lm loss: 1.927873E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.191 | TFLOPs: 56.95 | 7: iteration 39080/ 44073 | consumed samples: 20008960 | consumed tokens: 40978350080 | elapsed time per iteration (s): 4.21 | learning rate: 2.575E-05 | global batch size: 512 | lm loss: 1.910813E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.571 | TFLOPs: 56.66 | 7: iteration 39090/ 44073 | consumed samples: 20014080 | consumed tokens: 40988835840 | elapsed time per iteration (s): 4.18 | learning rate: 2.573E-05 | global batch size: 512 | lm loss: 1.907556E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.504 | TFLOPs: 57.09 | 7: iteration 39100/ 44073 | consumed samples: 20019200 | consumed tokens: 40999321600 | elapsed time per iteration (s): 4.20 | learning rate: 2.571E-05 | global batch size: 512 | lm loss: 1.919905E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.902 | TFLOPs: 56.81 | 7: iteration 39110/ 44073 | consumed samples: 20024320 | consumed tokens: 41009807360 | elapsed time per iteration (s): 4.22 | learning rate: 2.569E-05 | global batch size: 512 | lm loss: 1.909152E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.205 | TFLOPs: 56.49 | 7: iteration 39120/ 44073 | consumed samples: 
20029440 | consumed tokens: 41020293120 | elapsed time per iteration (s): 4.20 | learning rate: 2.566E-05 | global batch size: 512 | lm loss: 1.921591E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.025 | TFLOPs: 56.87 | 7: iteration 39130/ 44073 | consumed samples: 20034560 | consumed tokens: 41030778880 | elapsed time per iteration (s): 4.22 | learning rate: 2.564E-05 | global batch size: 512 | lm loss: 1.936654E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.423 | TFLOPs: 56.59 | 7: iteration 39140/ 44073 | consumed samples: 20039680 | consumed tokens: 41041264640 | elapsed time per iteration (s): 4.22 | learning rate: 2.562E-05 | global batch size: 512 | lm loss: 1.925270E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.295 | TFLOPs: 56.53 | 7: iteration 39150/ 44073 | consumed samples: 20044800 | consumed tokens: 41051750400 | elapsed time per iteration (s): 4.25 | learning rate: 2.560E-05 | global batch size: 512 | lm loss: 1.918249E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.557 | TFLOPs: 56.19 | 7: iteration 39160/ 44073 | consumed samples: 20049920 | consumed tokens: 41062236160 | elapsed time per iteration (s): 4.23 | learning rate: 2.557E-05 | global batch size: 512 | lm loss: 1.926269E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.154 | TFLOPs: 56.46 | 7: iteration 39170/ 44073 | consumed samples: 20055040 | consumed tokens: 41072721920 | elapsed time per iteration (s): 4.17 | learning rate: 2.555E-05 | global batch size: 512 | lm loss: 1.924642E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| samples per second: 122.895 | TFLOPs: 57.28 | 7: iteration 39180/ 44073 | consumed samples: 20060160 | consumed tokens: 41083207680 | elapsed time per iteration (s): 4.26 | learning rate: 2.553E-05 | global batch size: 512 | lm loss: 1.912642E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.150 | TFLOPs: 56.00 | 7: iteration 39190/ 44073 | consumed samples: 20065280 | consumed tokens: 41093693440 | elapsed time per iteration (s): 4.20 | learning rate: 2.551E-05 | global batch size: 512 | lm loss: 1.935497E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.939 | TFLOPs: 56.83 | 7: iteration 39200/ 44073 | consumed samples: 20070400 | consumed tokens: 41104179200 | elapsed time per iteration (s): 4.19 | learning rate: 2.548E-05 | global batch size: 512 | lm loss: 1.905778E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.222 | TFLOPs: 56.96 | 7: iteration 39210/ 44073 | consumed samples: 20075520 | consumed tokens: 41114664960 | elapsed time per iteration (s): 4.24 | learning rate: 2.546E-05 | global batch size: 512 | lm loss: 1.932123E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.813 | TFLOPs: 56.30 | 7: iteration 39220/ 44073 | consumed samples: 20080640 | consumed tokens: 41125150720 | elapsed time per iteration (s): 4.23 | learning rate: 2.544E-05 | global batch size: 512 | lm loss: 1.927623E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.905 | TFLOPs: 56.35 | 7: iteration 39230/ 44073 | consumed samples: 20085760 | consumed tokens: 41135636480 | elapsed time per iteration (s): 4.19 | learning rate: 2.542E-05 | global batch size: 512 | lm loss: 1.917931E+00 | 
grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.183 | TFLOPs: 56.94 | 7: iteration 39240/ 44073 | consumed samples: 20090880 | consumed tokens: 41146122240 | elapsed time per iteration (s): 4.23 | learning rate: 2.540E-05 | global batch size: 512 | lm loss: 1.938144E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.982 | TFLOPs: 56.38 | 7: iteration 39250/ 44073 | consumed samples: 20096000 | consumed tokens: 41156608000 | elapsed time per iteration (s): 4.19 | learning rate: 2.537E-05 | global batch size: 512 | lm loss: 1.943262E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.181 | TFLOPs: 56.94 | 7: iteration 39260/ 44073 | consumed samples: 20101120 | consumed tokens: 41167093760 | elapsed time per iteration (s): 4.19 | learning rate: 2.535E-05 | global batch size: 512 | lm loss: 1.933616E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.245 | TFLOPs: 56.97 | 7: iteration 39270/ 44073 | consumed samples: 20106240 | consumed tokens: 41177579520 | elapsed time per iteration (s): 4.18 | learning rate: 2.533E-05 | global batch size: 512 | lm loss: 1.941059E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.383 | TFLOPs: 57.04 | 7: iteration 39280/ 44073 | consumed samples: 20111360 | consumed tokens: 41188065280 | elapsed time per iteration (s): 4.19 | learning rate: 2.531E-05 | global batch size: 512 | lm loss: 1.930526E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.204 | TFLOPs: 56.95 | 7: iteration 39290/ 44073 | consumed samples: 20116480 | consumed tokens: 41198551040 | elapsed time per 
iteration (s): 4.21 | learning rate: 2.529E-05 | global batch size: 512 | lm loss: 1.911282E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.616 | TFLOPs: 56.68 | 7: iteration 39300/ 44073 | consumed samples: 20121600 | consumed tokens: 41209036800 | elapsed time per iteration (s): 4.29 | learning rate: 2.526E-05 | global batch size: 512 | lm loss: 1.899951E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.292 | TFLOPs: 55.60 | 7: iteration 39310/ 44073 | consumed samples: 20126720 | consumed tokens: 41219522560 | elapsed time per iteration (s): 4.25 | learning rate: 2.524E-05 | global batch size: 512 | lm loss: 1.928822E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.552 | TFLOPs: 56.18 | 7: iteration 39320/ 44073 | consumed samples: 20131840 | consumed tokens: 41230008320 | elapsed time per iteration (s): 4.23 | learning rate: 2.522E-05 | global batch size: 512 | lm loss: 1.923578E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.983 | TFLOPs: 56.38 | 7: iteration 39330/ 44073 | consumed samples: 20136960 | consumed tokens: 41240494080 | elapsed time per iteration (s): 4.22 | learning rate: 2.520E-05 | global batch size: 512 | lm loss: 1.932594E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.250 | TFLOPs: 56.51 | 7: iteration 39340/ 44073 | consumed samples: 20142080 | consumed tokens: 41250979840 | elapsed time per iteration (s): 4.19 | learning rate: 2.518E-05 | global batch size: 512 | lm loss: 1.928119E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.222 | TFLOPs: 56.96 | 7: 
iteration 39350/ 44073 | consumed samples: 20147200 | consumed tokens: 41261465600 | elapsed time per iteration (s): 4.22 | learning rate: 2.516E-05 | global batch size: 512 | lm loss: 1.925199E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.459 | TFLOPs: 56.61 | 7: iteration 39360/ 44073 | consumed samples: 20152320 | consumed tokens: 41271951360 | elapsed time per iteration (s): 4.18 | learning rate: 2.513E-05 | global batch size: 512 | lm loss: 1.919434E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.360 | TFLOPs: 57.03 | 7: iteration 39370/ 44073 | consumed samples: 20157440 | consumed tokens: 41282437120 | elapsed time per iteration (s): 4.20 | learning rate: 2.511E-05 | global batch size: 512 | lm loss: 1.933350E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.960 | TFLOPs: 56.84 | 7: iteration 39380/ 44073 | consumed samples: 20162560 | consumed tokens: 41292922880 | elapsed time per iteration (s): 4.20 | learning rate: 2.509E-05 | global batch size: 512 | lm loss: 1.912759E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.932 | TFLOPs: 56.83 | 7: iteration 39390/ 44073 | consumed samples: 20167680 | consumed tokens: 41303408640 | elapsed time per iteration (s): 4.17 | learning rate: 2.507E-05 | global batch size: 512 | lm loss: 1.927806E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.816 | TFLOPs: 57.24 | 7: iteration 39400/ 44073 | consumed samples: 20172800 | consumed tokens: 41313894400 | elapsed time per iteration (s): 4.22 | learning rate: 2.505E-05 | global batch size: 512 | lm loss: 1.891027E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | samples per second: 121.470 | TFLOPs: 56.61 | 7: iteration 39410/ 44073 | consumed samples: 20177920 | consumed tokens: 41324380160 | elapsed time per iteration (s): 4.20 | learning rate: 2.503E-05 | global batch size: 512 | lm loss: 1.926397E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.968 | TFLOPs: 56.84 | 7: iteration 39420/ 44073 | consumed samples: 20183040 | consumed tokens: 41334865920 | elapsed time per iteration (s): 4.25 | learning rate: 2.500E-05 | global batch size: 512 | lm loss: 1.934025E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.557 | TFLOPs: 56.19 | 7: iteration 39430/ 44073 | consumed samples: 20188160 | consumed tokens: 41345351680 | elapsed time per iteration (s): 4.19 | learning rate: 2.498E-05 | global batch size: 512 | lm loss: 1.937855E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.221 | TFLOPs: 56.96 | 7: iteration 39440/ 44073 | consumed samples: 20193280 | consumed tokens: 41355837440 | elapsed time per iteration (s): 4.17 | learning rate: 2.496E-05 | global batch size: 512 | lm loss: 1.910222E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.731 | TFLOPs: 57.20 | 7: iteration 39450/ 44073 | consumed samples: 20198400 | consumed tokens: 41366323200 | elapsed time per iteration (s): 4.27 | learning rate: 2.494E-05 | global batch size: 512 | lm loss: 1.932644E+00 | grad norm: 0.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.841 | TFLOPs: 55.85 | 7: iteration 39460/ 44073 | consumed samples: 20203520 | consumed tokens: 41376808960 | elapsed time per iteration (s): 4.24 | learning rate: 2.492E-05 | global 
batch size: 512 | lm loss: 1.926309E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.643 | TFLOPs: 56.23 | 7: iteration 39470/ 44073 | consumed samples: 20208640 | consumed tokens: 41387294720 | elapsed time per iteration (s): 4.23 | learning rate: 2.490E-05 | global batch size: 512 | lm loss: 1.933758E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.995 | TFLOPs: 56.39 | 7: iteration 39480/ 44073 | consumed samples: 20213760 | consumed tokens: 41397780480 | elapsed time per iteration (s): 4.21 | learning rate: 2.488E-05 | global batch size: 512 | lm loss: 1.916521E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.640 | TFLOPs: 56.69 | 7: iteration 39490/ 44073 | consumed samples: 20218880 | consumed tokens: 41408266240 | elapsed time per iteration (s): 4.21 | learning rate: 2.486E-05 | global batch size: 512 | lm loss: 1.921593E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.747 | TFLOPs: 56.74 | 7: iteration 39500/ 44073 | consumed samples: 20224000 | consumed tokens: 41418752000 | elapsed time per iteration (s): 4.23 | learning rate: 2.484E-05 | global batch size: 512 | lm loss: 1.915133E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.927 | TFLOPs: 56.36 | 7: iteration 39510/ 44073 | consumed samples: 20229120 | consumed tokens: 41429237760 | elapsed time per iteration (s): 4.18 | learning rate: 2.481E-05 | global batch size: 512 | lm loss: 1.927244E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.346 | TFLOPs: 57.02 | 7: iteration 39520/ 44073 | consumed samples: 20234240 | consumed 
tokens: 41439723520 | elapsed time per iteration (s): 4.24 | learning rate: 2.479E-05 | global batch size: 512 | lm loss: 1.925750E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.880 | TFLOPs: 56.34 | 7: iteration 39530/ 44073 | consumed samples: 20239360 | consumed tokens: 41450209280 | elapsed time per iteration (s): 4.21 | learning rate: 2.477E-05 | global batch size: 512 | lm loss: 1.934600E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.746 | TFLOPs: 56.74 | 7: iteration 39540/ 44073 | consumed samples: 20244480 | consumed tokens: 41460695040 | elapsed time per iteration (s): 4.18 | learning rate: 2.475E-05 | global batch size: 512 | lm loss: 1.907695E+00 | grad norm: 0.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.531 | TFLOPs: 57.11 | 7: iteration 39550/ 44073 | consumed samples: 20249600 | consumed tokens: 41471180800 | elapsed time per iteration (s): 4.22 | learning rate: 2.473E-05 | global batch size: 512 | lm loss: 1.919822E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.214 | TFLOPs: 56.49 | 7: iteration 39560/ 44073 | consumed samples: 20254720 | consumed tokens: 41481666560 | elapsed time per iteration (s): 4.21 | learning rate: 2.471E-05 | global batch size: 512 | lm loss: 1.934052E+00 | grad norm: 0.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.739 | TFLOPs: 56.74 | 7: iteration 39570/ 44073 | consumed samples: 20259840 | consumed tokens: 41492152320 | elapsed time per iteration (s): 4.18 | learning rate: 2.469E-05 | global batch size: 512 | lm loss: 1.925095E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per 
second: 122.561 | TFLOPs: 57.12 | 7: iteration 39580/ 44073 | consumed samples: 20264960 | consumed tokens: 41502638080 | elapsed time per iteration (s): 4.20 | learning rate: 2.467E-05 | global batch size: 512 | lm loss: 1.935816E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.906 | TFLOPs: 56.81 | 7: iteration 39590/ 44073 | consumed samples: 20270080 | consumed tokens: 41513123840 | elapsed time per iteration (s): 4.25 | learning rate: 2.465E-05 | global batch size: 512 | lm loss: 1.911768E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.568 | TFLOPs: 56.19 | 7: iteration 39600/ 44073 | consumed samples: 20275200 | consumed tokens: 41523609600 | elapsed time per iteration (s): 4.22 | learning rate: 2.463E-05 | global batch size: 512 | lm loss: 1.932966E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.434 | TFLOPs: 56.59 | 7: iteration 39610/ 44073 | consumed samples: 20280320 | consumed tokens: 41534095360 | elapsed time per iteration (s): 4.23 | learning rate: 2.461E-05 | global batch size: 512 | lm loss: 1.931249E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.955 | TFLOPs: 56.37 | 7: iteration 39620/ 44073 | consumed samples: 20285440 | consumed tokens: 41544581120 | elapsed time per iteration (s): 4.26 | learning rate: 2.459E-05 | global batch size: 512 | lm loss: 1.918491E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.197 | TFLOPs: 56.02 | 7: iteration 39630/ 44073 | consumed samples: 20290560 | consumed tokens: 41555066880 | elapsed time per iteration (s): 4.16 | learning rate: 2.457E-05 | global batch size: 512 | lm loss: 1.926069E+00 | grad norm: 0.129 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.181 | TFLOPs: 57.41 | 7: iteration 39640/ 44073 | consumed samples: 20295680 | consumed tokens: 41565552640 | elapsed time per iteration (s): 4.21 | learning rate: 2.455E-05 | global batch size: 512 | lm loss: 1.929642E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.603 | TFLOPs: 56.67 | 7: iteration 39650/ 44073 | consumed samples: 20300800 | consumed tokens: 41576038400 | elapsed time per iteration (s): 4.20 | learning rate: 2.453E-05 | global batch size: 512 | lm loss: 1.938276E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.776 | TFLOPs: 56.75 | 7: iteration 39660/ 44073 | consumed samples: 20305920 | consumed tokens: 41586524160 | elapsed time per iteration (s): 4.17 | learning rate: 2.451E-05 | global batch size: 512 | lm loss: 1.900183E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.669 | TFLOPs: 57.17 | 7: iteration 39670/ 44073 | consumed samples: 20311040 | consumed tokens: 41597009920 | elapsed time per iteration (s): 4.23 | learning rate: 2.449E-05 | global batch size: 512 | lm loss: 1.932773E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.166 | TFLOPs: 56.47 | 7: iteration 39680/ 44073 | consumed samples: 20316160 | consumed tokens: 41607495680 | elapsed time per iteration (s): 4.18 | learning rate: 2.447E-05 | global batch size: 512 | lm loss: 1.932882E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.540 | TFLOPs: 57.11 | 7: iteration 39690/ 44073 | consumed samples: 20321280 | consumed tokens: 41617981440 | elapsed time per iteration (s): 4.19 
| learning rate: 2.445E-05 | global batch size: 512 | lm loss: 1.920295E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.333 | TFLOPs: 57.01 | 7: iteration 39700/ 44073 | consumed samples: 20326400 | consumed tokens: 41628467200 | elapsed time per iteration (s): 4.21 | learning rate: 2.443E-05 | global batch size: 512 | lm loss: 1.950233E+00 | grad norm: 0.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.530 | TFLOPs: 56.64 | 7: iteration 39710/ 44073 | consumed samples: 20331520 | consumed tokens: 41638952960 | elapsed time per iteration (s): 4.26 | learning rate: 2.441E-05 | global batch size: 512 | lm loss: 1.935165E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.325 | TFLOPs: 56.08 | 7: iteration 39720/ 44073 | consumed samples: 20336640 | consumed tokens: 41649438720 | elapsed time per iteration (s): 4.19 | learning rate: 2.439E-05 | global batch size: 512 | lm loss: 1.929373E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.214 | TFLOPs: 56.96 | 7: iteration 39730/ 44073 | consumed samples: 20341760 | consumed tokens: 41659924480 | elapsed time per iteration (s): 4.16 | learning rate: 2.437E-05 | global batch size: 512 | lm loss: 1.918111E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.066 | TFLOPs: 57.35 | 7: iteration 39740/ 44073 | consumed samples: 20346880 | consumed tokens: 41670410240 | elapsed time per iteration (s): 4.22 | learning rate: 2.435E-05 | global batch size: 512 | lm loss: 1.924480E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.394 | TFLOPs: 56.58 | 7: iteration 39750/ 44073 | 
consumed samples: 20352000 | consumed tokens: 41680896000 | elapsed time per iteration (s): 4.21 | learning rate: 2.433E-05 | global batch size: 512 | lm loss: 1.918319E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.580 | TFLOPs: 56.66 | 7: iteration 39760/ 44073 | consumed samples: 20357120 | consumed tokens: 41691381760 | elapsed time per iteration (s): 4.43 | learning rate: 2.431E-05 | global batch size: 512 | lm loss: 1.910712E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 115.641 | TFLOPs: 53.89 | 7: iteration 39770/ 44073 | consumed samples: 20362240 | consumed tokens: 41701867520 | elapsed time per iteration (s): 4.27 | learning rate: 2.429E-05 | global batch size: 512 | lm loss: 1.907289E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.022 | TFLOPs: 55.94 | 7: iteration 39780/ 44073 | consumed samples: 20367360 | consumed tokens: 41712353280 | elapsed time per iteration (s): 4.27 | learning rate: 2.427E-05 | global batch size: 512 | lm loss: 1.927835E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.046 | TFLOPs: 55.95 | 7: iteration 39790/ 44073 | consumed samples: 20372480 | consumed tokens: 41722839040 | elapsed time per iteration (s): 4.22 | learning rate: 2.425E-05 | global batch size: 512 | lm loss: 1.934272E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.397 | TFLOPs: 56.58 | 7: iteration 39800/ 44073 | consumed samples: 20377600 | consumed tokens: 41733324800 | elapsed time per iteration (s): 4.27 | learning rate: 2.423E-05 | global batch size: 512 | lm loss: 1.931775E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of 
nan iterations: 0 | samples per second: 119.956 | TFLOPs: 55.91 | 7: iteration 39810/ 44073 | consumed samples: 20382720 | consumed tokens: 41743810560 | elapsed time per iteration (s): 4.26 | learning rate: 2.421E-05 | global batch size: 512 | lm loss: 1.942515E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.150 | TFLOPs: 56.00 | 7: iteration 39820/ 44073 | consumed samples: 20387840 | consumed tokens: 41754296320 | elapsed time per iteration (s): 4.21 | learning rate: 2.419E-05 | global batch size: 512 | lm loss: 1.930401E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.667 | TFLOPs: 56.70 | 7: iteration 39830/ 44073 | consumed samples: 20392960 | consumed tokens: 41764782080 | elapsed time per iteration (s): 4.20 | learning rate: 2.417E-05 | global batch size: 512 | lm loss: 1.926381E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.775 | TFLOPs: 56.75 | 7: iteration 39840/ 44073 | consumed samples: 20398080 | consumed tokens: 41775267840 | elapsed time per iteration (s): 4.22 | learning rate: 2.415E-05 | global batch size: 512 | lm loss: 1.915784E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.249 | TFLOPs: 56.51 | 7: iteration 39850/ 44073 | consumed samples: 20403200 | consumed tokens: 41785753600 | elapsed time per iteration (s): 4.24 | learning rate: 2.413E-05 | global batch size: 512 | lm loss: 1.927146E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.665 | TFLOPs: 56.24 | 7: iteration 39860/ 44073 | consumed samples: 20408320 | consumed tokens: 41796239360 | elapsed time per iteration (s): 4.19 | learning rate: 2.411E-05 | global batch size: 512 | lm loss: 
1.911377E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.282 | TFLOPs: 56.99 | 7: iteration 39870/ 44073 | consumed samples: 20413440 | consumed tokens: 41806725120 | elapsed time per iteration (s): 4.30 | learning rate: 2.409E-05 | global batch size: 512 | lm loss: 1.896046E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.030 | TFLOPs: 55.47 | 7: iteration 39880/ 44073 | consumed samples: 20418560 | consumed tokens: 41817210880 | elapsed time per iteration (s): 4.18 | learning rate: 2.407E-05 | global batch size: 512 | lm loss: 1.941431E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.460 | TFLOPs: 57.07 | 7: iteration 39890/ 44073 | consumed samples: 20423680 | consumed tokens: 41827696640 | elapsed time per iteration (s): 4.18 | learning rate: 2.405E-05 | global batch size: 512 | lm loss: 1.925300E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.345 | TFLOPs: 57.02 | 7: iteration 39900/ 44073 | consumed samples: 20428800 | consumed tokens: 41838182400 | elapsed time per iteration (s): 4.26 | learning rate: 2.403E-05 | global batch size: 512 | lm loss: 1.917101E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.216 | TFLOPs: 56.03 | 7: iteration 39910/ 44073 | consumed samples: 20433920 | consumed tokens: 41848668160 | elapsed time per iteration (s): 4.21 | learning rate: 2.401E-05 | global batch size: 512 | lm loss: 1.926009E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.584 | TFLOPs: 56.66 | 7: iteration 39920/ 44073 | consumed samples: 20439040 | consumed tokens: 41859153920 | 
elapsed time per iteration (s): 4.18 | learning rate: 2.399E-05 | global batch size: 512 | lm loss: 1.917693E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.388 | TFLOPs: 57.04 | 7: iteration 39930/ 44073 | consumed samples: 20444160 | consumed tokens: 41869639680 | elapsed time per iteration (s): 4.23 | learning rate: 2.398E-05 | global batch size: 512 | lm loss: 1.887901E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.941 | TFLOPs: 56.36 | 7: iteration 39940/ 44073 | consumed samples: 20449280 | consumed tokens: 41880125440 | elapsed time per iteration (s): 4.31 | learning rate: 2.396E-05 | global batch size: 512 | lm loss: 1.918252E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.863 | TFLOPs: 55.40 | 7: iteration 39950/ 44073 | consumed samples: 20454400 | consumed tokens: 41890611200 | elapsed time per iteration (s): 4.24 | learning rate: 2.394E-05 | global batch size: 512 | lm loss: 1.935348E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.702 | TFLOPs: 56.25 | 7: iteration 39960/ 44073 | consumed samples: 20459520 | consumed tokens: 41901096960 | elapsed time per iteration (s): 4.20 | learning rate: 2.392E-05 | global batch size: 512 | lm loss: 1.916404E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.987 | TFLOPs: 56.85 | 7: iteration 39970/ 44073 | consumed samples: 20464640 | consumed tokens: 41911582720 | elapsed time per iteration (s): 4.23 | learning rate: 2.390E-05 | global batch size: 512 | lm loss: 1.903018E+00 | grad norm: 0.115 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.945 | TFLOPs: 
56.37 | 7: iteration 39980/ 44073 | consumed samples: 20469760 | consumed tokens: 41922068480 | elapsed time per iteration (s): 4.30 | learning rate: 2.388E-05 | global batch size: 512 | lm loss: 1.912196E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.956 | TFLOPs: 55.44 | 7: iteration 39990/ 44073 | consumed samples: 20474880 | consumed tokens: 41932554240 | elapsed time per iteration (s): 4.22 | learning rate: 2.386E-05 | global batch size: 512 | lm loss: 1.918019E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.376 | TFLOPs: 56.57 | 0: [2022-11-28 00:13:07,359] [INFO] [logging.py:68:log_dist] [Rank 0] step=40000, skipped=0, lr=[2.3843524443949956e-05, 2.3843524443949956e-05, 2.3843524443949956e-05], mom=[(0.9, 0.999), (0.9, 0.999), (0.9, 0.999)] 7: iteration 40000/ 44073 | consumed samples: 20480000 | consumed tokens: 41943040000 | elapsed time per iteration (s): 4.19 | learning rate: 2.384E-05 | global batch size: 512 | lm loss: 1.936924E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.188 | TFLOPs: 56.95 | 0: steps: 40000 loss: 1.9093 iter time (s): 4.202 samples/sec: 121.843 7: ------------------------------------------------------------------------------------------- 7: valid loss at iteration 40000 | lm loss value: 1.899359E+00 | lm loss PPL: 6.681613E+00 | 7: ------------------------------------------------------------------------------------------- 0: saving checkpoint at iteration 40000 to checkpoints_2b2 0: [2022-11-28 00:13:08,737] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step40000 is begin to save! 0: [2022-11-28 00:13:08,772] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_01-model_00-model_states.pt... 
0: [2022-11-28 00:13:09,239] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_01-model_00-model_states.pt. 0: [2022-11-28 00:13:09,239] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_03-model_00-model_states.pt... 0: [2022-11-28 00:13:09,380] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_03-model_00-model_states.pt. 0: [2022-11-28 00:13:09,381] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_04-model_00-model_states.pt... 0: [2022-11-28 00:13:09,524] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_04-model_00-model_states.pt. 0: [2022-11-28 00:13:09,524] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_05-model_00-model_states.pt... 0: [2022-11-28 00:13:09,665] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_05-model_00-model_states.pt. 0: [2022-11-28 00:13:09,666] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_06-model_00-model_states.pt... 0: [2022-11-28 00:13:09,811] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_06-model_00-model_states.pt. 0: [2022-11-28 00:13:09,812] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_07-model_00-model_states.pt... 0: [2022-11-28 00:13:09,956] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_07-model_00-model_states.pt. 0: [2022-11-28 00:13:09,957] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_08-model_00-model_states.pt... 
0: [2022-11-28 00:13:10,099] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_08-model_00-model_states.pt. 0: [2022-11-28 00:13:10,100] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_09-model_00-model_states.pt... 0: [2022-11-28 00:13:10,242] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_09-model_00-model_states.pt. 0: [2022-11-28 00:13:10,243] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_10-model_00-model_states.pt... 0: [2022-11-28 00:13:10,388] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_10-model_00-model_states.pt. 0: [2022-11-28 00:13:10,388] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_11-model_00-model_states.pt... 0: [2022-11-28 00:13:10,530] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_11-model_00-model_states.pt. 0: [2022-11-28 00:13:10,530] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_12-model_00-model_states.pt... 0: [2022-11-28 00:13:10,668] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_12-model_00-model_states.pt. 0: [2022-11-28 00:13:10,669] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_13-model_00-model_states.pt... 0: [2022-11-28 00:13:10,810] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_13-model_00-model_states.pt. 0: [2022-11-28 00:13:10,811] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_14-model_00-model_states.pt... 
0: [2022-11-28 00:13:10,946] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_14-model_00-model_states.pt. 0: [2022-11-28 00:13:10,946] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_15-model_00-model_states.pt... 0: [2022-11-28 00:13:11,089] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_15-model_00-model_states.pt. 0: [2022-11-28 00:13:11,089] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_16-model_00-model_states.pt... 0: [2022-11-28 00:13:11,227] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_16-model_00-model_states.pt. 0: [2022-11-28 00:13:11,227] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_17-model_00-model_states.pt... 0: [2022-11-28 00:13:11,363] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_17-model_00-model_states.pt. 0: [2022-11-28 00:13:11,363] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_18-model_00-model_states.pt... 0: [2022-11-28 00:13:11,504] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_18-model_00-model_states.pt. 0: [2022-11-28 00:13:11,505] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_19-model_00-model_states.pt... 0: [2022-11-28 00:13:11,643] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_19-model_00-model_states.pt. 0: [2022-11-28 00:13:11,643] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_20-model_00-model_states.pt... 
0: [2022-11-28 00:13:11,783] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_20-model_00-model_states.pt. 0: [2022-11-28 00:13:11,784] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_21-model_00-model_states.pt... 0: [2022-11-28 00:13:11,921] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_21-model_00-model_states.pt. 0: [2022-11-28 00:13:11,922] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_22-model_00-model_states.pt... 0: [2022-11-28 00:13:12,061] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_22-model_00-model_states.pt. 0: [2022-11-28 00:13:12,062] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_23-model_00-model_states.pt... 0: [2022-11-28 00:13:12,200] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_23-model_00-model_states.pt. 0: [2022-11-28 00:13:12,200] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_24-model_00-model_states.pt... 0: [2022-11-28 00:13:12,334] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_24-model_00-model_states.pt. 0: [2022-11-28 00:13:12,334] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_25-model_00-model_states.pt... 0: [2022-11-28 00:13:12,482] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_25-model_00-model_states.pt. 0: [2022-11-28 00:13:12,482] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_26-model_00-model_states.pt... 
0: [2022-11-28 00:13:12,615] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_26-model_00-model_states.pt. 0: [2022-11-28 00:13:12,616] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_27-model_00-model_states.pt... 0: [2022-11-28 00:13:12,753] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_27-model_00-model_states.pt. 0: [2022-11-28 00:13:12,754] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_28-model_00-model_states.pt... 0: [2022-11-28 00:13:12,889] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_28-model_00-model_states.pt. 0: [2022-11-28 00:13:12,889] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_29-model_00-model_states.pt... 0: [2022-11-28 00:13:13,022] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_29-model_00-model_states.pt. 0: [2022-11-28 00:13:13,023] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_30-model_00-model_states.pt... 0: [2022-11-28 00:13:13,158] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_30-model_00-model_states.pt. 0: [2022-11-28 00:13:13,159] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_31-model_00-model_states.pt... 0: [2022-11-28 00:13:13,300] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_31-model_00-model_states.pt. 0: [2022-11-28 00:13:13,301] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_32-model_00-model_states.pt... 
0: [2022-11-28 00:13:13,433] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_32-model_00-model_states.pt. 0: [2022-11-28 00:13:13,433] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_33-model_00-model_states.pt... 0: [2022-11-28 00:13:13,569] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_33-model_00-model_states.pt. 0: [2022-11-28 00:13:13,570] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_34-model_00-model_states.pt... 0: [2022-11-28 00:13:13,705] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_34-model_00-model_states.pt. 0: [2022-11-28 00:13:13,706] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/layer_36-model_00-model_states.pt... 0: [2022-11-28 00:13:13,709] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/layer_36-model_00-model_states.pt. 0: [2022-11-28 00:13:13,710] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step40000/mp_rank_00_model_states.pt 0: [2022-11-28 00:13:13,711] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/mp_rank_00_model_states.pt... 0: [2022-11-28 00:13:13,716] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/mp_rank_00_model_states.pt. 0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 
0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 
6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 
1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 
5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 
0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 4: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 6: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 1: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 
2: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 5: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 0: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 3: [2022-11-28 00:13:13,739] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step40000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 0: [2022-11-28 00:13:14,294] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-28 00:13:14,294] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,294] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,295] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 
0: [2022-11-28 00:13:14,295] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,295] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,295] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-28 00:13:14,295] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,295] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,327] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-28 00:13:14,327] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,327] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,327] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-28 00:13:14,328] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,328] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,340] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 
0: [2022-11-28 00:13:14,340] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,340] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,340] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-28 00:13:14,340] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,340] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,340] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,359] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,359] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,359] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 4: [2022-11-28 00:13:14,360] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,360] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,360] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
4: [2022-11-28 00:13:14,360] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,360] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,361] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 4: [2022-11-28 00:13:14,361] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,361] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,361] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 4: [2022-11-28 00:13:14,434] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,435] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,435] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 4: [2022-11-28 00:13:14,435] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,435] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,435] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
4: [2022-11-28 00:13:14,435] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,435] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,436] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 4: [2022-11-28 00:13:14,435] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-28 00:13:14,436] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-28 00:13:14,436] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 
2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,444] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 2: [2022-11-28 00:13:14,447] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-28 00:13:14,447] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-28 00:13:14,447] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-28 00:13:14,672] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,672] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 
7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,672] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-28 00:13:14,672] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,672] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,672] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,678] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-28 00:13:14,678] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,678] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,678] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 
7: [2022-11-28 00:13:14,678] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,678] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 7: [2022-11-28 00:13:14,679] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-28 00:13:14,679] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-28 00:13:14,679] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 
1: [2022-11-28 00:13:14,830] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,830] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,830] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,830] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,830] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,830] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,830] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,831] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,831] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
1: [2022-11-28 00:13:14,846] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,846] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,846] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 1: [2022-11-28 00:13:14,847] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-28 00:13:14,847] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-28 00:13:14,847] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 0: [2022-11-28 00:13:14,890] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-28 00:13:14,890] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 
5: [2022-11-28 00:13:14,993] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,993] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,993] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,993] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,993] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 
5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 5: [2022-11-28 00:13:14,994] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,994] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,994] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,994] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 5: [2022-11-28 00:13:14,994] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 
3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,009] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,009] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,009] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,009] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,009] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,009] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
3: [2022-11-28 00:13:15,023] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,023] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,023] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,023] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,024] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 3: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-28 00:13:15,024] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 
6: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,024] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,024] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,024] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,024] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,025] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-28 00:13:15,025] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step40000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 6: [2022-11-28 00:13:15,025] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step40000 is ready now! 
0: successfully saved checkpoint at iteration 40000 to checkpoints_2b2 7: time (ms) | save-checkpoint: 6326.06 7: iteration 40010/ 44073 | consumed samples: 20485120 | consumed tokens: 41953525760 | elapsed time per iteration (s): 4.96 | learning rate: 2.382E-05 | global batch size: 512 | lm loss: 1.916958E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 103.227 | TFLOPs: 48.11 | 7: iteration 40020/ 44073 | consumed samples: 20490240 | consumed tokens: 41964011520 | elapsed time per iteration (s): 4.21 | learning rate: 2.381E-05 | global batch size: 512 | lm loss: 1.933493E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.645 | TFLOPs: 56.69 | 7: iteration 40030/ 44073 | consumed samples: 20495360 | consumed tokens: 41974497280 | elapsed time per iteration (s): 4.19 | learning rate: 2.379E-05 | global batch size: 512 | lm loss: 1.919656E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.222 | TFLOPs: 56.96 | 7: iteration 40040/ 44073 | consumed samples: 20500480 | consumed tokens: 41984983040 | elapsed time per iteration (s): 4.22 | learning rate: 2.377E-05 | global batch size: 512 | lm loss: 1.923204E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.303 | TFLOPs: 56.53 | 7: iteration 40050/ 44073 | consumed samples: 20505600 | consumed tokens: 41995468800 | elapsed time per iteration (s): 4.16 | learning rate: 2.375E-05 | global batch size: 512 | lm loss: 1.919870E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.032 | TFLOPs: 57.34 | 7: iteration 40060/ 44073 | consumed samples: 20510720 | consumed tokens: 42005954560 | elapsed time per iteration (s): 4.23 | learning rate: 
2.373E-05 | global batch size: 512 | lm loss: 1.926666E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.911 | TFLOPs: 56.35 | 7: iteration 40070/ 44073 | consumed samples: 20515840 | consumed tokens: 42016440320 | elapsed time per iteration (s): 4.19 | learning rate: 2.371E-05 | global batch size: 512 | lm loss: 1.920619E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.050 | TFLOPs: 56.88 | 7: iteration 40080/ 44073 | consumed samples: 20520960 | consumed tokens: 42026926080 | elapsed time per iteration (s): 4.24 | learning rate: 2.370E-05 | global batch size: 512 | lm loss: 1.916684E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.878 | TFLOPs: 56.34 | 7: iteration 40090/ 44073 | consumed samples: 20526080 | consumed tokens: 42037411840 | elapsed time per iteration (s): 4.25 | learning rate: 2.368E-05 | global batch size: 512 | lm loss: 1.921488E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.483 | TFLOPs: 56.15 | 7: iteration 40100/ 44073 | consumed samples: 20531200 | consumed tokens: 42047897600 | elapsed time per iteration (s): 4.18 | learning rate: 2.366E-05 | global batch size: 512 | lm loss: 1.915179E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.361 | TFLOPs: 57.03 | 7: iteration 40110/ 44073 | consumed samples: 20536320 | consumed tokens: 42058383360 | elapsed time per iteration (s): 4.25 | learning rate: 2.364E-05 | global batch size: 512 | lm loss: 1.916899E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.330 | TFLOPs: 56.08 | 7: iteration 40120/ 44073 | consumed samples: 
20541440 | consumed tokens: 42068869120 | elapsed time per iteration (s): 4.22 | learning rate: 2.362E-05 | global batch size: 512 | lm loss: 1.907224E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.385 | TFLOPs: 56.57 | 7: iteration 40130/ 44073 | consumed samples: 20546560 | consumed tokens: 42079354880 | elapsed time per iteration (s): 4.23 | learning rate: 2.360E-05 | global batch size: 512 | lm loss: 1.933221E+00 | grad norm: 0.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.084 | TFLOPs: 56.43 | 7: iteration 40140/ 44073 | consumed samples: 20551680 | consumed tokens: 42089840640 | elapsed time per iteration (s): 4.27 | learning rate: 2.359E-05 | global batch size: 512 | lm loss: 1.913892E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.000 | TFLOPs: 55.93 | 7: iteration 40150/ 44073 | consumed samples: 20556800 | consumed tokens: 42100326400 | elapsed time per iteration (s): 4.24 | learning rate: 2.357E-05 | global batch size: 512 | lm loss: 1.932876E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.721 | TFLOPs: 56.26 | 7: iteration 40160/ 44073 | consumed samples: 20561920 | consumed tokens: 42110812160 | elapsed time per iteration (s): 4.21 | learning rate: 2.355E-05 | global batch size: 512 | lm loss: 1.912046E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.502 | TFLOPs: 56.63 | 7: iteration 40170/ 44073 | consumed samples: 20567040 | consumed tokens: 42121297920 | elapsed time per iteration (s): 4.22 | learning rate: 2.353E-05 | global batch size: 512 | lm loss: 1.907865E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| samples per second: 121.213 | TFLOPs: 56.49 | 7: iteration 40180/ 44073 | consumed samples: 20572160 | consumed tokens: 42131783680 | elapsed time per iteration (s): 4.25 | learning rate: 2.351E-05 | global batch size: 512 | lm loss: 1.931091E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.330 | TFLOPs: 56.08 | 7: iteration 40190/ 44073 | consumed samples: 20577280 | consumed tokens: 42142269440 | elapsed time per iteration (s): 4.26 | learning rate: 2.350E-05 | global batch size: 512 | lm loss: 1.922132E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.302 | TFLOPs: 56.07 | 7: iteration 40200/ 44073 | consumed samples: 20582400 | consumed tokens: 42152755200 | elapsed time per iteration (s): 4.20 | learning rate: 2.348E-05 | global batch size: 512 | lm loss: 1.909679E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.971 | TFLOPs: 56.84 | 7: iteration 40210/ 44073 | consumed samples: 20587520 | consumed tokens: 42163240960 | elapsed time per iteration (s): 4.19 | learning rate: 2.346E-05 | global batch size: 512 | lm loss: 1.922025E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.064 | TFLOPs: 56.89 | 7: iteration 40220/ 44073 | consumed samples: 20592640 | consumed tokens: 42173726720 | elapsed time per iteration (s): 4.15 | learning rate: 2.344E-05 | global batch size: 512 | lm loss: 1.930050E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.498 | TFLOPs: 57.56 | 7: iteration 40230/ 44073 | consumed samples: 20597760 | consumed tokens: 42184212480 | elapsed time per iteration (s): 4.21 | learning rate: 2.342E-05 | global batch size: 512 | lm loss: 1.902270E+00 | 
grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.753 | TFLOPs: 56.74 | 7: iteration 40240/ 44073 | consumed samples: 20602880 | consumed tokens: 42194698240 | elapsed time per iteration (s): 4.23 | learning rate: 2.341E-05 | global batch size: 512 | lm loss: 1.908731E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.062 | TFLOPs: 56.42 | 7: iteration 40250/ 44073 | consumed samples: 20608000 | consumed tokens: 42205184000 | elapsed time per iteration (s): 4.20 | learning rate: 2.339E-05 | global batch size: 512 | lm loss: 1.905111E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.996 | TFLOPs: 56.86 | 7: iteration 40260/ 44073 | consumed samples: 20613120 | consumed tokens: 42215669760 | elapsed time per iteration (s): 4.20 | learning rate: 2.337E-05 | global batch size: 512 | lm loss: 1.934224E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.779 | TFLOPs: 56.76 | 7: iteration 40270/ 44073 | consumed samples: 20618240 | consumed tokens: 42226155520 | elapsed time per iteration (s): 4.24 | learning rate: 2.335E-05 | global batch size: 512 | lm loss: 1.912490E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.775 | TFLOPs: 56.29 | 7: iteration 40280/ 44073 | consumed samples: 20623360 | consumed tokens: 42236641280 | elapsed time per iteration (s): 4.22 | learning rate: 2.334E-05 | global batch size: 512 | lm loss: 1.930848E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.216 | TFLOPs: 56.49 | 7: iteration 40290/ 44073 | consumed samples: 20628480 | consumed tokens: 42247127040 | elapsed time per 
iteration (s): 4.16 | learning rate: 2.332E-05 | global batch size: 512 | lm loss: 1.935540E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.008 | TFLOPs: 57.33 | 7: iteration 40300/ 44073 | consumed samples: 20633600 | consumed tokens: 42257612800 | elapsed time per iteration (s): 4.17 | learning rate: 2.330E-05 | global batch size: 512 | lm loss: 1.936311E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.798 | TFLOPs: 57.23 | 7: iteration 40310/ 44073 | consumed samples: 20638720 | consumed tokens: 42268098560 | elapsed time per iteration (s): 4.18 | learning rate: 2.328E-05 | global batch size: 512 | lm loss: 1.917695E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.600 | TFLOPs: 57.14 | 7: iteration 40320/ 44073 | consumed samples: 20643840 | consumed tokens: 42278584320 | elapsed time per iteration (s): 4.21 | learning rate: 2.327E-05 | global batch size: 512 | lm loss: 1.928238E+00 | grad norm: 0.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.732 | TFLOPs: 56.73 | 7: iteration 40330/ 44073 | consumed samples: 20648960 | consumed tokens: 42289070080 | elapsed time per iteration (s): 4.23 | learning rate: 2.325E-05 | global batch size: 512 | lm loss: 1.928887E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.134 | TFLOPs: 56.45 | 7: iteration 40340/ 44073 | consumed samples: 20654080 | consumed tokens: 42299555840 | elapsed time per iteration (s): 4.20 | learning rate: 2.323E-05 | global batch size: 512 | lm loss: 1.924447E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.045 | TFLOPs: 56.88 | 7: 
iteration 40350/ 44073 | consumed samples: 20659200 | consumed tokens: 42310041600 | elapsed time per iteration (s): 4.20 | learning rate: 2.322E-05 | global batch size: 512 | lm loss: 1.947551E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.780 | TFLOPs: 56.76 | 7: iteration 40360/ 44073 | consumed samples: 20664320 | consumed tokens: 42320527360 | elapsed time per iteration (s): 4.19 | learning rate: 2.320E-05 | global batch size: 512 | lm loss: 1.929667E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.236 | TFLOPs: 56.97 | 7: iteration 40370/ 44073 | consumed samples: 20669440 | consumed tokens: 42331013120 | elapsed time per iteration (s): 4.24 | learning rate: 2.318E-05 | global batch size: 512 | lm loss: 1.935144E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.784 | TFLOPs: 56.29 | 7: iteration 40380/ 44073 | consumed samples: 20674560 | consumed tokens: 42341498880 | elapsed time per iteration (s): 4.21 | learning rate: 2.316E-05 | global batch size: 512 | lm loss: 1.922317E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.583 | TFLOPs: 56.66 | 7: iteration 40390/ 44073 | consumed samples: 20679680 | consumed tokens: 42351984640 | elapsed time per iteration (s): 4.22 | learning rate: 2.315E-05 | global batch size: 512 | lm loss: 1.940937E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.318 | TFLOPs: 56.54 | 7: iteration 40400/ 44073 | consumed samples: 20684800 | consumed tokens: 42362470400 | elapsed time per iteration (s): 4.20 | learning rate: 2.313E-05 | global batch size: 512 | lm loss: 1.898982E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | samples per second: 121.819 | TFLOPs: 56.77 | 7: iteration 40410/ 44073 | consumed samples: 20689920 | consumed tokens: 42372956160 | elapsed time per iteration (s): 4.25 | learning rate: 2.311E-05 | global batch size: 512 | lm loss: 1.908661E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.520 | TFLOPs: 56.17 | 7: iteration 40420/ 44073 | consumed samples: 20695040 | consumed tokens: 42383441920 | elapsed time per iteration (s): 4.26 | learning rate: 2.310E-05 | global batch size: 512 | lm loss: 1.903553E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.097 | TFLOPs: 55.97 | 7: iteration 40430/ 44073 | consumed samples: 20700160 | consumed tokens: 42393927680 | elapsed time per iteration (s): 4.19 | learning rate: 2.308E-05 | global batch size: 512 | lm loss: 1.933197E+00 | grad norm: 0.116 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.105 | TFLOPs: 56.91 | 7: iteration 40440/ 44073 | consumed samples: 20705280 | consumed tokens: 42404413440 | elapsed time per iteration (s): 4.19 | learning rate: 2.306E-05 | global batch size: 512 | lm loss: 1.926963E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.272 | TFLOPs: 56.98 | 7: iteration 40450/ 44073 | consumed samples: 20710400 | consumed tokens: 42414899200 | elapsed time per iteration (s): 4.20 | learning rate: 2.305E-05 | global batch size: 512 | lm loss: 1.914668E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.867 | TFLOPs: 56.80 | 7: iteration 40460/ 44073 | consumed samples: 20715520 | consumed tokens: 42425384960 | elapsed time per iteration (s): 4.23 | learning rate: 2.303E-05 | global 
batch size: 512 | lm loss: 1.926639E+00 | grad norm: 0.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.925 | TFLOPs: 56.36 | 7: iteration 40470/ 44073 | consumed samples: 20720640 | consumed tokens: 42435870720 | elapsed time per iteration (s): 4.26 | learning rate: 2.301E-05 | global batch size: 512 | lm loss: 1.927629E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.077 | TFLOPs: 55.96 | 7: iteration 40480/ 44073 | consumed samples: 20725760 | consumed tokens: 42446356480 | elapsed time per iteration (s): 4.23 | learning rate: 2.300E-05 | global batch size: 512 | lm loss: 1.914387E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.903 | TFLOPs: 56.35 | 7: iteration 40490/ 44073 | consumed samples: 20730880 | consumed tokens: 42456842240 | elapsed time per iteration (s): 4.31 | learning rate: 2.298E-05 | global batch size: 512 | lm loss: 1.907886E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.726 | TFLOPs: 55.33 | 7: iteration 40500/ 44073 | consumed samples: 20736000 | consumed tokens: 42467328000 | elapsed time per iteration (s): 4.21 | learning rate: 2.296E-05 | global batch size: 512 | lm loss: 1.934468E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.554 | TFLOPs: 56.65 | 7: iteration 40510/ 44073 | consumed samples: 20741120 | consumed tokens: 42477813760 | elapsed time per iteration (s): 4.20 | learning rate: 2.295E-05 | global batch size: 512 | lm loss: 1.923375E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.978 | TFLOPs: 56.85 | 7: iteration 40520/ 44073 | consumed samples: 20746240 | consumed 
tokens: 42488299520 | elapsed time per iteration (s): 4.24 | learning rate: 2.293E-05 | global batch size: 512 | lm loss: 1.909393E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.679 | TFLOPs: 56.24 | 7: iteration 40530/ 44073 | consumed samples: 20751360 | consumed tokens: 42498785280 | elapsed time per iteration (s): 4.20 | learning rate: 2.291E-05 | global batch size: 512 | lm loss: 1.898680E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.966 | TFLOPs: 56.84 | 7: iteration 40540/ 44073 | consumed samples: 20756480 | consumed tokens: 42509271040 | elapsed time per iteration (s): 4.23 | learning rate: 2.290E-05 | global batch size: 512 | lm loss: 1.929980E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.955 | TFLOPs: 56.37 | 7: iteration 40550/ 44073 | consumed samples: 20761600 | consumed tokens: 42519756800 | elapsed time per iteration (s): 4.20 | learning rate: 2.288E-05 | global batch size: 512 | lm loss: 1.920177E+00 | grad norm: 0.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.765 | TFLOPs: 56.75 | 7: iteration 40560/ 44073 | consumed samples: 20766720 | consumed tokens: 42530242560 | elapsed time per iteration (s): 4.20 | learning rate: 2.286E-05 | global batch size: 512 | lm loss: 1.902530E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.868 | TFLOPs: 56.80 | 7: iteration 40570/ 44073 | consumed samples: 20771840 | consumed tokens: 42540728320 | elapsed time per iteration (s): 4.18 | learning rate: 2.285E-05 | global batch size: 512 | lm loss: 1.936572E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per 
second: 122.410 | TFLOPs: 57.05 | 7: iteration 40580/ 44073 | consumed samples: 20776960 | consumed tokens: 42551214080 | elapsed time per iteration (s): 4.19 | learning rate: 2.283E-05 | global batch size: 512 | lm loss: 1.929663E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.185 | TFLOPs: 56.94 | 7: iteration 40590/ 44073 | consumed samples: 20782080 | consumed tokens: 42561699840 | elapsed time per iteration (s): 4.20 | learning rate: 2.282E-05 | global batch size: 512 | lm loss: 1.907334E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.885 | TFLOPs: 56.80 | 7: iteration 40600/ 44073 | consumed samples: 20787200 | consumed tokens: 42572185600 | elapsed time per iteration (s): 4.18 | learning rate: 2.280E-05 | global batch size: 512 | lm loss: 1.923109E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.429 | TFLOPs: 57.06 | 7: iteration 40610/ 44073 | consumed samples: 20792320 | consumed tokens: 42582671360 | elapsed time per iteration (s): 4.21 | learning rate: 2.278E-05 | global batch size: 512 | lm loss: 1.931099E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.752 | TFLOPs: 56.74 | 7: iteration 40620/ 44073 | consumed samples: 20797440 | consumed tokens: 42593157120 | elapsed time per iteration (s): 4.21 | learning rate: 2.277E-05 | global batch size: 512 | lm loss: 1.923664E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.574 | TFLOPs: 56.66 | 7: iteration 40630/ 44073 | consumed samples: 20802560 | consumed tokens: 42603642880 | elapsed time per iteration (s): 4.21 | learning rate: 2.275E-05 | global batch size: 512 | lm loss: 1.921152E+00 | grad norm: 0.132 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.604 | TFLOPs: 56.67 | 7: iteration 40640/ 44073 | consumed samples: 20807680 | consumed tokens: 42614128640 | elapsed time per iteration (s): 4.22 | learning rate: 2.274E-05 | global batch size: 512 | lm loss: 1.934908E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.406 | TFLOPs: 56.58 | 7: iteration 40650/ 44073 | consumed samples: 20812800 | consumed tokens: 42624614400 | elapsed time per iteration (s): 4.21 | learning rate: 2.272E-05 | global batch size: 512 | lm loss: 1.909190E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.608 | TFLOPs: 56.68 | 7: iteration 40660/ 44073 | consumed samples: 20817920 | consumed tokens: 42635100160 | elapsed time per iteration (s): 4.23 | learning rate: 2.270E-05 | global batch size: 512 | lm loss: 1.926254E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.140 | TFLOPs: 56.46 | 7: iteration 40670/ 44073 | consumed samples: 20823040 | consumed tokens: 42645585920 | elapsed time per iteration (s): 4.18 | learning rate: 2.269E-05 | global batch size: 512 | lm loss: 1.927401E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.427 | TFLOPs: 57.06 | 7: iteration 40680/ 44073 | consumed samples: 20828160 | consumed tokens: 42656071680 | elapsed time per iteration (s): 4.24 | learning rate: 2.267E-05 | global batch size: 512 | lm loss: 1.898869E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.855 | TFLOPs: 56.32 | 7: iteration 40690/ 44073 | consumed samples: 20833280 | consumed tokens: 42666557440 | elapsed time per iteration (s): 4.36 
| learning rate: 2.266E-05 | global batch size: 512 | lm loss: 1.917680E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 117.506 | TFLOPs: 54.76 | 7: iteration 40700/ 44073 | consumed samples: 20838400 | consumed tokens: 42677043200 | elapsed time per iteration (s): 4.21 | learning rate: 2.264E-05 | global batch size: 512 | lm loss: 1.915834E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.491 | TFLOPs: 56.62 | 7: iteration 40710/ 44073 | consumed samples: 20843520 | consumed tokens: 42687528960 | elapsed time per iteration (s): 4.21 | learning rate: 2.263E-05 | global batch size: 512 | lm loss: 1.910877E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.515 | TFLOPs: 56.63 | 7: iteration 40720/ 44073 | consumed samples: 20848640 | consumed tokens: 42698014720 | elapsed time per iteration (s): 4.21 | learning rate: 2.261E-05 | global batch size: 512 | lm loss: 1.929227E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.669 | TFLOPs: 56.70 | 7: iteration 40730/ 44073 | consumed samples: 20853760 | consumed tokens: 42708500480 | elapsed time per iteration (s): 4.21 | learning rate: 2.260E-05 | global batch size: 512 | lm loss: 1.896841E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.616 | TFLOPs: 56.68 | 7: iteration 40740/ 44073 | consumed samples: 20858880 | consumed tokens: 42718986240 | elapsed time per iteration (s): 4.20 | learning rate: 2.258E-05 | global batch size: 512 | lm loss: 1.927497E+00 | grad norm: 0.148 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.980 | TFLOPs: 56.85 | 7: iteration 40750/ 44073 | 
consumed samples: 20864000 | consumed tokens: 42729472000 | elapsed time per iteration (s): 4.21 | learning rate: 2.256E-05 | global batch size: 512 | lm loss: 1.928317E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.599 | TFLOPs: 56.67 | 7: iteration 40760/ 44073 | consumed samples: 20869120 | consumed tokens: 42739957760 | elapsed time per iteration (s): 4.18 | learning rate: 2.255E-05 | global batch size: 512 | lm loss: 1.925746E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.526 | TFLOPs: 57.10 | 7: iteration 40770/ 44073 | consumed samples: 20874240 | consumed tokens: 42750443520 | elapsed time per iteration (s): 4.19 | learning rate: 2.253E-05 | global batch size: 512 | lm loss: 1.925806E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.076 | TFLOPs: 56.89 | 7: iteration 40780/ 44073 | consumed samples: 20879360 | consumed tokens: 42760929280 | elapsed time per iteration (s): 4.20 | learning rate: 2.252E-05 | global batch size: 512 | lm loss: 1.927237E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.977 | TFLOPs: 56.85 | 7: iteration 40790/ 44073 | consumed samples: 20884480 | consumed tokens: 42771415040 | elapsed time per iteration (s): 4.27 | learning rate: 2.250E-05 | global batch size: 512 | lm loss: 1.910635E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.818 | TFLOPs: 55.84 | 7: iteration 40800/ 44073 | consumed samples: 20889600 | consumed tokens: 42781900800 | elapsed time per iteration (s): 4.17 | learning rate: 2.249E-05 | global batch size: 512 | lm loss: 1.924228E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of 
nan iterations: 0 | samples per second: 122.653 | TFLOPs: 57.16 | 7: iteration 40810/ 44073 | consumed samples: 20894720 | consumed tokens: 42792386560 | elapsed time per iteration (s): 4.17 | learning rate: 2.247E-05 | global batch size: 512 | lm loss: 1.914453E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.899 | TFLOPs: 57.28 | 7: iteration 40820/ 44073 | consumed samples: 20899840 | consumed tokens: 42802872320 | elapsed time per iteration (s): 4.25 | learning rate: 2.246E-05 | global batch size: 512 | lm loss: 1.920755E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.581 | TFLOPs: 56.20 | 7: iteration 40830/ 44073 | consumed samples: 20904960 | consumed tokens: 42813358080 | elapsed time per iteration (s): 4.24 | learning rate: 2.244E-05 | global batch size: 512 | lm loss: 1.916104E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.653 | TFLOPs: 56.23 | 7: iteration 40840/ 44073 | consumed samples: 20910080 | consumed tokens: 42823843840 | elapsed time per iteration (s): 4.22 | learning rate: 2.243E-05 | global batch size: 512 | lm loss: 1.933370E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.428 | TFLOPs: 56.59 | 7: iteration 40850/ 44073 | consumed samples: 20915200 | consumed tokens: 42834329600 | elapsed time per iteration (s): 4.23 | learning rate: 2.241E-05 | global batch size: 512 | lm loss: 1.913699E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.951 | TFLOPs: 56.37 | 7: iteration 40860/ 44073 | consumed samples: 20920320 | consumed tokens: 42844815360 | elapsed time per iteration (s): 4.42 | learning rate: 2.240E-05 | global batch size: 512 | lm loss: 
1.939039E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 115.863 | TFLOPs: 54.00 | 7: iteration 40870/ 44073 | consumed samples: 20925440 | consumed tokens: 42855301120 | elapsed time per iteration (s): 4.20 | learning rate: 2.238E-05 | global batch size: 512 | lm loss: 1.929804E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.828 | TFLOPs: 56.78 | 7: iteration 40880/ 44073 | consumed samples: 20930560 | consumed tokens: 42865786880 | elapsed time per iteration (s): 4.22 | learning rate: 2.237E-05 | global batch size: 512 | lm loss: 1.942973E+00 | grad norm: 0.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.430 | TFLOPs: 56.59 | 7: iteration 40890/ 44073 | consumed samples: 20935680 | consumed tokens: 42876272640 | elapsed time per iteration (s): 4.20 | learning rate: 2.235E-05 | global batch size: 512 | lm loss: 1.932936E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.929 | TFLOPs: 56.83 | 7: iteration 40900/ 44073 | consumed samples: 20940800 | consumed tokens: 42886758400 | elapsed time per iteration (s): 4.20 | learning rate: 2.234E-05 | global batch size: 512 | lm loss: 1.917991E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.985 | TFLOPs: 56.85 | 7: iteration 40910/ 44073 | consumed samples: 20945920 | consumed tokens: 42897244160 | elapsed time per iteration (s): 4.21 | learning rate: 2.232E-05 | global batch size: 512 | lm loss: 1.917396E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.632 | TFLOPs: 56.69 | 7: iteration 40920/ 44073 | consumed samples: 20951040 | consumed tokens: 42907729920 | 
elapsed time per iteration (s): 4.21 | learning rate: 2.231E-05 | global batch size: 512 | lm loss: 1.925701E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.677 | TFLOPs: 56.71 | 7: iteration 40930/ 44073 | consumed samples: 20956160 | consumed tokens: 42918215680 | elapsed time per iteration (s): 4.29 | learning rate: 2.230E-05 | global batch size: 512 | lm loss: 1.915659E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.294 | TFLOPs: 55.60 | 7: iteration 40940/ 44073 | consumed samples: 20961280 | consumed tokens: 42928701440 | elapsed time per iteration (s): 4.30 | learning rate: 2.228E-05 | global batch size: 512 | lm loss: 1.915567E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.993 | TFLOPs: 55.46 | 7: iteration 40950/ 44073 | consumed samples: 20966400 | consumed tokens: 42939187200 | elapsed time per iteration (s): 4.20 | learning rate: 2.227E-05 | global batch size: 512 | lm loss: 1.915752E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.934 | TFLOPs: 56.83 | 7: iteration 40960/ 44073 | consumed samples: 20971520 | consumed tokens: 42949672960 | elapsed time per iteration (s): 4.20 | learning rate: 2.225E-05 | global batch size: 512 | lm loss: 1.924018E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.851 | TFLOPs: 56.79 | 7: iteration 40970/ 44073 | consumed samples: 20976640 | consumed tokens: 42960158720 | elapsed time per iteration (s): 4.23 | learning rate: 2.224E-05 | global batch size: 512 | lm loss: 1.914812E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.041 | TFLOPs: 
56.41 |
7: iteration 40980/ 44073 | consumed samples: 20981760 | consumed tokens: 42970644480 | elapsed time per iteration (s): 4.24 | learning rate: 2.222E-05 | global batch size: 512 | lm loss: 1.919233E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.687 | TFLOPs: 56.25 |
7: iteration 40990/ 44073 | consumed samples: 20986880 | consumed tokens: 42981130240 | elapsed time per iteration (s): 4.19 | learning rate: 2.221E-05 | global batch size: 512 | lm loss: 1.915971E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.165 | TFLOPs: 56.93 |
7: iteration 41000/ 44073 | consumed samples: 20992000 | consumed tokens: 42991616000 | elapsed time per iteration (s): 4.24 | learning rate: 2.219E-05 | global batch size: 512 | lm loss: 1.907164E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.830 | TFLOPs: 56.31 |
7: -------------------------------------------------------------------------------------------
7: valid loss at iteration 41000 | lm loss value: 1.917019E+00 | lm loss PPL: 6.800655E+00 |
7: -------------------------------------------------------------------------------------------
0: saving checkpoint at iteration 41000 to checkpoints_2b2
0: [2022-11-28 01:23:34,729] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step41000 is begin to save!
0: [2022-11-28 01:23:34,759] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_01-model_00-model_states.pt...
0: [2022-11-28 01:23:35,145] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_01-model_00-model_states.pt.
0: [2022-11-28 01:23:35,145] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_03-model_00-model_states.pt...
0: [2022-11-28 01:23:35,286] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_03-model_00-model_states.pt.
0: [2022-11-28 01:23:35,286] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_04-model_00-model_states.pt...
0: [2022-11-28 01:23:35,425] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_04-model_00-model_states.pt.
0: [2022-11-28 01:23:35,425] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_05-model_00-model_states.pt...
0: [2022-11-28 01:23:35,569] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_05-model_00-model_states.pt.
0: [2022-11-28 01:23:35,570] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_06-model_00-model_states.pt...
0: [2022-11-28 01:23:35,710] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_06-model_00-model_states.pt.
0: [2022-11-28 01:23:35,710] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_07-model_00-model_states.pt...
0: [2022-11-28 01:23:35,859] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_07-model_00-model_states.pt.
0: [2022-11-28 01:23:35,860] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_08-model_00-model_states.pt...
0: [2022-11-28 01:23:36,003] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_08-model_00-model_states.pt.
0: [2022-11-28 01:23:36,004] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_09-model_00-model_states.pt...
0: [2022-11-28 01:23:36,146] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_09-model_00-model_states.pt.
0: [2022-11-28 01:23:36,147] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_10-model_00-model_states.pt...
0: [2022-11-28 01:23:36,296] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_10-model_00-model_states.pt.
0: [2022-11-28 01:23:36,296] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_11-model_00-model_states.pt...
0: [2022-11-28 01:23:36,438] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_11-model_00-model_states.pt.
0: [2022-11-28 01:23:36,439] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_12-model_00-model_states.pt...
0: [2022-11-28 01:23:36,576] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_12-model_00-model_states.pt.
0: [2022-11-28 01:23:36,576] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_13-model_00-model_states.pt...
0: [2022-11-28 01:23:36,716] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_13-model_00-model_states.pt.
0: [2022-11-28 01:23:36,716] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_14-model_00-model_states.pt...
0: [2022-11-28 01:23:36,857] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_14-model_00-model_states.pt.
0: [2022-11-28 01:23:36,857] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_15-model_00-model_states.pt...
0: [2022-11-28 01:23:37,003] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_15-model_00-model_states.pt. 0: [2022-11-28 01:23:37,003] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_16-model_00-model_states.pt... 0: [2022-11-28 01:23:37,139] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_16-model_00-model_states.pt. 0: [2022-11-28 01:23:37,140] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_17-model_00-model_states.pt... 0: [2022-11-28 01:23:37,279] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_17-model_00-model_states.pt. 0: [2022-11-28 01:23:37,279] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_18-model_00-model_states.pt... 0: [2022-11-28 01:23:37,413] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_18-model_00-model_states.pt. 0: [2022-11-28 01:23:37,413] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_19-model_00-model_states.pt... 0: [2022-11-28 01:23:37,556] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_19-model_00-model_states.pt. 0: [2022-11-28 01:23:37,556] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_20-model_00-model_states.pt... 0: [2022-11-28 01:23:37,689] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_20-model_00-model_states.pt. 0: [2022-11-28 01:23:37,690] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_21-model_00-model_states.pt... 
0: [2022-11-28 01:23:37,832] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_21-model_00-model_states.pt. 0: [2022-11-28 01:23:37,832] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_22-model_00-model_states.pt... 0: [2022-11-28 01:23:37,967] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_22-model_00-model_states.pt. 0: [2022-11-28 01:23:37,967] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_23-model_00-model_states.pt... 0: [2022-11-28 01:23:38,107] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_23-model_00-model_states.pt. 0: [2022-11-28 01:23:38,108] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_24-model_00-model_states.pt... 0: [2022-11-28 01:23:38,244] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_24-model_00-model_states.pt. 0: [2022-11-28 01:23:38,244] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_25-model_00-model_states.pt... 0: [2022-11-28 01:23:38,379] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_25-model_00-model_states.pt. 0: [2022-11-28 01:23:38,379] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_26-model_00-model_states.pt... 0: [2022-11-28 01:23:38,521] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_26-model_00-model_states.pt. 0: [2022-11-28 01:23:38,522] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_27-model_00-model_states.pt... 
0: [2022-11-28 01:23:38,656] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_27-model_00-model_states.pt. 0: [2022-11-28 01:23:38,657] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_28-model_00-model_states.pt... 0: [2022-11-28 01:23:38,796] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_28-model_00-model_states.pt. 0: [2022-11-28 01:23:38,796] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_29-model_00-model_states.pt... 0: [2022-11-28 01:23:38,934] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_29-model_00-model_states.pt. 0: [2022-11-28 01:23:38,934] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_30-model_00-model_states.pt... 0: [2022-11-28 01:23:39,070] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_30-model_00-model_states.pt. 0: [2022-11-28 01:23:39,070] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_31-model_00-model_states.pt... 0: [2022-11-28 01:23:39,212] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_31-model_00-model_states.pt. 0: [2022-11-28 01:23:39,213] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_32-model_00-model_states.pt... 0: [2022-11-28 01:23:39,349] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_32-model_00-model_states.pt. 0: [2022-11-28 01:23:39,349] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_33-model_00-model_states.pt... 
0: [2022-11-28 01:23:39,485] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_33-model_00-model_states.pt. 0: [2022-11-28 01:23:39,486] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_34-model_00-model_states.pt... 0: [2022-11-28 01:23:39,623] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_34-model_00-model_states.pt. 0: [2022-11-28 01:23:39,623] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/layer_36-model_00-model_states.pt... 0: [2022-11-28 01:23:39,628] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/layer_36-model_00-model_states.pt. 0: [2022-11-28 01:23:39,630] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step41000/mp_rank_00_model_states.pt 0: [2022-11-28 01:23:39,630] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/mp_rank_00_model_states.pt... 0: [2022-11-28 01:23:39,635] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/mp_rank_00_model_states.pt. 0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 
0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 
6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 
1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 
5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 
0: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 4: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 6: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 3: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 2: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 
5: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 1: [2022-11-28 01:23:39,655] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step41000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 0: [2022-11-28 01:23:40,178] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-28 01:23:40,178] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,178] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,199] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-28 01:23:40,199] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,199] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,203] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 
0: [2022-11-28 01:23:40,203] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,203] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,212] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-28 01:23:40,212] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,212] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,214] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 0: [2022-11-28 01:23:40,214] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,214] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,224] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-28 01:23:40,224] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,224] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,231] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 
2: [2022-11-28 01:23:40,232] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,232] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 5: [2022-11-28 01:23:40,231] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,231] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,231] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 5: [2022-11-28 01:23:40,236] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,236] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,236] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,236] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 5: [2022-11-28 01:23:40,236] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,236] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,237] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 
2: [2022-11-28 01:23:40,237] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,237] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,237] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-28 01:23:40,237] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,237] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,256] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,256] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,256] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,257] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,257] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,257] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 
2: [2022-11-28 01:23:40,258] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-28 01:23:40,258] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-28 01:23:40,258] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-28 01:23:40,258] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,258] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,258] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,258] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,258] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,258] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,268] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,271] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 
5: [2022-11-28 01:23:40,271] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,271] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,268] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,268] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,269] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,269] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,269] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,273] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-28 01:23:40,273] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,274] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,277] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 
4: [2022-11-28 01:23:40,277] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,277] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,277] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,277] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,277] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,278] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,278] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,278] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,286] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-28 01:23:40,286] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 2: [2022-11-28 01:23:40,286] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 
2: [2022-11-28 01:23:40,286] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,286] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 2: [2022-11-28 01:23:40,286] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 4: [2022-11-28 01:23:40,290] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-28 01:23:40,290] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-28 01:23:40,290] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 5: [2022-11-28 01:23:40,294] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,294] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,294] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 5: [2022-11-28 01:23:40,306] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,306] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,306] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 
5: [2022-11-28 01:23:40,318] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,318] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,318] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 5: [2022-11-28 01:23:40,318] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-28 01:23:40,318] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-28 01:23:40,318] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 0: [2022-11-28 01:23:40,375] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-28 01:23:40,375] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 
7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 7: [2022-11-28 01:23:40,538] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,538] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,538] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,538] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,538] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,538] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,539] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 
7: [2022-11-28 01:23:40,539] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-28 01:23:40,539] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-28 01:23:40,539] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,539] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,539] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-28 01:23:40,539] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,539] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 7: [2022-11-28 01:23:40,539] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 
6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,542] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,542] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,542] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,542] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,542] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,542] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 
6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,542] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,556] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,557] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,557] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 6: [2022-11-28 01:23:40,557] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-28 01:23:40,557] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-28 01:23:40,557] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 
1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-28 01:23:40,606] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,606] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,606] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,606] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,606] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 
1: [2022-11-28 01:23:40,606] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,606] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,607] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-28 01:23:40,607] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,607] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 1: [2022-11-28 01:23:40,607] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-28 01:23:40,607] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-28 01:23:40,607] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 
3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 
3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step41000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 3: [2022-11-28 01:23:40,977] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step41000 is ready now! 
0: successfully saved checkpoint at iteration 41000 to checkpoints_2b2 7: time (ms) | save-checkpoint: 6278.64 7: iteration 41010/ 44073 | consumed samples: 20997120 | consumed tokens: 43002101760 | elapsed time per iteration (s): 4.99 | learning rate: 2.218E-05 | global batch size: 512 | lm loss: 1.941381E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 102.644 | TFLOPs: 47.84 | 7: iteration 41020/ 44073 | consumed samples: 21002240 | consumed tokens: 43012587520 | elapsed time per iteration (s): 4.15 | learning rate: 2.217E-05 | global batch size: 512 | lm loss: 1.918494E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.437 | TFLOPs: 57.53 | 7: iteration 41030/ 44073 | consumed samples: 21007360 | consumed tokens: 43023073280 | elapsed time per iteration (s): 4.15 | learning rate: 2.215E-05 | global batch size: 512 | lm loss: 1.941606E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.475 | TFLOPs: 57.55 | 7: iteration 41040/ 44073 | consumed samples: 21012480 | consumed tokens: 43033559040 | elapsed time per iteration (s): 4.26 | learning rate: 2.214E-05 | global batch size: 512 | lm loss: 1.912734E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.065 | TFLOPs: 55.96 | 7: iteration 41050/ 44073 | consumed samples: 21017600 | consumed tokens: 43044044800 | elapsed time per iteration (s): 4.16 | learning rate: 2.212E-05 | global batch size: 512 | lm loss: 1.918079E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.968 | TFLOPs: 57.31 | 7: iteration 41060/ 44073 | consumed samples: 21022720 | consumed tokens: 43054530560 | elapsed time per iteration (s): 4.17 | learning rate: 
2.211E-05 | global batch size: 512 | lm loss: 1.920060E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.652 | TFLOPs: 57.16 | 7: iteration 41070/ 44073 | consumed samples: 21027840 | consumed tokens: 43065016320 | elapsed time per iteration (s): 4.21 | learning rate: 2.210E-05 | global batch size: 512 | lm loss: 1.918868E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.719 | TFLOPs: 56.73 | 7: iteration 41080/ 44073 | consumed samples: 21032960 | consumed tokens: 43075502080 | elapsed time per iteration (s): 4.23 | learning rate: 2.208E-05 | global batch size: 512 | lm loss: 1.941554E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.086 | TFLOPs: 56.43 | 7: iteration 41090/ 44073 | consumed samples: 21038080 | consumed tokens: 43085987840 | elapsed time per iteration (s): 4.22 | learning rate: 2.207E-05 | global batch size: 512 | lm loss: 1.932952E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.460 | TFLOPs: 56.61 | 7: iteration 41100/ 44073 | consumed samples: 21043200 | consumed tokens: 43096473600 | elapsed time per iteration (s): 4.20 | learning rate: 2.205E-05 | global batch size: 512 | lm loss: 1.933557E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.873 | TFLOPs: 56.80 | 7: iteration 41110/ 44073 | consumed samples: 21048320 | consumed tokens: 43106959360 | elapsed time per iteration (s): 4.19 | learning rate: 2.204E-05 | global batch size: 512 | lm loss: 1.920932E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.332 | TFLOPs: 57.01 | 7: iteration 41120/ 44073 | consumed samples: 
21053440 | consumed tokens: 43117445120 | elapsed time per iteration (s): 4.20 | learning rate: 2.203E-05 | global batch size: 512 | lm loss: 1.917191E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.977 | TFLOPs: 56.85 | 7: iteration 41130/ 44073 | consumed samples: 21058560 | consumed tokens: 43127930880 | elapsed time per iteration (s): 4.21 | learning rate: 2.201E-05 | global batch size: 512 | lm loss: 1.933830E+00 | grad norm: 0.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.632 | TFLOPs: 56.69 | 7: iteration 41140/ 44073 | consumed samples: 21063680 | consumed tokens: 43138416640 | elapsed time per iteration (s): 4.26 | learning rate: 2.200E-05 | global batch size: 512 | lm loss: 1.940261E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.193 | TFLOPs: 56.02 | 7: iteration 41150/ 44073 | consumed samples: 21068800 | consumed tokens: 43148902400 | elapsed time per iteration (s): 4.17 | learning rate: 2.199E-05 | global batch size: 512 | lm loss: 1.917811E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.811 | TFLOPs: 57.24 | 7: iteration 41160/ 44073 | consumed samples: 21073920 | consumed tokens: 43159388160 | elapsed time per iteration (s): 4.21 | learning rate: 2.197E-05 | global batch size: 512 | lm loss: 1.899308E+00 | grad norm: 0.147 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.722 | TFLOPs: 56.73 | 7: iteration 41170/ 44073 | consumed samples: 21079040 | consumed tokens: 43169873920 | elapsed time per iteration (s): 4.20 | learning rate: 2.196E-05 | global batch size: 512 | lm loss: 1.923462E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| samples per second: 122.001 | TFLOPs: 56.86 | 7: iteration 41180/ 44073 | consumed samples: 21084160 | consumed tokens: 43180359680 | elapsed time per iteration (s): 4.21 | learning rate: 2.195E-05 | global batch size: 512 | lm loss: 1.907561E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.662 | TFLOPs: 56.70 | 7: iteration 41190/ 44073 | consumed samples: 21089280 | consumed tokens: 43190845440 | elapsed time per iteration (s): 4.20 | learning rate: 2.193E-05 | global batch size: 512 | lm loss: 1.914170E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.891 | TFLOPs: 56.81 | 7: iteration 41200/ 44073 | consumed samples: 21094400 | consumed tokens: 43201331200 | elapsed time per iteration (s): 4.27 | learning rate: 2.192E-05 | global batch size: 512 | lm loss: 1.906932E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.829 | TFLOPs: 55.85 | 7: iteration 41210/ 44073 | consumed samples: 21099520 | consumed tokens: 43211816960 | elapsed time per iteration (s): 4.24 | learning rate: 2.191E-05 | global batch size: 512 | lm loss: 1.908564E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.648 | TFLOPs: 56.23 | 7: iteration 41220/ 44073 | consumed samples: 21104640 | consumed tokens: 43222302720 | elapsed time per iteration (s): 4.27 | learning rate: 2.189E-05 | global batch size: 512 | lm loss: 1.922137E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.963 | TFLOPs: 55.91 | 7: iteration 41230/ 44073 | consumed samples: 21109760 | consumed tokens: 43232788480 | elapsed time per iteration (s): 4.21 | learning rate: 2.188E-05 | global batch size: 512 | lm loss: 1.935409E+00 | 
grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.712 | TFLOPs: 56.72 | 7: iteration 41240/ 44073 | consumed samples: 21114880 | consumed tokens: 43243274240 | elapsed time per iteration (s): 4.24 | learning rate: 2.187E-05 | global batch size: 512 | lm loss: 1.899398E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.879 | TFLOPs: 56.34 | 7: iteration 41250/ 44073 | consumed samples: 21120000 | consumed tokens: 43253760000 | elapsed time per iteration (s): 4.18 | learning rate: 2.185E-05 | global batch size: 512 | lm loss: 1.924329E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.540 | TFLOPs: 57.11 | 7: iteration 41260/ 44073 | consumed samples: 21125120 | consumed tokens: 43264245760 | elapsed time per iteration (s): 4.20 | learning rate: 2.184E-05 | global batch size: 512 | lm loss: 1.919070E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.828 | TFLOPs: 56.78 | 7: iteration 41270/ 44073 | consumed samples: 21130240 | consumed tokens: 43274731520 | elapsed time per iteration (s): 4.18 | learning rate: 2.183E-05 | global batch size: 512 | lm loss: 1.923631E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.386 | TFLOPs: 57.04 | 7: iteration 41280/ 44073 | consumed samples: 21135360 | consumed tokens: 43285217280 | elapsed time per iteration (s): 4.22 | learning rate: 2.181E-05 | global batch size: 512 | lm loss: 1.928982E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.233 | TFLOPs: 56.50 | 7: iteration 41290/ 44073 | consumed samples: 21140480 | consumed tokens: 43295703040 | elapsed time per 
iteration (s): 4.23 | learning rate: 2.180E-05 | global batch size: 512 | lm loss: 1.936457E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.101 | TFLOPs: 56.44 | 7: iteration 41300/ 44073 | consumed samples: 21145600 | consumed tokens: 43306188800 | elapsed time per iteration (s): 4.25 | learning rate: 2.179E-05 | global batch size: 512 | lm loss: 1.929680E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.428 | TFLOPs: 56.13 | 7: iteration 41310/ 44073 | consumed samples: 21150720 | consumed tokens: 43316674560 | elapsed time per iteration (s): 4.22 | learning rate: 2.178E-05 | global batch size: 512 | lm loss: 1.918960E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.185 | TFLOPs: 56.48 | 7: iteration 41320/ 44073 | consumed samples: 21155840 | consumed tokens: 43327160320 | elapsed time per iteration (s): 4.19 | learning rate: 2.176E-05 | global batch size: 512 | lm loss: 1.930959E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.081 | TFLOPs: 56.90 | 7: iteration 41330/ 44073 | consumed samples: 21160960 | consumed tokens: 43337646080 | elapsed time per iteration (s): 4.21 | learning rate: 2.175E-05 | global batch size: 512 | lm loss: 1.927836E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.673 | TFLOPs: 56.71 | 7: iteration 41340/ 44073 | consumed samples: 21166080 | consumed tokens: 43348131840 | elapsed time per iteration (s): 4.26 | learning rate: 2.174E-05 | global batch size: 512 | lm loss: 1.909732E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.234 | TFLOPs: 56.04 | 7: 
iteration 41350/ 44073 | consumed samples: 21171200 | consumed tokens: 43358617600 | elapsed time per iteration (s): 4.24 | learning rate: 2.172E-05 | global batch size: 512 | lm loss: 1.925563E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.760 | TFLOPs: 56.28 | 7: iteration 41360/ 44073 | consumed samples: 21176320 | consumed tokens: 43369103360 | elapsed time per iteration (s): 4.18 | learning rate: 2.171E-05 | global batch size: 512 | lm loss: 1.908219E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.557 | TFLOPs: 57.12 | 7: iteration 41370/ 44073 | consumed samples: 21181440 | consumed tokens: 43379589120 | elapsed time per iteration (s): 4.19 | learning rate: 2.170E-05 | global batch size: 512 | lm loss: 1.925726E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.087 | TFLOPs: 56.90 | 7: iteration 41380/ 44073 | consumed samples: 21186560 | consumed tokens: 43390074880 | elapsed time per iteration (s): 4.17 | learning rate: 2.169E-05 | global batch size: 512 | lm loss: 1.941639E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.888 | TFLOPs: 57.27 | 7: iteration 41390/ 44073 | consumed samples: 21191680 | consumed tokens: 43400560640 | elapsed time per iteration (s): 4.22 | learning rate: 2.167E-05 | global batch size: 512 | lm loss: 1.925296E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.402 | TFLOPs: 56.58 | 7: iteration 41400/ 44073 | consumed samples: 21196800 | consumed tokens: 43411046400 | elapsed time per iteration (s): 4.21 | learning rate: 2.166E-05 | global batch size: 512 | lm loss: 1.938153E+00 | grad norm: 0.142 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | samples per second: 121.634 | TFLOPs: 56.69 | 7: iteration 41410/ 44073 | consumed samples: 21201920 | consumed tokens: 43421532160 | elapsed time per iteration (s): 4.23 | learning rate: 2.165E-05 | global batch size: 512 | lm loss: 1.907005E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.026 | TFLOPs: 56.40 | 7: iteration 41420/ 44073 | consumed samples: 21207040 | consumed tokens: 43432017920 | elapsed time per iteration (s): 4.22 | learning rate: 2.164E-05 | global batch size: 512 | lm loss: 1.919101E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.381 | TFLOPs: 56.57 | 7: iteration 41430/ 44073 | consumed samples: 21212160 | consumed tokens: 43442503680 | elapsed time per iteration (s): 4.23 | learning rate: 2.163E-05 | global batch size: 512 | lm loss: 1.920392E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.006 | TFLOPs: 56.40 | 7: iteration 41440/ 44073 | consumed samples: 21217280 | consumed tokens: 43452989440 | elapsed time per iteration (s): 4.20 | learning rate: 2.161E-05 | global batch size: 512 | lm loss: 1.942027E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.920 | TFLOPs: 56.82 | 7: iteration 41450/ 44073 | consumed samples: 21222400 | consumed tokens: 43463475200 | elapsed time per iteration (s): 4.17 | learning rate: 2.160E-05 | global batch size: 512 | lm loss: 1.927256E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.719 | TFLOPs: 57.19 | 7: iteration 41460/ 44073 | consumed samples: 21227520 | consumed tokens: 43473960960 | elapsed time per iteration (s): 4.20 | learning rate: 2.159E-05 | global 
batch size: 512 | lm loss: 1.927670E+00 | grad norm: 0.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.822 | TFLOPs: 56.78 | 7: iteration 41470/ 44073 | consumed samples: 21232640 | consumed tokens: 43484446720 | elapsed time per iteration (s): 4.13 | learning rate: 2.158E-05 | global batch size: 512 | lm loss: 1.916042E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.826 | TFLOPs: 57.71 | 7: iteration 41480/ 44073 | consumed samples: 21237760 | consumed tokens: 43494932480 | elapsed time per iteration (s): 4.19 | learning rate: 2.156E-05 | global batch size: 512 | lm loss: 1.912910E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.161 | TFLOPs: 56.93 | 7: iteration 41490/ 44073 | consumed samples: 21242880 | consumed tokens: 43505418240 | elapsed time per iteration (s): 4.22 | learning rate: 2.155E-05 | global batch size: 512 | lm loss: 1.932802E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.346 | TFLOPs: 56.55 | 7: iteration 41500/ 44073 | consumed samples: 21248000 | consumed tokens: 43515904000 | elapsed time per iteration (s): 4.23 | learning rate: 2.154E-05 | global batch size: 512 | lm loss: 1.929840E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.950 | TFLOPs: 56.37 | 7: iteration 41510/ 44073 | consumed samples: 21253120 | consumed tokens: 43526389760 | elapsed time per iteration (s): 4.20 | learning rate: 2.153E-05 | global batch size: 512 | lm loss: 1.909773E+00 | grad norm: 0.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.048 | TFLOPs: 56.88 | 7: iteration 41520/ 44073 | consumed samples: 21258240 | consumed 
tokens: 43536875520 | elapsed time per iteration (s): 4.20 | learning rate: 2.152E-05 | global batch size: 512 | lm loss: 1.904548E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.818 | TFLOPs: 56.77 | 7: iteration 41530/ 44073 | consumed samples: 21263360 | consumed tokens: 43547361280 | elapsed time per iteration (s): 4.25 | learning rate: 2.151E-05 | global batch size: 512 | lm loss: 1.930476E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.588 | TFLOPs: 56.20 | 7: iteration 41540/ 44073 | consumed samples: 21268480 | consumed tokens: 43557847040 | elapsed time per iteration (s): 4.19 | learning rate: 2.149E-05 | global batch size: 512 | lm loss: 1.936941E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.195 | TFLOPs: 56.95 | 7: iteration 41550/ 44073 | consumed samples: 21273600 | consumed tokens: 43568332800 | elapsed time per iteration (s): 4.17 | learning rate: 2.148E-05 | global batch size: 512 | lm loss: 1.918883E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.924 | TFLOPs: 57.29 | 7: iteration 41560/ 44073 | consumed samples: 21278720 | consumed tokens: 43578818560 | elapsed time per iteration (s): 4.24 | learning rate: 2.147E-05 | global batch size: 512 | lm loss: 1.917175E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.785 | TFLOPs: 56.29 | 7: iteration 41570/ 44073 | consumed samples: 21283840 | consumed tokens: 43589304320 | elapsed time per iteration (s): 4.57 | learning rate: 2.146E-05 | global batch size: 512 | lm loss: 1.906841E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per 
second: 112.146 | TFLOPs: 52.27 | 7: iteration 41580/ 44073 | consumed samples: 21288960 | consumed tokens: 43599790080 | elapsed time per iteration (s): 4.18 | learning rate: 2.145E-05 | global batch size: 512 | lm loss: 1.915516E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.633 | TFLOPs: 57.15 | 7: iteration 41590/ 44073 | consumed samples: 21294080 | consumed tokens: 43610275840 | elapsed time per iteration (s): 4.20 | learning rate: 2.144E-05 | global batch size: 512 | lm loss: 1.949363E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.770 | TFLOPs: 56.75 | 7: iteration 41600/ 44073 | consumed samples: 21299200 | consumed tokens: 43620761600 | elapsed time per iteration (s): 4.23 | learning rate: 2.142E-05 | global batch size: 512 | lm loss: 1.898482E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.068 | TFLOPs: 56.42 | 7: iteration 41610/ 44073 | consumed samples: 21304320 | consumed tokens: 43631247360 | elapsed time per iteration (s): 4.28 | learning rate: 2.141E-05 | global batch size: 512 | lm loss: 1.918267E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.519 | TFLOPs: 55.70 | 7: iteration 41620/ 44073 | consumed samples: 21309440 | consumed tokens: 43641733120 | elapsed time per iteration (s): 4.27 | learning rate: 2.140E-05 | global batch size: 512 | lm loss: 1.921727E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.833 | TFLOPs: 55.85 | 7: iteration 41630/ 44073 | consumed samples: 21314560 | consumed tokens: 43652218880 | elapsed time per iteration (s): 4.20 | learning rate: 2.139E-05 | global batch size: 512 | lm loss: 1.925830E+00 | grad norm: 0.127 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.802 | TFLOPs: 56.77 | 7: iteration 41640/ 44073 | consumed samples: 21319680 | consumed tokens: 43662704640 | elapsed time per iteration (s): 4.20 | learning rate: 2.138E-05 | global batch size: 512 | lm loss: 1.929045E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.944 | TFLOPs: 56.83 | 7: iteration 41650/ 44073 | consumed samples: 21324800 | consumed tokens: 43673190400 | elapsed time per iteration (s): 4.23 | learning rate: 2.137E-05 | global batch size: 512 | lm loss: 1.936910E+00 | grad norm: 0.134 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.082 | TFLOPs: 56.43 | 7: iteration 41660/ 44073 | consumed samples: 21329920 | consumed tokens: 43683676160 | elapsed time per iteration (s): 4.21 | learning rate: 2.136E-05 | global batch size: 512 | lm loss: 1.915724E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.608 | TFLOPs: 56.68 | 7: iteration 41670/ 44073 | consumed samples: 21335040 | consumed tokens: 43694161920 | elapsed time per iteration (s): 4.18 | learning rate: 2.134E-05 | global batch size: 512 | lm loss: 1.917155E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.351 | TFLOPs: 57.02 | 7: iteration 41680/ 44073 | consumed samples: 21340160 | consumed tokens: 43704647680 | elapsed time per iteration (s): 4.25 | learning rate: 2.133E-05 | global batch size: 512 | lm loss: 1.919868E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.610 | TFLOPs: 56.21 | 7: iteration 41690/ 44073 | consumed samples: 21345280 | consumed tokens: 43715133440 | elapsed time per iteration (s): 4.22 
| learning rate: 2.132E-05 | global batch size: 512 | lm loss: 1.929904E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.303 | TFLOPs: 56.53 | 7: iteration 41700/ 44073 | consumed samples: 21350400 | consumed tokens: 43725619200 | elapsed time per iteration (s): 4.24 | learning rate: 2.131E-05 | global batch size: 512 | lm loss: 1.928405E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.779 | TFLOPs: 56.29 | 7: iteration 41710/ 44073 | consumed samples: 21355520 | consumed tokens: 43736104960 | elapsed time per iteration (s): 4.20 | learning rate: 2.130E-05 | global batch size: 512 | lm loss: 1.916163E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.800 | TFLOPs: 56.77 | 7: iteration 41720/ 44073 | consumed samples: 21360640 | consumed tokens: 43746590720 | elapsed time per iteration (s): 4.18 | learning rate: 2.129E-05 | global batch size: 512 | lm loss: 1.927440E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.480 | TFLOPs: 57.08 | 7: iteration 41730/ 44073 | consumed samples: 21365760 | consumed tokens: 43757076480 | elapsed time per iteration (s): 4.18 | learning rate: 2.128E-05 | global batch size: 512 | lm loss: 1.912388E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.499 | TFLOPs: 57.09 | 7: iteration 41740/ 44073 | consumed samples: 21370880 | consumed tokens: 43767562240 | elapsed time per iteration (s): 4.23 | learning rate: 2.127E-05 | global batch size: 512 | lm loss: 1.913725E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.064 | TFLOPs: 56.42 | 7: iteration 41750/ 44073 | 
consumed samples: 21376000 | consumed tokens: 43778048000 | elapsed time per iteration (s): 4.21 | learning rate: 2.126E-05 | global batch size: 512 | lm loss: 1.907619E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.523 | TFLOPs: 56.64 | 7: iteration 41760/ 44073 | consumed samples: 21381120 | consumed tokens: 43788533760 | elapsed time per iteration (s): 4.21 | learning rate: 2.125E-05 | global batch size: 512 | lm loss: 1.900363E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.605 | TFLOPs: 56.67 | 7: iteration 41770/ 44073 | consumed samples: 21386240 | consumed tokens: 43799019520 | elapsed time per iteration (s): 4.24 | learning rate: 2.124E-05 | global batch size: 512 | lm loss: 1.920601E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.647 | TFLOPs: 56.23 | 7: iteration 41780/ 44073 | consumed samples: 21391360 | consumed tokens: 43809505280 | elapsed time per iteration (s): 4.24 | learning rate: 2.122E-05 | global batch size: 512 | lm loss: 1.909709E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.763 | TFLOPs: 56.28 | 7: iteration 41790/ 44073 | consumed samples: 21396480 | consumed tokens: 43819991040 | elapsed time per iteration (s): 4.15 | learning rate: 2.121E-05 | global batch size: 512 | lm loss: 1.932841E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.441 | TFLOPs: 57.53 | 7: iteration 41800/ 44073 | consumed samples: 21401600 | consumed tokens: 43830476800 | elapsed time per iteration (s): 4.20 | learning rate: 2.120E-05 | global batch size: 512 | lm loss: 1.927528E+00 | grad norm: 0.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of 
nan iterations: 0 | samples per second: 121.787 | TFLOPs: 56.76 | 7: iteration 41810/ 44073 | consumed samples: 21406720 | consumed tokens: 43840962560 | elapsed time per iteration (s): 4.21 | learning rate: 2.119E-05 | global batch size: 512 | lm loss: 1.916578E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.527 | TFLOPs: 56.64 | 7: iteration 41820/ 44073 | consumed samples: 21411840 | consumed tokens: 43851448320 | elapsed time per iteration (s): 4.24 | learning rate: 2.118E-05 | global batch size: 512 | lm loss: 1.912561E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.757 | TFLOPs: 56.28 | 7: iteration 41830/ 44073 | consumed samples: 21416960 | consumed tokens: 43861934080 | elapsed time per iteration (s): 4.19 | learning rate: 2.117E-05 | global batch size: 512 | lm loss: 1.919127E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.336 | TFLOPs: 57.01 | 7: iteration 41840/ 44073 | consumed samples: 21422080 | consumed tokens: 43872419840 | elapsed time per iteration (s): 4.22 | learning rate: 2.116E-05 | global batch size: 512 | lm loss: 1.911238E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.320 | TFLOPs: 56.54 | 7: iteration 41850/ 44073 | consumed samples: 21427200 | consumed tokens: 43882905600 | elapsed time per iteration (s): 4.22 | learning rate: 2.115E-05 | global batch size: 512 | lm loss: 1.908284E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.300 | TFLOPs: 56.53 | 7: iteration 41860/ 44073 | consumed samples: 21432320 | consumed tokens: 43893391360 | elapsed time per iteration (s): 4.29 | learning rate: 2.114E-05 | global batch size: 512 | lm loss: 
1.914225E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.256 | TFLOPs: 55.58 | 7: iteration 41870/ 44073 | consumed samples: 21437440 | consumed tokens: 43903877120 | elapsed time per iteration (s): 4.17 | learning rate: 2.113E-05 | global batch size: 512 | lm loss: 1.907816E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.840 | TFLOPs: 57.25 | 7: iteration 41880/ 44073 | consumed samples: 21442560 | consumed tokens: 43914362880 | elapsed time per iteration (s): 4.21 | learning rate: 2.112E-05 | global batch size: 512 | lm loss: 1.919938E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.513 | TFLOPs: 56.63 | 7: iteration 41890/ 44073 | consumed samples: 21447680 | consumed tokens: 43924848640 | elapsed time per iteration (s): 4.21 | learning rate: 2.111E-05 | global batch size: 512 | lm loss: 1.930108E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.501 | TFLOPs: 56.63 | 7: iteration 41900/ 44073 | consumed samples: 21452800 | consumed tokens: 43935334400 | elapsed time per iteration (s): 4.18 | learning rate: 2.110E-05 | global batch size: 512 | lm loss: 1.889799E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.629 | TFLOPs: 57.15 | 7: iteration 41910/ 44073 | consumed samples: 21457920 | consumed tokens: 43945820160 | elapsed time per iteration (s): 4.19 | learning rate: 2.109E-05 | global batch size: 512 | lm loss: 1.929205E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.237 | TFLOPs: 56.97 | 7: iteration 41920/ 44073 | consumed samples: 21463040 | consumed tokens: 43956305920 | 
elapsed time per iteration (s): 4.21 | learning rate: 2.108E-05 | global batch size: 512 | lm loss: 1.923523E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.743 | TFLOPs: 56.74 | 7: iteration 41930/ 44073 | consumed samples: 21468160 | consumed tokens: 43966791680 | elapsed time per iteration (s): 4.21 | learning rate: 2.107E-05 | global batch size: 512 | lm loss: 1.907732E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.634 | TFLOPs: 56.69 | 7: iteration 41940/ 44073 | consumed samples: 21473280 | consumed tokens: 43977277440 | elapsed time per iteration (s): 4.32 | learning rate: 2.106E-05 | global batch size: 512 | lm loss: 1.915077E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.568 | TFLOPs: 55.26 | 7: iteration 41950/ 44073 | consumed samples: 21478400 | consumed tokens: 43987763200 | elapsed time per iteration (s): 4.20 | learning rate: 2.105E-05 | global batch size: 512 | lm loss: 1.897285E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.791 | TFLOPs: 56.76 | 7: iteration 41960/ 44073 | consumed samples: 21483520 | consumed tokens: 43998248960 | elapsed time per iteration (s): 4.26 | learning rate: 2.104E-05 | global batch size: 512 | lm loss: 1.937947E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.159 | TFLOPs: 56.00 | 7: iteration 41970/ 44073 | consumed samples: 21488640 | consumed tokens: 44008734720 | elapsed time per iteration (s): 4.15 | learning rate: 2.103E-05 | global batch size: 512 | lm loss: 1.916283E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.268 | TFLOPs: 
57.45 | 7: iteration 41980/ 44073 | consumed samples: 21493760 | consumed tokens: 44019220480 | elapsed time per iteration (s): 4.20 | learning rate: 2.102E-05 | global batch size: 512 | lm loss: 1.919490E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.026 | TFLOPs: 56.87 | 7: iteration 41990/ 44073 | consumed samples: 21498880 | consumed tokens: 44029706240 | elapsed time per iteration (s): 4.24 | learning rate: 2.101E-05 | global batch size: 512 | lm loss: 1.934036E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.829 | TFLOPs: 56.31 | 0: [2022-11-28 02:33:56,158] [INFO] [logging.py:68:log_dist] [Rank 0] step=42000, skipped=0, lr=[2.1001233889443858e-05, 2.1001233889443858e-05, 2.1001233889443858e-05], mom=[(0.9, 0.999), (0.9, 0.999), (0.9, 0.999)] 7: iteration 42000/ 44073 | consumed samples: 21504000 | consumed tokens: 44040192000 | elapsed time per iteration (s): 4.22 | learning rate: 2.100E-05 | global batch size: 512 | lm loss: 1.913582E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.198 | TFLOPs: 56.48 | 0: steps: 42000 loss: 1.9273 iter time (s): 4.214 samples/sec: 121.502 7: ------------------------------------------------------------------------------------------- 7: valid loss at iteration 42000 | lm loss value: 1.889789E+00 | lm loss PPL: 6.617970E+00 | 7: ------------------------------------------------------------------------------------------- 0: saving checkpoint at iteration 42000 to checkpoints_2b2 0: [2022-11-28 02:33:57,504] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step42000 is begin to save! 0: [2022-11-28 02:33:57,526] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_01-model_00-model_states.pt... 
0: [2022-11-28 02:33:57,861] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_01-model_00-model_states.pt. 0: [2022-11-28 02:33:57,861] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_03-model_00-model_states.pt... 0: [2022-11-28 02:33:58,006] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_03-model_00-model_states.pt. 0: [2022-11-28 02:33:58,007] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_04-model_00-model_states.pt... 0: [2022-11-28 02:33:58,153] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_04-model_00-model_states.pt. 0: [2022-11-28 02:33:58,154] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_05-model_00-model_states.pt... 0: [2022-11-28 02:33:58,289] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_05-model_00-model_states.pt. 0: [2022-11-28 02:33:58,290] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_06-model_00-model_states.pt... 0: [2022-11-28 02:33:58,431] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_06-model_00-model_states.pt. 0: [2022-11-28 02:33:58,431] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_07-model_00-model_states.pt... 0: [2022-11-28 02:33:58,578] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_07-model_00-model_states.pt. 0: [2022-11-28 02:33:58,578] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_08-model_00-model_states.pt... 
0: [2022-11-28 02:33:58,714] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_08-model_00-model_states.pt. 0: [2022-11-28 02:33:58,715] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_09-model_00-model_states.pt... 0: [2022-11-28 02:33:58,855] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_09-model_00-model_states.pt. 0: [2022-11-28 02:33:58,855] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_10-model_00-model_states.pt... 0: [2022-11-28 02:33:58,997] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_10-model_00-model_states.pt. 0: [2022-11-28 02:33:58,998] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_11-model_00-model_states.pt... 0: [2022-11-28 02:33:59,128] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_11-model_00-model_states.pt. 0: [2022-11-28 02:33:59,128] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_12-model_00-model_states.pt... 0: [2022-11-28 02:33:59,266] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_12-model_00-model_states.pt. 0: [2022-11-28 02:33:59,267] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_13-model_00-model_states.pt... 0: [2022-11-28 02:33:59,407] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_13-model_00-model_states.pt. 0: [2022-11-28 02:33:59,408] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_14-model_00-model_states.pt... 
0: [2022-11-28 02:33:59,536] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_14-model_00-model_states.pt. 0: [2022-11-28 02:33:59,537] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_15-model_00-model_states.pt... 0: [2022-11-28 02:33:59,676] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_15-model_00-model_states.pt. 0: [2022-11-28 02:33:59,676] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_16-model_00-model_states.pt... 0: [2022-11-28 02:33:59,814] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_16-model_00-model_states.pt. 0: [2022-11-28 02:33:59,814] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_17-model_00-model_states.pt... 0: [2022-11-28 02:33:59,945] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_17-model_00-model_states.pt. 0: [2022-11-28 02:33:59,946] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_18-model_00-model_states.pt... 0: [2022-11-28 02:34:00,086] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_18-model_00-model_states.pt. 0: [2022-11-28 02:34:00,087] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_19-model_00-model_states.pt... 0: [2022-11-28 02:34:00,225] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_19-model_00-model_states.pt. 0: [2022-11-28 02:34:00,225] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_20-model_00-model_states.pt... 
0: [2022-11-28 02:34:00,358] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_20-model_00-model_states.pt. 0: [2022-11-28 02:34:00,359] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_21-model_00-model_states.pt... 0: [2022-11-28 02:34:00,494] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_21-model_00-model_states.pt. 0: [2022-11-28 02:34:00,495] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_22-model_00-model_states.pt... 0: [2022-11-28 02:34:00,632] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_22-model_00-model_states.pt. 0: [2022-11-28 02:34:00,633] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_23-model_00-model_states.pt... 0: [2022-11-28 02:34:00,763] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_23-model_00-model_states.pt. 0: [2022-11-28 02:34:00,763] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_24-model_00-model_states.pt... 0: [2022-11-28 02:34:00,902] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_24-model_00-model_states.pt. 0: [2022-11-28 02:34:00,902] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_25-model_00-model_states.pt... 0: [2022-11-28 02:34:01,041] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_25-model_00-model_states.pt. 0: [2022-11-28 02:34:01,041] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_26-model_00-model_states.pt... 
0: [2022-11-28 02:34:01,176] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_26-model_00-model_states.pt. 0: [2022-11-28 02:34:01,177] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_27-model_00-model_states.pt... 0: [2022-11-28 02:34:01,310] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_27-model_00-model_states.pt. 0: [2022-11-28 02:34:01,310] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_28-model_00-model_states.pt... 0: [2022-11-28 02:34:01,449] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_28-model_00-model_states.pt. 0: [2022-11-28 02:34:01,450] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_29-model_00-model_states.pt... 0: [2022-11-28 02:34:01,578] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_29-model_00-model_states.pt. 0: [2022-11-28 02:34:01,579] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_30-model_00-model_states.pt... 0: [2022-11-28 02:34:01,711] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_30-model_00-model_states.pt. 0: [2022-11-28 02:34:01,711] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_31-model_00-model_states.pt... 0: [2022-11-28 02:34:01,847] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_31-model_00-model_states.pt. 0: [2022-11-28 02:34:01,847] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_32-model_00-model_states.pt... 
0: [2022-11-28 02:34:01,979] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_32-model_00-model_states.pt. 0: [2022-11-28 02:34:01,980] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_33-model_00-model_states.pt... 0: [2022-11-28 02:34:02,113] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_33-model_00-model_states.pt. 0: [2022-11-28 02:34:02,114] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_34-model_00-model_states.pt... 0: [2022-11-28 02:34:02,255] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_34-model_00-model_states.pt. 0: [2022-11-28 02:34:02,255] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/layer_36-model_00-model_states.pt... 0: [2022-11-28 02:34:02,256] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/layer_36-model_00-model_states.pt. 0: [2022-11-28 02:34:02,257] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step42000/mp_rank_00_model_states.pt 0: [2022-11-28 02:34:02,257] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/mp_rank_00_model_states.pt... 0: [2022-11-28 02:34:02,262] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/mp_rank_00_model_states.pt. 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 
0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 
6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 
3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 
5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 4: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 
6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 2: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 
5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 7: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 3: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 5: [2022-11-28 02:34:02,283] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step42000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 0: [2022-11-28 02:34:02,854] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-28 02:34:02,854] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,854] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:02,854] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 
0: [2022-11-28 02:34:02,854] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,855] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:02,855] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 0: [2022-11-28 02:34:02,856] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-28 02:34:02,856] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,856] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:02,861] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-28 02:34:02,861] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,861] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,868] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,868] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 
5: [2022-11-28 02:34:02,868] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,868] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,868] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,868] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,879] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,879] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,879] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,879] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,879] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,879] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:02,889] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 
0: [2022-11-28 02:34:02,889] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,889] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:02,890] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-28 02:34:02,890] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,890] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,894] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,894] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,894] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,911] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,911] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,911] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,915] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 
4: [2022-11-28 02:34:02,915] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,915] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,915] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-28 02:34:02,915] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,915] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,916] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-28 02:34:02,916] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,916] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,916] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-28 02:34:02,916] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,916] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,916] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 
4: [2022-11-28 02:34:02,916] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,916] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,917] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,917] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,917] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 5: [2022-11-28 02:34:02,922] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-28 02:34:02,922] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-28 02:34:02,922] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,933] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 4: [2022-11-28 02:34:02,933] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-28 02:34:02,933] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 
4: [2022-11-28 02:34:02,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,933] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-28 02:34:02,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 4: [2022-11-28 02:34:02,933] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:02,964] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-28 02:34:02,964] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:02,964] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,031] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 
2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,031] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,031] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,032] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,032] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,032] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,032] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 2: [2022-11-28 02:34:03,032] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 2: [2022-11-28 02:34:03,032] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 
2: [2022-11-28 02:34:03,033] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,033] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,033] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-28 02:34:03,033] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,033] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,033] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-28 02:34:03,033] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 2: [2022-11-28 02:34:03,033] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 2: [2022-11-28 02:34:03,033] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 
6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 
6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 6: [2022-11-28 02:34:03,071] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 
7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 
7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 7: [2022-11-28 02:34:03,189] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 0: [2022-11-28 02:34:03,356] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-28 02:34:03,357] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 
3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,522] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,522] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,522] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,522] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,522] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 
3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,522] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 
1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,531] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-28 02:34:03,531] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,532] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,532] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 1: [2022-11-28 02:34:03,532] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 
1: [2022-11-28 02:34:03,532] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-28 02:34:03,532] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,540] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,540] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,540] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,540] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,541] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,541] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 3: [2022-11-28 02:34:03,541] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-28 02:34:03,541] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step42000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-28 02:34:03,541] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step42000 is ready now! 
0: successfully saved checkpoint at iteration 42000 to checkpoints_2b2 7: time (ms) | save-checkpoint: 6060.67 7: iteration 42010/ 44073 | consumed samples: 21509120 | consumed tokens: 44050677760 | elapsed time per iteration (s): 4.94 | learning rate: 2.099E-05 | global batch size: 512 | lm loss: 1.919028E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 103.547 | TFLOPs: 48.26 | 7: iteration 42020/ 44073 | consumed samples: 21514240 | consumed tokens: 44061163520 | elapsed time per iteration (s): 4.21 | learning rate: 2.098E-05 | global batch size: 512 | lm loss: 1.936232E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.548 | TFLOPs: 56.65 | 7: iteration 42030/ 44073 | consumed samples: 21519360 | consumed tokens: 44071649280 | elapsed time per iteration (s): 4.18 | learning rate: 2.097E-05 | global batch size: 512 | lm loss: 1.919527E+00 | grad norm: 0.118 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.475 | TFLOPs: 57.08 | 7: iteration 42040/ 44073 | consumed samples: 21524480 | consumed tokens: 44082135040 | elapsed time per iteration (s): 4.21 | learning rate: 2.096E-05 | global batch size: 512 | lm loss: 1.931434E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.593 | TFLOPs: 56.67 | 7: iteration 42050/ 44073 | consumed samples: 21529600 | consumed tokens: 44092620800 | elapsed time per iteration (s): 4.23 | learning rate: 2.095E-05 | global batch size: 512 | lm loss: 1.909001E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.000 | TFLOPs: 56.39 | 7: iteration 42060/ 44073 | consumed samples: 21534720 | consumed tokens: 44103106560 | elapsed time per iteration (s): 4.19 | learning rate: 
2.094E-05 | global batch size: 512 | lm loss: 1.916594E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.153 | TFLOPs: 56.93 | 7: iteration 42070/ 44073 | consumed samples: 21539840 | consumed tokens: 44113592320 | elapsed time per iteration (s): 4.25 | learning rate: 2.093E-05 | global batch size: 512 | lm loss: 1.927744E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.561 | TFLOPs: 56.19 | 7: iteration 42080/ 44073 | consumed samples: 21544960 | consumed tokens: 44124078080 | elapsed time per iteration (s): 4.23 | learning rate: 2.093E-05 | global batch size: 512 | lm loss: 1.919895E+00 | grad norm: 0.119 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.978 | TFLOPs: 56.38 | 7: iteration 42090/ 44073 | consumed samples: 21550080 | consumed tokens: 44134563840 | elapsed time per iteration (s): 4.19 | learning rate: 2.092E-05 | global batch size: 512 | lm loss: 1.926490E+00 | grad norm: 0.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.193 | TFLOPs: 56.95 | 7: iteration 42100/ 44073 | consumed samples: 21555200 | consumed tokens: 44145049600 | elapsed time per iteration (s): 4.24 | learning rate: 2.091E-05 | global batch size: 512 | lm loss: 1.925768E+00 | grad norm: 0.117 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.840 | TFLOPs: 56.32 | 7: iteration 42110/ 44073 | consumed samples: 21560320 | consumed tokens: 44155535360 | elapsed time per iteration (s): 4.19 | learning rate: 2.090E-05 | global batch size: 512 | lm loss: 1.912152E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.169 | TFLOPs: 56.94 | 7: iteration 42120/ 44073 | consumed samples: 
21565440 | consumed tokens: 44166021120 | elapsed time per iteration (s): 4.22 | learning rate: 2.089E-05 | global batch size: 512 | lm loss: 1.935451E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.391 | TFLOPs: 56.57 | 7: iteration 42130/ 44073 | consumed samples: 21570560 | consumed tokens: 44176506880 | elapsed time per iteration (s): 4.18 | learning rate: 2.088E-05 | global batch size: 512 | lm loss: 1.916779E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.579 | TFLOPs: 57.13 | 7: iteration 42140/ 44073 | consumed samples: 21575680 | consumed tokens: 44186992640 | elapsed time per iteration (s): 4.18 | learning rate: 2.087E-05 | global batch size: 512 | lm loss: 1.929431E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.458 | TFLOPs: 57.07 | 7: iteration 42150/ 44073 | consumed samples: 21580800 | consumed tokens: 44197478400 | elapsed time per iteration (s): 4.21 | learning rate: 2.086E-05 | global batch size: 512 | lm loss: 1.940843E+00 | grad norm: 0.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.735 | TFLOPs: 56.73 | 7: iteration 42160/ 44073 | consumed samples: 21585920 | consumed tokens: 44207964160 | elapsed time per iteration (s): 4.20 | learning rate: 2.085E-05 | global batch size: 512 | lm loss: 1.931703E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.946 | TFLOPs: 56.83 | 7: iteration 42170/ 44073 | consumed samples: 21591040 | consumed tokens: 44218449920 | elapsed time per iteration (s): 4.24 | learning rate: 2.084E-05 | global batch size: 512 | lm loss: 1.916632E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 
| samples per second: 120.702 | TFLOPs: 56.25 | 7: iteration 42180/ 44073 | consumed samples: 21596160 | consumed tokens: 44228935680 | elapsed time per iteration (s): 4.24 | learning rate: 2.084E-05 | global batch size: 512 | lm loss: 1.907963E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.635 | TFLOPs: 56.22 | 7: iteration 42190/ 44073 | consumed samples: 21601280 | consumed tokens: 44239421440 | elapsed time per iteration (s): 4.23 | learning rate: 2.083E-05 | global batch size: 512 | lm loss: 1.944014E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.067 | TFLOPs: 56.42 | 7: iteration 42200/ 44073 | consumed samples: 21606400 | consumed tokens: 44249907200 | elapsed time per iteration (s): 4.16 | learning rate: 2.082E-05 | global batch size: 512 | lm loss: 1.933129E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.215 | TFLOPs: 57.42 | 7: iteration 42210/ 44073 | consumed samples: 21611520 | consumed tokens: 44260392960 | elapsed time per iteration (s): 4.18 | learning rate: 2.081E-05 | global batch size: 512 | lm loss: 1.916373E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.418 | TFLOPs: 57.05 | 7: iteration 42220/ 44073 | consumed samples: 21616640 | consumed tokens: 44270878720 | elapsed time per iteration (s): 4.22 | learning rate: 2.080E-05 | global batch size: 512 | lm loss: 1.922831E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.417 | TFLOPs: 56.59 | 7: iteration 42230/ 44073 | consumed samples: 21621760 | consumed tokens: 44281364480 | elapsed time per iteration (s): 4.18 | learning rate: 2.079E-05 | global batch size: 512 | lm loss: 1.924169E+00 | 
grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.594 | TFLOPs: 57.14 | 7: iteration 42240/ 44073 | consumed samples: 21626880 | consumed tokens: 44291850240 | elapsed time per iteration (s): 4.21 | learning rate: 2.078E-05 | global batch size: 512 | lm loss: 1.910700E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.520 | TFLOPs: 56.63 | 7: iteration 42250/ 44073 | consumed samples: 21632000 | consumed tokens: 44302336000 | elapsed time per iteration (s): 4.26 | learning rate: 2.077E-05 | global batch size: 512 | lm loss: 1.910538E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.274 | TFLOPs: 56.05 | 7: iteration 42260/ 44073 | consumed samples: 21637120 | consumed tokens: 44312821760 | elapsed time per iteration (s): 4.21 | learning rate: 2.077E-05 | global batch size: 512 | lm loss: 1.932654E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.605 | TFLOPs: 56.67 | 7: iteration 42270/ 44073 | consumed samples: 21642240 | consumed tokens: 44323307520 | elapsed time per iteration (s): 4.19 | learning rate: 2.076E-05 | global batch size: 512 | lm loss: 1.930414E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.300 | TFLOPs: 57.00 | 7: iteration 42280/ 44073 | consumed samples: 21647360 | consumed tokens: 44333793280 | elapsed time per iteration (s): 4.22 | learning rate: 2.075E-05 | global batch size: 512 | lm loss: 1.898871E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.383 | TFLOPs: 56.57 | 7: iteration 42290/ 44073 | consumed samples: 21652480 | consumed tokens: 44344279040 | elapsed time per 
iteration (s): 4.25 | learning rate: 2.074E-05 | global batch size: 512 | lm loss: 1.906228E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.550 | TFLOPs: 56.18 | 7: iteration 42300/ 44073 | consumed samples: 21657600 | consumed tokens: 44354764800 | elapsed time per iteration (s): 4.21 | learning rate: 2.073E-05 | global batch size: 512 | lm loss: 1.915265E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.535 | TFLOPs: 56.64 | 7: iteration 42310/ 44073 | consumed samples: 21662720 | consumed tokens: 44365250560 | elapsed time per iteration (s): 4.24 | learning rate: 2.072E-05 | global batch size: 512 | lm loss: 1.913025E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.657 | TFLOPs: 56.23 | 7: iteration 42320/ 44073 | consumed samples: 21667840 | consumed tokens: 44375736320 | elapsed time per iteration (s): 4.21 | learning rate: 2.072E-05 | global batch size: 512 | lm loss: 1.906106E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.612 | TFLOPs: 56.68 | 7: iteration 42330/ 44073 | consumed samples: 21672960 | consumed tokens: 44386222080 | elapsed time per iteration (s): 4.22 | learning rate: 2.071E-05 | global batch size: 512 | lm loss: 1.910707E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.309 | TFLOPs: 56.54 | 7: iteration 42340/ 44073 | consumed samples: 21678080 | consumed tokens: 44396707840 | elapsed time per iteration (s): 4.26 | learning rate: 2.070E-05 | global batch size: 512 | lm loss: 1.914742E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.107 | TFLOPs: 55.98 | 7: 
iteration 42350/ 44073 | consumed samples: 21683200 | consumed tokens: 44407193600 | elapsed time per iteration (s): 4.20 | learning rate: 2.069E-05 | global batch size: 512 | lm loss: 1.924063E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.760 | TFLOPs: 56.75 | 7: iteration 42360/ 44073 | consumed samples: 21688320 | consumed tokens: 44417679360 | elapsed time per iteration (s): 4.22 | learning rate: 2.068E-05 | global batch size: 512 | lm loss: 1.924841E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.381 | TFLOPs: 56.57 | 7: iteration 42370/ 44073 | consumed samples: 21693440 | consumed tokens: 44428165120 | elapsed time per iteration (s): 4.20 | learning rate: 2.068E-05 | global batch size: 512 | lm loss: 1.910478E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.780 | TFLOPs: 56.76 | 7: iteration 42380/ 44073 | consumed samples: 21698560 | consumed tokens: 44438650880 | elapsed time per iteration (s): 4.21 | learning rate: 2.067E-05 | global batch size: 512 | lm loss: 1.924049E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.490 | TFLOPs: 56.62 | 7: iteration 42390/ 44073 | consumed samples: 21703680 | consumed tokens: 44449136640 | elapsed time per iteration (s): 4.19 | learning rate: 2.066E-05 | global batch size: 512 | lm loss: 1.930435E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.225 | TFLOPs: 56.96 | 7: iteration 42400/ 44073 | consumed samples: 21708800 | consumed tokens: 44459622400 | elapsed time per iteration (s): 4.17 | learning rate: 2.065E-05 | global batch size: 512 | lm loss: 1.934303E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped 
iterations: 0 | number of nan iterations: 0 | samples per second: 122.898 | TFLOPs: 57.28 | 7: iteration 42410/ 44073 | consumed samples: 21713920 | consumed tokens: 44470108160 | elapsed time per iteration (s): 4.18 | learning rate: 2.064E-05 | global batch size: 512 | lm loss: 1.913072E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.565 | TFLOPs: 57.12 | 7: iteration 42420/ 44073 | consumed samples: 21719040 | consumed tokens: 44480593920 | elapsed time per iteration (s): 4.22 | learning rate: 2.064E-05 | global batch size: 512 | lm loss: 1.921010E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.285 | TFLOPs: 56.53 | 7: iteration 42430/ 44073 | consumed samples: 21724160 | consumed tokens: 44491079680 | elapsed time per iteration (s): 4.18 | learning rate: 2.063E-05 | global batch size: 512 | lm loss: 1.927734E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.558 | TFLOPs: 57.12 | 7: iteration 42440/ 44073 | consumed samples: 21729280 | consumed tokens: 44501565440 | elapsed time per iteration (s): 4.21 | learning rate: 2.062E-05 | global batch size: 512 | lm loss: 1.922032E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.659 | TFLOPs: 56.70 | 7: iteration 42450/ 44073 | consumed samples: 21734400 | consumed tokens: 44512051200 | elapsed time per iteration (s): 4.24 | learning rate: 2.061E-05 | global batch size: 512 | lm loss: 1.920234E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.685 | TFLOPs: 56.25 | 7: iteration 42460/ 44073 | consumed samples: 21739520 | consumed tokens: 44522536960 | elapsed time per iteration (s): 4.19 | learning rate: 2.061E-05 | global 
batch size: 512 | lm loss: 1.912646E+00 | grad norm: 0.121 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.233 | TFLOPs: 56.97 | 7: iteration 42470/ 44073 | consumed samples: 21744640 | consumed tokens: 44533022720 | elapsed time per iteration (s): 4.23 | learning rate: 2.060E-05 | global batch size: 512 | lm loss: 1.935644E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.116 | TFLOPs: 56.45 | 7: iteration 42480/ 44073 | consumed samples: 21749760 | consumed tokens: 44543508480 | elapsed time per iteration (s): 4.19 | learning rate: 2.059E-05 | global batch size: 512 | lm loss: 1.913364E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.052 | TFLOPs: 56.88 | 7: iteration 42490/ 44073 | consumed samples: 21754880 | consumed tokens: 44553994240 | elapsed time per iteration (s): 4.23 | learning rate: 2.058E-05 | global batch size: 512 | lm loss: 1.927195E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.005 | TFLOPs: 56.39 | 7: iteration 42500/ 44073 | consumed samples: 21760000 | consumed tokens: 44564480000 | elapsed time per iteration (s): 4.15 | learning rate: 2.058E-05 | global batch size: 512 | lm loss: 1.928862E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.335 | TFLOPs: 57.48 | 7: iteration 42510/ 44073 | consumed samples: 21765120 | consumed tokens: 44574965760 | elapsed time per iteration (s): 4.18 | learning rate: 2.057E-05 | global batch size: 512 | lm loss: 1.929106E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.490 | TFLOPs: 57.09 | 7: iteration 42520/ 44073 | consumed samples: 21770240 | consumed 
tokens: 44585451520 | elapsed time per iteration (s): 4.22 | learning rate: 2.056E-05 | global batch size: 512 | lm loss: 1.931099E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.304 | TFLOPs: 56.53 | 7: iteration 42530/ 44073 | consumed samples: 21775360 | consumed tokens: 44595937280 | elapsed time per iteration (s): 4.30 | learning rate: 2.056E-05 | global batch size: 512 | lm loss: 1.890205E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.932 | TFLOPs: 55.43 | 7: iteration 42540/ 44073 | consumed samples: 21780480 | consumed tokens: 44606423040 | elapsed time per iteration (s): 4.46 | learning rate: 2.055E-05 | global batch size: 512 | lm loss: 1.902509E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 114.835 | TFLOPs: 53.52 | 7: iteration 42550/ 44073 | consumed samples: 21785600 | consumed tokens: 44616908800 | elapsed time per iteration (s): 4.19 | learning rate: 2.054E-05 | global batch size: 512 | lm loss: 1.920559E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.247 | TFLOPs: 56.97 | 7: iteration 42560/ 44073 | consumed samples: 21790720 | consumed tokens: 44627394560 | elapsed time per iteration (s): 4.19 | learning rate: 2.053E-05 | global batch size: 512 | lm loss: 1.905474E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.064 | TFLOPs: 56.89 | 7: iteration 42570/ 44073 | consumed samples: 21795840 | consumed tokens: 44637880320 | elapsed time per iteration (s): 4.20 | learning rate: 2.053E-05 | global batch size: 512 | lm loss: 1.915718E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per 
second: 121.785 | TFLOPs: 56.76 | 7: iteration 42580/ 44073 | consumed samples: 21800960 | consumed tokens: 44648366080 | elapsed time per iteration (s): 4.23 | learning rate: 2.052E-05 | global batch size: 512 | lm loss: 1.928911E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.147 | TFLOPs: 56.46 | 7: iteration 42590/ 44073 | consumed samples: 21806080 | consumed tokens: 44658851840 | elapsed time per iteration (s): 4.22 | learning rate: 2.051E-05 | global batch size: 512 | lm loss: 1.905084E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.264 | TFLOPs: 56.52 | 7: iteration 42600/ 44073 | consumed samples: 21811200 | consumed tokens: 44669337600 | elapsed time per iteration (s): 4.28 | learning rate: 2.051E-05 | global batch size: 512 | lm loss: 1.923034E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.726 | TFLOPs: 55.80 | 7: iteration 42610/ 44073 | consumed samples: 21816320 | consumed tokens: 44679823360 | elapsed time per iteration (s): 4.20 | learning rate: 2.050E-05 | global batch size: 512 | lm loss: 1.931844E+00 | grad norm: 0.140 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.017 | TFLOPs: 56.87 | 7: iteration 42620/ 44073 | consumed samples: 21821440 | consumed tokens: 44690309120 | elapsed time per iteration (s): 4.22 | learning rate: 2.049E-05 | global batch size: 512 | lm loss: 1.922338E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.433 | TFLOPs: 56.59 | 7: iteration 42630/ 44073 | consumed samples: 21826560 | consumed tokens: 44700794880 | elapsed time per iteration (s): 4.19 | learning rate: 2.049E-05 | global batch size: 512 | lm loss: 1.925848E+00 | grad norm: 0.125 
| num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.092 | TFLOPs: 56.90 | 7: iteration 42640/ 44073 | consumed samples: 21831680 | consumed tokens: 44711280640 | elapsed time per iteration (s): 4.17 | learning rate: 2.048E-05 | global batch size: 512 | lm loss: 1.910921E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.824 | TFLOPs: 57.24 | 7: iteration 42650/ 44073 | consumed samples: 21836800 | consumed tokens: 44721766400 | elapsed time per iteration (s): 4.22 | learning rate: 2.047E-05 | global batch size: 512 | lm loss: 1.923260E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.270 | TFLOPs: 56.52 | 7: iteration 42660/ 44073 | consumed samples: 21841920 | consumed tokens: 44732252160 | elapsed time per iteration (s): 4.19 | learning rate: 2.047E-05 | global batch size: 512 | lm loss: 1.922775E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.062 | TFLOPs: 56.89 | 7: iteration 42670/ 44073 | consumed samples: 21847040 | consumed tokens: 44742737920 | elapsed time per iteration (s): 4.19 | learning rate: 2.046E-05 | global batch size: 512 | lm loss: 1.925633E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.108 | TFLOPs: 56.91 | 7: iteration 42680/ 44073 | consumed samples: 21852160 | consumed tokens: 44753223680 | elapsed time per iteration (s): 4.23 | learning rate: 2.045E-05 | global batch size: 512 | lm loss: 1.913041E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.013 | TFLOPs: 56.40 | 7: iteration 42690/ 44073 | consumed samples: 21857280 | consumed tokens: 44763709440 | elapsed time per iteration (s): 4.22 
| learning rate: 2.045E-05 | global batch size: 512 | lm loss: 1.918990E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.305 | TFLOPs: 56.53 | 7: iteration 42700/ 44073 | consumed samples: 21862400 | consumed tokens: 44774195200 | elapsed time per iteration (s): 4.22 | learning rate: 2.044E-05 | global batch size: 512 | lm loss: 1.921129E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.252 | TFLOPs: 56.51 | 7: iteration 42710/ 44073 | consumed samples: 21867520 | consumed tokens: 44784680960 | elapsed time per iteration (s): 4.18 | learning rate: 2.043E-05 | global batch size: 512 | lm loss: 1.908019E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.568 | TFLOPs: 57.12 | 7: iteration 42720/ 44073 | consumed samples: 21872640 | consumed tokens: 44795166720 | elapsed time per iteration (s): 4.23 | learning rate: 2.043E-05 | global batch size: 512 | lm loss: 1.920806E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.179 | TFLOPs: 56.48 | 7: iteration 42730/ 44073 | consumed samples: 21877760 | consumed tokens: 44805652480 | elapsed time per iteration (s): 4.22 | learning rate: 2.042E-05 | global batch size: 512 | lm loss: 1.927040E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.222 | TFLOPs: 56.50 | 7: iteration 42740/ 44073 | consumed samples: 21882880 | consumed tokens: 44816138240 | elapsed time per iteration (s): 4.17 | learning rate: 2.041E-05 | global batch size: 512 | lm loss: 1.925783E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.926 | TFLOPs: 57.29 | 7: iteration 42750/ 44073 | 
consumed samples: 21888000 | consumed tokens: 44826624000 | elapsed time per iteration (s): 4.22 | learning rate: 2.041E-05 | global batch size: 512 | lm loss: 1.920049E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.397 | TFLOPs: 56.58 | 7: iteration 42760/ 44073 | consumed samples: 21893120 | consumed tokens: 44837109760 | elapsed time per iteration (s): 4.17 | learning rate: 2.040E-05 | global batch size: 512 | lm loss: 1.938146E+00 | grad norm: 0.139 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.856 | TFLOPs: 57.26 | 7: iteration 42770/ 44073 | consumed samples: 21898240 | consumed tokens: 44847595520 | elapsed time per iteration (s): 4.19 | learning rate: 2.040E-05 | global batch size: 512 | lm loss: 1.909808E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.323 | TFLOPs: 57.01 | 7: iteration 42780/ 44073 | consumed samples: 21903360 | consumed tokens: 44858081280 | elapsed time per iteration (s): 4.32 | learning rate: 2.039E-05 | global batch size: 512 | lm loss: 1.940745E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.610 | TFLOPs: 55.28 | 7: iteration 42790/ 44073 | consumed samples: 21908480 | consumed tokens: 44868567040 | elapsed time per iteration (s): 4.23 | learning rate: 2.038E-05 | global batch size: 512 | lm loss: 1.938890E+00 | grad norm: 0.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.951 | TFLOPs: 56.37 | 7: iteration 42800/ 44073 | consumed samples: 21913600 | consumed tokens: 44879052800 | elapsed time per iteration (s): 4.19 | learning rate: 2.038E-05 | global batch size: 512 | lm loss: 1.952706E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of 
nan iterations: 0 | samples per second: 122.079 | TFLOPs: 56.89 | 7: iteration 42810/ 44073 | consumed samples: 21918720 | consumed tokens: 44889538560 | elapsed time per iteration (s): 4.25 | learning rate: 2.037E-05 | global batch size: 512 | lm loss: 1.926609E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.599 | TFLOPs: 56.21 | 7: iteration 42820/ 44073 | consumed samples: 21923840 | consumed tokens: 44900024320 | elapsed time per iteration (s): 4.25 | learning rate: 2.037E-05 | global batch size: 512 | lm loss: 1.953665E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.551 | TFLOPs: 56.18 | 7: iteration 42830/ 44073 | consumed samples: 21928960 | consumed tokens: 44910510080 | elapsed time per iteration (s): 4.26 | learning rate: 2.036E-05 | global batch size: 512 | lm loss: 1.913179E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.093 | TFLOPs: 55.97 | 7: iteration 42840/ 44073 | consumed samples: 21934080 | consumed tokens: 44920995840 | elapsed time per iteration (s): 4.22 | learning rate: 2.035E-05 | global batch size: 512 | lm loss: 1.914549E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.266 | TFLOPs: 56.52 | 7: iteration 42850/ 44073 | consumed samples: 21939200 | consumed tokens: 44931481600 | elapsed time per iteration (s): 4.24 | learning rate: 2.035E-05 | global batch size: 512 | lm loss: 1.910105E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.648 | TFLOPs: 56.23 | 7: iteration 42860/ 44073 | consumed samples: 21944320 | consumed tokens: 44941967360 | elapsed time per iteration (s): 4.52 | learning rate: 2.034E-05 | global batch size: 512 | lm loss: 
1.900460E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 113.350 | TFLOPs: 52.83 | 7: iteration 42870/ 44073 | consumed samples: 21949440 | consumed tokens: 44952453120 | elapsed time per iteration (s): 4.23 | learning rate: 2.034E-05 | global batch size: 512 | lm loss: 1.928608E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.177 | TFLOPs: 56.47 | 7: iteration 42880/ 44073 | consumed samples: 21954560 | consumed tokens: 44962938880 | elapsed time per iteration (s): 4.20 | learning rate: 2.033E-05 | global batch size: 512 | lm loss: 1.928420E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.936 | TFLOPs: 56.83 | 7: iteration 42890/ 44073 | consumed samples: 21959680 | consumed tokens: 44973424640 | elapsed time per iteration (s): 4.16 | learning rate: 2.033E-05 | global batch size: 512 | lm loss: 1.921764E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.222 | TFLOPs: 57.43 | 7: iteration 42900/ 44073 | consumed samples: 21964800 | consumed tokens: 44983910400 | elapsed time per iteration (s): 4.20 | learning rate: 2.032E-05 | global batch size: 512 | lm loss: 1.903213E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.808 | TFLOPs: 56.77 | 7: iteration 42910/ 44073 | consumed samples: 21969920 | consumed tokens: 44994396160 | elapsed time per iteration (s): 4.23 | learning rate: 2.032E-05 | global batch size: 512 | lm loss: 1.939183E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.104 | TFLOPs: 56.44 | 7: iteration 42920/ 44073 | consumed samples: 21975040 | consumed tokens: 45004881920 | 
elapsed time per iteration (s): 4.17 | learning rate: 2.031E-05 | global batch size: 512 | lm loss: 1.928310E+00 | grad norm: 0.156 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.684 | TFLOPs: 57.18 | 7: iteration 42930/ 44073 | consumed samples: 21980160 | consumed tokens: 45015367680 | elapsed time per iteration (s): 4.22 | learning rate: 2.030E-05 | global batch size: 512 | lm loss: 1.947976E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.243 | TFLOPs: 56.51 | 7: iteration 42940/ 44073 | consumed samples: 21985280 | consumed tokens: 45025853440 | elapsed time per iteration (s): 4.37 | learning rate: 2.030E-05 | global batch size: 512 | lm loss: 1.924277E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 117.292 | TFLOPs: 54.66 | 7: iteration 42950/ 44073 | consumed samples: 21990400 | consumed tokens: 45036339200 | elapsed time per iteration (s): 4.19 | learning rate: 2.029E-05 | global batch size: 512 | lm loss: 1.920122E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.234 | TFLOPs: 56.97 | 7: iteration 42960/ 44073 | consumed samples: 21995520 | consumed tokens: 45046824960 | elapsed time per iteration (s): 4.19 | learning rate: 2.029E-05 | global batch size: 512 | lm loss: 1.918090E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.313 | TFLOPs: 57.00 | 7: iteration 42970/ 44073 | consumed samples: 22000640 | consumed tokens: 45057310720 | elapsed time per iteration (s): 4.20 | learning rate: 2.028E-05 | global batch size: 512 | lm loss: 1.913765E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.941 | TFLOPs: 
56.83 | 7: iteration 42980/ 44073 | consumed samples: 22005760 | consumed tokens: 45067796480 | elapsed time per iteration (s): 4.22 | learning rate: 2.028E-05 | global batch size: 512 | lm loss: 1.912103E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.207 | TFLOPs: 56.49 | 7: iteration 42990/ 44073 | consumed samples: 22010880 | consumed tokens: 45078282240 | elapsed time per iteration (s): 4.21 | learning rate: 2.027E-05 | global batch size: 512 | lm loss: 1.916392E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.513 | TFLOPs: 56.63 | 7: iteration 43000/ 44073 | consumed samples: 22016000 | consumed tokens: 45088768000 | elapsed time per iteration (s): 4.24 | learning rate: 2.027E-05 | global batch size: 512 | lm loss: 1.889983E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.869 | TFLOPs: 56.33 | 7: ------------------------------------------------------------------------------------------- 7: valid loss at iteration 43000 | lm loss value: 1.889813E+00 | lm loss PPL: 6.618131E+00 | 7: ------------------------------------------------------------------------------------------- 0: saving checkpoint at iteration 43000 to checkpoints_2b2 0: [2022-11-28 03:44:23,441] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step43000 is begin to save! 0: [2022-11-28 03:44:23,529] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_01-model_00-model_states.pt... 0: [2022-11-28 03:44:24,066] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_01-model_00-model_states.pt. 0: [2022-11-28 03:44:24,066] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_03-model_00-model_states.pt... 
0: [2022-11-28 03:44:24,263] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_03-model_00-model_states.pt. 0: [2022-11-28 03:44:24,264] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_04-model_00-model_states.pt... 0: [2022-11-28 03:44:24,482] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_04-model_00-model_states.pt. 0: [2022-11-28 03:44:24,483] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_05-model_00-model_states.pt... 0: [2022-11-28 03:44:24,676] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_05-model_00-model_states.pt. 0: [2022-11-28 03:44:24,677] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_06-model_00-model_states.pt... 0: [2022-11-28 03:44:24,861] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_06-model_00-model_states.pt. 0: [2022-11-28 03:44:24,861] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_07-model_00-model_states.pt... 0: [2022-11-28 03:44:25,042] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_07-model_00-model_states.pt. 0: [2022-11-28 03:44:25,042] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_08-model_00-model_states.pt... 0: [2022-11-28 03:44:25,223] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_08-model_00-model_states.pt. 0: [2022-11-28 03:44:25,224] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_09-model_00-model_states.pt... 
0: [2022-11-28 03:44:25,409] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_09-model_00-model_states.pt. 0: [2022-11-28 03:44:25,409] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_10-model_00-model_states.pt... 0: [2022-11-28 03:44:25,591] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_10-model_00-model_states.pt. 0: [2022-11-28 03:44:25,591] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_11-model_00-model_states.pt... 0: [2022-11-28 03:44:25,773] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_11-model_00-model_states.pt. 0: [2022-11-28 03:44:25,773] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_12-model_00-model_states.pt... 0: [2022-11-28 03:44:25,954] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_12-model_00-model_states.pt. 0: [2022-11-28 03:44:25,955] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_13-model_00-model_states.pt... 0: [2022-11-28 03:44:26,155] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_13-model_00-model_states.pt. 0: [2022-11-28 03:44:26,156] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_14-model_00-model_states.pt... 0: [2022-11-28 03:44:26,355] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_14-model_00-model_states.pt. 0: [2022-11-28 03:44:26,355] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_15-model_00-model_states.pt... 
0: [2022-11-28 03:44:26,553] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_15-model_00-model_states.pt. 0: [2022-11-28 03:44:26,553] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_16-model_00-model_states.pt... 0: [2022-11-28 03:44:26,958] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_16-model_00-model_states.pt. 0: [2022-11-28 03:44:26,958] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_17-model_00-model_states.pt... 0: [2022-11-28 03:44:27,342] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_17-model_00-model_states.pt. 0: [2022-11-28 03:44:27,342] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_18-model_00-model_states.pt... 0: [2022-11-28 03:44:27,726] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_18-model_00-model_states.pt. 0: [2022-11-28 03:44:27,726] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_19-model_00-model_states.pt... 0: [2022-11-28 03:44:28,018] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_19-model_00-model_states.pt. 0: [2022-11-28 03:44:28,019] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_20-model_00-model_states.pt... 0: [2022-11-28 03:44:28,157] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_20-model_00-model_states.pt. 0: [2022-11-28 03:44:28,157] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_21-model_00-model_states.pt... 
0: [2022-11-28 03:44:28,297] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_21-model_00-model_states.pt. 0: [2022-11-28 03:44:28,297] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_22-model_00-model_states.pt... 0: [2022-11-28 03:44:28,439] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_22-model_00-model_states.pt. 0: [2022-11-28 03:44:28,440] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_23-model_00-model_states.pt... 0: [2022-11-28 03:44:28,579] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_23-model_00-model_states.pt. 0: [2022-11-28 03:44:28,580] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_24-model_00-model_states.pt... 0: [2022-11-28 03:44:28,716] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_24-model_00-model_states.pt. 0: [2022-11-28 03:44:28,717] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_25-model_00-model_states.pt... 0: [2022-11-28 03:44:28,857] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_25-model_00-model_states.pt. 0: [2022-11-28 03:44:28,857] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_26-model_00-model_states.pt... 0: [2022-11-28 03:44:28,992] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_26-model_00-model_states.pt. 0: [2022-11-28 03:44:28,992] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_27-model_00-model_states.pt... 
0: [2022-11-28 03:44:29,132] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_27-model_00-model_states.pt. 0: [2022-11-28 03:44:29,133] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_28-model_00-model_states.pt... 0: [2022-11-28 03:44:29,271] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_28-model_00-model_states.pt. 0: [2022-11-28 03:44:29,271] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_29-model_00-model_states.pt... 0: [2022-11-28 03:44:29,406] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_29-model_00-model_states.pt. 0: [2022-11-28 03:44:29,406] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_30-model_00-model_states.pt... 0: [2022-11-28 03:44:29,547] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_30-model_00-model_states.pt. 0: [2022-11-28 03:44:29,547] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_31-model_00-model_states.pt... 0: [2022-11-28 03:44:29,681] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_31-model_00-model_states.pt. 0: [2022-11-28 03:44:29,682] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_32-model_00-model_states.pt... 0: [2022-11-28 03:44:29,819] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_32-model_00-model_states.pt. 0: [2022-11-28 03:44:29,819] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_33-model_00-model_states.pt... 
0: [2022-11-28 03:44:29,954] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_33-model_00-model_states.pt. 0: [2022-11-28 03:44:29,954] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_34-model_00-model_states.pt... 0: [2022-11-28 03:44:30,093] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_34-model_00-model_states.pt. 0: [2022-11-28 03:44:30,093] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/layer_36-model_00-model_states.pt... 0: [2022-11-28 03:44:30,096] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/layer_36-model_00-model_states.pt. 0: [2022-11-28 03:44:30,097] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step43000/mp_rank_00_model_states.pt 0: [2022-11-28 03:44:30,097] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/mp_rank_00_model_states.pt... 0: [2022-11-28 03:44:30,103] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/mp_rank_00_model_states.pt. 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 
0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 
6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 
1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 
2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 
7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 6: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 3: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 
5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 7: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 1: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 5: [2022-11-28 03:44:30,127] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step43000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 0: [2022-11-28 03:44:30,624] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,625] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,625] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 5: [2022-11-28 03:44:30,689] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,689] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,690] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:30,693] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 
0: [2022-11-28 03:44:30,694] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,694] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:30,694] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,694] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,694] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:30,694] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,694] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,695] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 1: [2022-11-28 03:44:30,702] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,702] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,702] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,702] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
1: [2022-11-28 03:44:30,702] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,702] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:30,703] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,704] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,704] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:30,704] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,704] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,704] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 0: [2022-11-28 03:44:30,704] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,704] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:30,704] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:30,704] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
4: [2022-11-28 03:44:30,710] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,710] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,710] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,710] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,710] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,711] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 1: [2022-11-28 03:44:30,713] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,713] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,713] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 1: [2022-11-28 03:44:30,713] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,713] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,713] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
5: [2022-11-28 03:44:30,715] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,715] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,715] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,715] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,715] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,715] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,716] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,716] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,716] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,716] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,716] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,716] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
5: [2022-11-28 03:44:30,717] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,717] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,717] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 1: [2022-11-28 03:44:30,719] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,719] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,719] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 5: [2022-11-28 03:44:30,721] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,722] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,722] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 5: [2022-11-28 03:44:30,723] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,724] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,724] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
1: [2022-11-28 03:44:30,755] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,755] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,755] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 1: [2022-11-28 03:44:30,755] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,755] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,755] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 1: [2022-11-28 03:44:30,756] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-28 03:44:30,756] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-28 03:44:30,756] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 5: [2022-11-28 03:44:30,766] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,766] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,766] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
5: [2022-11-28 03:44:30,784] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,784] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-28 03:44:30,784] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,784] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-28 03:44:30,784] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 5: [2022-11-28 03:44:30,784] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 
6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 6: [2022-11-28 03:44:30,788] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,801] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,801] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,801] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,802] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-28 03:44:30,802] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,802] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 4: [2022-11-28 03:44:30,833] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 
4: [2022-11-28 03:44:30,833] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-28 03:44:30,833] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 
7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 7: [2022-11-28 03:44:30,880] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,217] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,217] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,217] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 3: [2022-11-28 03:44:31,217] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,217] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,218] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 3: [2022-11-28 03:44:31,218] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,218] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 3: [2022-11-28 03:44:31,218] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-28 03:44:31,218] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 
2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 
2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-28 03:44:31,246] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-28 03:44:31,246] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 0: [2022-11-28 03:44:31,397] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step43000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-28 03:44:31,397] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step43000 is ready now! 
0: successfully saved checkpoint at iteration 43000 to checkpoints_2b2
7: time (ms) | save-checkpoint: 8012.44
7: iteration 43010/ 44073 | consumed samples: 22021120 | consumed tokens: 45099253760 | elapsed time per iteration (s): 5.09 | learning rate: 2.026E-05 | global batch size: 512 | lm loss: 1.905743E+00 | grad norm: 0.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 100.607 | TFLOPs: 46.89 |
7: iteration 43020/ 44073 | consumed samples: 22026240 | consumed tokens: 45109739520 | elapsed time per iteration (s): 4.18 | learning rate: 2.026E-05 | global batch size: 512 | lm loss: 1.908202E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.445 | TFLOPs: 57.07 |
7: iteration 43030/ 44073 | consumed samples: 22031360 | consumed tokens: 45120225280 | elapsed time per iteration (s): 4.26 | learning rate: 2.025E-05 | global batch size: 512 | lm loss: 1.909334E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.318 | TFLOPs: 56.07 |
7: iteration 43040/ 44073 | consumed samples: 22036480 | consumed tokens: 45130711040 | elapsed time per iteration (s): 4.31 | learning rate: 2.025E-05 | global batch size: 512 | lm loss: 1.909354E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 118.767 | TFLOPs: 55.35 |
7: iteration 43050/ 44073 | consumed samples: 22041600 | consumed tokens: 45141196800 | elapsed time per iteration (s): 4.22 | learning rate: 2.024E-05 | global batch size: 512 | lm loss: 1.911905E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.308 | TFLOPs: 56.54 |
7: iteration 43060/ 44073 | consumed samples: 22046720 | consumed tokens: 45151682560 | elapsed time per iteration (s): 4.18 | learning rate: 2.024E-05 | global batch size: 512 | lm loss: 1.912877E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.544 | TFLOPs: 57.11 |
7: iteration 43070/ 44073 | consumed samples: 22051840 | consumed tokens: 45162168320 | elapsed time per iteration (s): 4.19 | learning rate: 2.023E-05 | global batch size: 512 | lm loss: 1.935667E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.051 | TFLOPs: 56.88 |
7: iteration 43080/ 44073 | consumed samples: 22056960 | consumed tokens: 45172654080 | elapsed time per iteration (s): 4.21 | learning rate: 2.023E-05 | global batch size: 512 | lm loss: 1.910971E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.553 | TFLOPs: 56.65 |
7: iteration 43090/ 44073 | consumed samples: 22062080 | consumed tokens: 45183139840 | elapsed time per iteration (s): 4.25 | learning rate: 2.023E-05 | global batch size: 512 | lm loss: 1.916445E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.394 | TFLOPs: 56.11 |
7: iteration 43100/ 44073 | consumed samples: 22067200 | consumed tokens: 45193625600 | elapsed time per iteration (s): 4.22 | learning rate: 2.022E-05 | global batch size: 512 | lm loss: 1.919114E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.448 | TFLOPs: 56.60 |
7: iteration 43110/ 44073 | consumed samples: 22072320 | consumed tokens: 45204111360 | elapsed time per iteration (s): 4.18 | learning rate: 2.022E-05 | global batch size: 512 | lm loss: 1.914929E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.443 | TFLOPs: 57.06 |
7: iteration 43120/ 44073 | consumed samples: 22077440 | consumed tokens: 45214597120 | elapsed time per iteration (s): 4.20 | learning rate: 2.021E-05 | global batch size: 512 | lm loss: 1.896997E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.838 | TFLOPs: 56.78 |
7: iteration 43130/ 44073 | consumed samples: 22082560 | consumed tokens: 45225082880 | elapsed time per iteration (s): 4.21 | learning rate: 2.021E-05 | global batch size: 512 | lm loss: 1.917536E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.496 | TFLOPs: 56.62 |
7: iteration 43140/ 44073 | consumed samples: 22087680 | consumed tokens: 45235568640 | elapsed time per iteration (s): 4.25 | learning rate: 2.020E-05 | global batch size: 512 | lm loss: 1.934889E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.444 | TFLOPs: 56.13 |
7: iteration 43150/ 44073 | consumed samples: 22092800 | consumed tokens: 45246054400 | elapsed time per iteration (s): 4.17 | learning rate: 2.020E-05 | global batch size: 512 | lm loss: 1.919712E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.729 | TFLOPs: 57.20 |
7: iteration 43160/ 44073 | consumed samples: 22097920 | consumed tokens: 45256540160 | elapsed time per iteration (s): 4.20 | learning rate: 2.019E-05 | global batch size: 512 | lm loss: 1.916859E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.029 | TFLOPs: 56.87 |
7: iteration 43170/ 44073 | consumed samples: 22103040 | consumed tokens: 45267025920 | elapsed time per iteration (s): 4.22 | learning rate: 2.019E-05 | global batch size: 512 | lm loss: 1.925926E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.397 | TFLOPs: 56.58 |
7: iteration 43180/ 44073 | consumed samples: 22108160 | consumed tokens: 45277511680 | elapsed time per iteration (s): 4.18 | learning rate: 2.019E-05 | global batch size: 512 | lm loss: 1.910532E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.601 | TFLOPs: 57.14 |
7: iteration 43190/ 44073 | consumed samples: 22113280 | consumed tokens: 45287997440 | elapsed time per iteration (s): 4.17 | learning rate: 2.018E-05 | global batch size: 512 | lm loss: 1.914539E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.778 | TFLOPs: 57.22 |
7: iteration 43200/ 44073 | consumed samples: 22118400 | consumed tokens: 45298483200 | elapsed time per iteration (s): 4.24 | learning rate: 2.018E-05 | global batch size: 512 | lm loss: 1.924381E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.893 | TFLOPs: 56.34 |
7: iteration 43210/ 44073 | consumed samples: 22123520 | consumed tokens: 45308968960 | elapsed time per iteration (s): 4.21 | learning rate: 2.017E-05 | global batch size: 512 | lm loss: 1.926835E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.729 | TFLOPs: 56.73 |
7: iteration 43220/ 44073 | consumed samples: 22128640 | consumed tokens: 45319454720 | elapsed time per iteration (s): 222.90 | learning rate: 2.017E-05 | global batch size: 512 | lm loss: 1.934545E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 2.297 | TFLOPs: 1.07 |
7: iteration 43230/ 44073 | consumed samples: 22133760 | consumed tokens: 45329940480 | elapsed time per iteration (s): 189.97 | learning rate: 2.017E-05 | global batch size: 512 | lm loss: 1.885497E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 2.695 | TFLOPs: 1.26 |
7: iteration 43240/ 44073 | consumed samples: 22138880 | consumed tokens: 45340426240 | elapsed time per iteration (s): 4.19 | learning rate: 2.016E-05 | global batch size: 512 | lm loss: 1.918261E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.318 | TFLOPs: 57.01 |
7: iteration 43250/ 44073 | consumed samples: 22144000 | consumed tokens: 45350912000 | elapsed time per iteration (s): 11.28 | learning rate: 2.016E-05 | global batch size: 512 | lm loss: 1.923240E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 45.381 | TFLOPs: 21.15 |
7: iteration 43260/ 44073 | consumed samples: 22149120 | consumed tokens: 45361397760 | elapsed time per iteration (s): 62.31 | learning rate: 2.015E-05 | global batch size: 512 | lm loss: 1.922573E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 8.217 | TFLOPs: 3.83 |
7: iteration 43270/ 44073 | consumed samples: 22154240 | consumed tokens: 45371883520 | elapsed time per iteration (s): 8.66 | learning rate: 2.015E-05 | global batch size: 512 | lm loss: 1.915654E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 59.131 | TFLOPs: 27.56 |
7: iteration 43280/ 44073 | consumed samples: 22159360 | consumed tokens: 45382369280 | elapsed time per iteration (s): 4.29 | learning rate: 2.015E-05 | global batch size: 512 | lm loss: 1.924504E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 119.368 | TFLOPs: 55.63 |
7: iteration 43290/ 44073 | consumed samples: 22164480 | consumed tokens: 45392855040 | elapsed time per iteration 
(s): 4.24 | learning rate: 2.014E-05 | global batch size: 512 | lm loss: 1.909291E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.626 | TFLOPs: 56.22 | 7: iteration 43300/ 44073 | consumed samples: 22169600 | consumed tokens: 45403340800 | elapsed time per iteration (s): 4.21 | learning rate: 2.014E-05 | global batch size: 512 | lm loss: 1.928015E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.627 | TFLOPs: 56.68 | 7: iteration 43310/ 44073 | consumed samples: 22174720 | consumed tokens: 45413826560 | elapsed time per iteration (s): 7.96 | learning rate: 2.014E-05 | global batch size: 512 | lm loss: 1.926004E+00 | grad norm: 0.135 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 64.332 | TFLOPs: 29.98 | 7: iteration 43320/ 44073 | consumed samples: 22179840 | consumed tokens: 45424312320 | elapsed time per iteration (s): 4.15 | learning rate: 2.013E-05 | global batch size: 512 | lm loss: 1.911923E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.384 | TFLOPs: 57.50 | 7: iteration 43330/ 44073 | consumed samples: 22184960 | consumed tokens: 45434798080 | elapsed time per iteration (s): 4.17 | learning rate: 2.013E-05 | global batch size: 512 | lm loss: 1.925282E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.755 | TFLOPs: 57.21 | 7: iteration 43340/ 44073 | consumed samples: 22190080 | consumed tokens: 45445283840 | elapsed time per iteration (s): 4.20 | learning rate: 2.013E-05 | global batch size: 512 | lm loss: 1.904461E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.882 | TFLOPs: 56.80 | 7: iteration 43350/ 
44073 | consumed samples: 22195200 | consumed tokens: 45455769600 | elapsed time per iteration (s): 4.20 | learning rate: 2.012E-05 | global batch size: 512 | lm loss: 1.935439E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.926 | TFLOPs: 56.82 | 7: iteration 43360/ 44073 | consumed samples: 22200320 | consumed tokens: 45466255360 | elapsed time per iteration (s): 4.20 | learning rate: 2.012E-05 | global batch size: 512 | lm loss: 1.929273E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.811 | TFLOPs: 56.77 | 7: iteration 43370/ 44073 | consumed samples: 22205440 | consumed tokens: 45476741120 | elapsed time per iteration (s): 4.19 | learning rate: 2.012E-05 | global batch size: 512 | lm loss: 1.922509E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.227 | TFLOPs: 56.96 | 7: iteration 43380/ 44073 | consumed samples: 22210560 | consumed tokens: 45487226880 | elapsed time per iteration (s): 47.91 | learning rate: 2.011E-05 | global batch size: 512 | lm loss: 1.905338E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 10.686 | TFLOPs: 4.98 | 7: iteration 43390/ 44073 | consumed samples: 22215680 | consumed tokens: 45497712640 | elapsed time per iteration (s): 43.98 | learning rate: 2.011E-05 | global batch size: 512 | lm loss: 1.931977E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 11.643 | TFLOPs: 5.43 | 7: iteration 43400/ 44073 | consumed samples: 22220800 | consumed tokens: 45508198400 | elapsed time per iteration (s): 22.56 | learning rate: 2.011E-05 | global batch size: 512 | lm loss: 1.911573E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | 
number of nan iterations: 0 | samples per second: 22.691 | TFLOPs: 10.58 | 7: iteration 43410/ 44073 | consumed samples: 22225920 | consumed tokens: 45518684160 | elapsed time per iteration (s): 91.27 | learning rate: 2.010E-05 | global batch size: 512 | lm loss: 1.927334E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 5.610 | TFLOPs: 2.61 | 7: iteration 43420/ 44073 | consumed samples: 22231040 | consumed tokens: 45529169920 | elapsed time per iteration (s): 212.66 | learning rate: 2.010E-05 | global batch size: 512 | lm loss: 1.902569E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 2.408 | TFLOPs: 1.12 | 7: iteration 43430/ 44073 | consumed samples: 22236160 | consumed tokens: 45539655680 | elapsed time per iteration (s): 60.26 | learning rate: 2.010E-05 | global batch size: 512 | lm loss: 1.918085E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 8.497 | TFLOPs: 3.96 | 7: iteration 43440/ 44073 | consumed samples: 22241280 | consumed tokens: 45550141440 | elapsed time per iteration (s): 29.98 | learning rate: 2.009E-05 | global batch size: 512 | lm loss: 1.906261E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 17.076 | TFLOPs: 7.96 | 7: iteration 43450/ 44073 | consumed samples: 22246400 | consumed tokens: 45560627200 | elapsed time per iteration (s): 12.58 | learning rate: 2.009E-05 | global batch size: 512 | lm loss: 1.917091E+00 | grad norm: 0.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 40.687 | TFLOPs: 18.96 | 7: iteration 43460/ 44073 | consumed samples: 22251520 | consumed tokens: 45571112960 | elapsed time per iteration (s): 9.34 | learning rate: 2.009E-05 | global batch size: 512 | lm 
loss: 1.921770E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 54.840 | TFLOPs: 25.56 | 7: iteration 43470/ 44073 | consumed samples: 22256640 | consumed tokens: 45581598720 | elapsed time per iteration (s): 13.21 | learning rate: 2.008E-05 | global batch size: 512 | lm loss: 1.908022E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 38.773 | TFLOPs: 18.07 | 7: iteration 43480/ 44073 | consumed samples: 22261760 | consumed tokens: 45592084480 | elapsed time per iteration (s): 4.63 | learning rate: 2.008E-05 | global batch size: 512 | lm loss: 1.901163E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 110.657 | TFLOPs: 51.57 | 7: iteration 43490/ 44073 | consumed samples: 22266880 | consumed tokens: 45602570240 | elapsed time per iteration (s): 4.18 | learning rate: 2.008E-05 | global batch size: 512 | lm loss: 1.932467E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.584 | TFLOPs: 57.13 | 7: iteration 43500/ 44073 | consumed samples: 22272000 | consumed tokens: 45613056000 | elapsed time per iteration (s): 4.20 | learning rate: 2.008E-05 | global batch size: 512 | lm loss: 1.919094E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.953 | TFLOPs: 56.84 | 7: iteration 43510/ 44073 | consumed samples: 22277120 | consumed tokens: 45623541760 | elapsed time per iteration (s): 4.24 | learning rate: 2.007E-05 | global batch size: 512 | lm loss: 1.911571E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.635 | TFLOPs: 56.22 | 7: iteration 43520/ 44073 | consumed samples: 22282240 | consumed tokens: 45634027520 | 
elapsed time per iteration (s): 4.17 | learning rate: 2.007E-05 | global batch size: 512 | lm loss: 1.925001E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.716 | TFLOPs: 57.19 | 7: iteration 43530/ 44073 | consumed samples: 22287360 | consumed tokens: 45644513280 | elapsed time per iteration (s): 4.20 | learning rate: 2.007E-05 | global batch size: 512 | lm loss: 1.929983E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.811 | TFLOPs: 56.77 | 7: iteration 43540/ 44073 | consumed samples: 22292480 | consumed tokens: 45654999040 | elapsed time per iteration (s): 55.58 | learning rate: 2.007E-05 | global batch size: 512 | lm loss: 1.920945E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 9.211 | TFLOPs: 4.29 | 7: iteration 43550/ 44073 | consumed samples: 22297600 | consumed tokens: 45665484800 | elapsed time per iteration (s): 4.16 | learning rate: 2.006E-05 | global batch size: 512 | lm loss: 1.912018E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.072 | TFLOPs: 57.36 | 7: iteration 43560/ 44073 | consumed samples: 22302720 | consumed tokens: 45675970560 | elapsed time per iteration (s): 13.66 | learning rate: 2.006E-05 | global batch size: 512 | lm loss: 1.914371E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 37.483 | TFLOPs: 17.47 | 7: iteration 43570/ 44073 | consumed samples: 22307840 | consumed tokens: 45686456320 | elapsed time per iteration (s): 4.21 | learning rate: 2.006E-05 | global batch size: 512 | lm loss: 1.910282E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.654 | TFLOPs: 
56.70 | 7: iteration 43580/ 44073 | consumed samples: 22312960 | consumed tokens: 45696942080 | elapsed time per iteration (s): 13.00 | learning rate: 2.006E-05 | global batch size: 512 | lm loss: 1.909621E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 39.389 | TFLOPs: 18.36 | 7: iteration 43590/ 44073 | consumed samples: 22318080 | consumed tokens: 45707427840 | elapsed time per iteration (s): 14.96 | learning rate: 2.005E-05 | global batch size: 512 | lm loss: 1.917027E+00 | grad norm: 0.120 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 34.226 | TFLOPs: 15.95 | 7: iteration 43600/ 44073 | consumed samples: 22323200 | consumed tokens: 45717913600 | elapsed time per iteration (s): 4.18 | learning rate: 2.005E-05 | global batch size: 512 | lm loss: 1.912819E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.460 | TFLOPs: 57.07 | 7: iteration 43610/ 44073 | consumed samples: 22328320 | consumed tokens: 45728399360 | elapsed time per iteration (s): 4.16 | learning rate: 2.005E-05 | global batch size: 512 | lm loss: 1.924347E+00 | grad norm: 0.133 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.966 | TFLOPs: 57.31 | 7: iteration 43620/ 44073 | consumed samples: 22333440 | consumed tokens: 45738885120 | elapsed time per iteration (s): 8.37 | learning rate: 2.005E-05 | global batch size: 512 | lm loss: 1.920057E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 61.162 | TFLOPs: 28.50 | 7: iteration 43630/ 44073 | consumed samples: 22338560 | consumed tokens: 45749370880 | elapsed time per iteration (s): 6.57 | learning rate: 2.005E-05 | global batch size: 512 | lm loss: 1.922702E+00 | grad norm: 0.123 | num zeros: 0.0 | number 
of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 77.899 | TFLOPs: 36.30 | 7: iteration 43640/ 44073 | consumed samples: 22343680 | consumed tokens: 45759856640 | elapsed time per iteration (s): 4.23 | learning rate: 2.004E-05 | global batch size: 512 | lm loss: 1.920432E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.014 | TFLOPs: 56.40 | 7: iteration 43650/ 44073 | consumed samples: 22348800 | consumed tokens: 45770342400 | elapsed time per iteration (s): 14.72 | learning rate: 2.004E-05 | global batch size: 512 | lm loss: 1.901017E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 34.775 | TFLOPs: 16.21 | 7: iteration 43660/ 44073 | consumed samples: 22353920 | consumed tokens: 45780828160 | elapsed time per iteration (s): 4.16 | learning rate: 2.004E-05 | global batch size: 512 | lm loss: 1.923581E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 123.046 | TFLOPs: 57.35 | 7: iteration 43670/ 44073 | consumed samples: 22359040 | consumed tokens: 45791313920 | elapsed time per iteration (s): 4.20 | learning rate: 2.004E-05 | global batch size: 512 | lm loss: 1.896593E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.959 | TFLOPs: 56.84 | 7: iteration 43680/ 44073 | consumed samples: 22364160 | consumed tokens: 45801799680 | elapsed time per iteration (s): 4.19 | learning rate: 2.004E-05 | global batch size: 512 | lm loss: 1.921255E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.258 | TFLOPs: 56.98 | 7: iteration 43690/ 44073 | consumed samples: 22369280 | consumed tokens: 45812285440 | elapsed time per iteration (s): 4.37 | learning rate: 2.003E-05 
| global batch size: 512 | lm loss: 1.903271E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 117.170 | TFLOPs: 54.61 | 7: iteration 43700/ 44073 | consumed samples: 22374400 | consumed tokens: 45822771200 | elapsed time per iteration (s): 4.18 | learning rate: 2.003E-05 | global batch size: 512 | lm loss: 1.893303E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.557 | TFLOPs: 57.12 | 7: iteration 43710/ 44073 | consumed samples: 22379520 | consumed tokens: 45833256960 | elapsed time per iteration (s): 4.20 | learning rate: 2.003E-05 | global batch size: 512 | lm loss: 1.936944E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.851 | TFLOPs: 56.79 | 7: iteration 43720/ 44073 | consumed samples: 22384640 | consumed tokens: 45843742720 | elapsed time per iteration (s): 4.25 | learning rate: 2.003E-05 | global batch size: 512 | lm loss: 1.910358E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.432 | TFLOPs: 56.13 | 7: iteration 43730/ 44073 | consumed samples: 22389760 | consumed tokens: 45854228480 | elapsed time per iteration (s): 18.58 | learning rate: 2.003E-05 | global batch size: 512 | lm loss: 1.916046E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 27.557 | TFLOPs: 12.84 | 7: iteration 43740/ 44073 | consumed samples: 22394880 | consumed tokens: 45864714240 | elapsed time per iteration (s): 4.25 | learning rate: 2.003E-05 | global batch size: 512 | lm loss: 1.912577E+00 | grad norm: 0.137 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.578 | TFLOPs: 56.20 | 7: iteration 43750/ 44073 | consumed samples: 22400000 | 
consumed tokens: 45875200000 | elapsed time per iteration (s): 4.21 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.950974E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.705 | TFLOPs: 56.72 | 7: iteration 43760/ 44073 | consumed samples: 22405120 | consumed tokens: 45885685760 | elapsed time per iteration (s): 4.21 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.922402E+00 | grad norm: 0.138 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.701 | TFLOPs: 56.72 | 7: iteration 43770/ 44073 | consumed samples: 22410240 | consumed tokens: 45896171520 | elapsed time per iteration (s): 4.19 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.938958E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.075 | TFLOPs: 56.89 | 7: iteration 43780/ 44073 | consumed samples: 22415360 | consumed tokens: 45906657280 | elapsed time per iteration (s): 4.21 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.907491E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.641 | TFLOPs: 56.69 | 7: iteration 43790/ 44073 | consumed samples: 22420480 | consumed tokens: 45917143040 | elapsed time per iteration (s): 4.17 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.923240E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.727 | TFLOPs: 57.20 | 7: iteration 43800/ 44073 | consumed samples: 22425600 | consumed tokens: 45927628800 | elapsed time per iteration (s): 4.24 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.931530E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples 
per second: 120.656 | TFLOPs: 56.23 | 7: iteration 43810/ 44073 | consumed samples: 22430720 | consumed tokens: 45938114560 | elapsed time per iteration (s): 4.24 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.907748E+00 | grad norm: 0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.678 | TFLOPs: 56.24 | 7: iteration 43820/ 44073 | consumed samples: 22435840 | consumed tokens: 45948600320 | elapsed time per iteration (s): 4.21 | learning rate: 2.002E-05 | global batch size: 512 | lm loss: 1.932719E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.757 | TFLOPs: 56.75 | 7: iteration 43830/ 44073 | consumed samples: 22440960 | consumed tokens: 45959086080 | elapsed time per iteration (s): 4.21 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.924596E+00 | grad norm: 0.142 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.709 | TFLOPs: 56.72 | 7: iteration 43840/ 44073 | consumed samples: 22446080 | consumed tokens: 45969571840 | elapsed time per iteration (s): 4.21 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.921932E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.589 | TFLOPs: 56.67 | 7: iteration 43850/ 44073 | consumed samples: 22451200 | consumed tokens: 45980057600 | elapsed time per iteration (s): 4.23 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.916983E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.032 | TFLOPs: 56.41 | 7: iteration 43860/ 44073 | consumed samples: 22456320 | consumed tokens: 45990543360 | elapsed time per iteration (s): 4.22 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.911510E+00 | grad norm: 
0.131 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.192 | TFLOPs: 56.48 | 7: iteration 43870/ 44073 | consumed samples: 22461440 | consumed tokens: 46001029120 | elapsed time per iteration (s): 4.19 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.921767E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.103 | TFLOPs: 56.91 | 7: iteration 43880/ 44073 | consumed samples: 22466560 | consumed tokens: 46011514880 | elapsed time per iteration (s): 4.21 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.916937E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.749 | TFLOPs: 56.74 | 7: iteration 43890/ 44073 | consumed samples: 22471680 | consumed tokens: 46022000640 | elapsed time per iteration (s): 4.20 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.930461E+00 | grad norm: 0.123 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.832 | TFLOPs: 56.78 | 7: iteration 43900/ 44073 | consumed samples: 22476800 | consumed tokens: 46032486400 | elapsed time per iteration (s): 4.23 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.928427E+00 | grad norm: 0.127 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.020 | TFLOPs: 56.40 | 7: iteration 43910/ 44073 | consumed samples: 22481920 | consumed tokens: 46042972160 | elapsed time per iteration (s): 4.21 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.929461E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.757 | TFLOPs: 56.74 | 7: iteration 43920/ 44073 | consumed samples: 22487040 | consumed tokens: 46053457920 | elapsed time per iteration (s): 
4.21 | learning rate: 2.001E-05 | global batch size: 512 | lm loss: 1.902746E+00 | grad norm: 0.129 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.471 | TFLOPs: 56.61 | 7: iteration 43930/ 44073 | consumed samples: 22492160 | consumed tokens: 46063943680 | elapsed time per iteration (s): 4.22 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.915404E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.234 | TFLOPs: 56.50 | 7: iteration 43940/ 44073 | consumed samples: 22497280 | consumed tokens: 46074429440 | elapsed time per iteration (s): 4.23 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.909708E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.028 | TFLOPs: 56.41 | 7: iteration 43950/ 44073 | consumed samples: 22502400 | consumed tokens: 46084915200 | elapsed time per iteration (s): 4.22 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.897496E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.403 | TFLOPs: 56.58 | 7: iteration 43960/ 44073 | consumed samples: 22507520 | consumed tokens: 46095400960 | elapsed time per iteration (s): 4.17 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.931822E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.751 | TFLOPs: 57.21 | 7: iteration 43970/ 44073 | consumed samples: 22512640 | consumed tokens: 46105886720 | elapsed time per iteration (s): 4.17 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.900557E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 122.713 | TFLOPs: 57.19 | 7: iteration 43980/ 44073 
| consumed samples: 22517760 | consumed tokens: 46116372480 | elapsed time per iteration (s): 4.21 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.895999E+00 | grad norm: 0.141 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.632 | TFLOPs: 56.69 |
7: iteration 43990/ 44073 | consumed samples: 22522880 | consumed tokens: 46126858240 | elapsed time per iteration (s): 4.25 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.928712E+00 | grad norm: 0.125 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.394 | TFLOPs: 56.11 |
0: [2022-11-28 07:56:37,125] [INFO] [logging.py:68:log_dist] [Rank 0] step=44000, skipped=0, lr=[2.0001264343016284e-05, 2.0001264343016284e-05, 2.0001264343016284e-05], mom=[(0.9, 0.999), (0.9, 0.999), (0.9, 0.999)]
7: iteration 44000/ 44073 | consumed samples: 22528000 | consumed tokens: 46137344000 | elapsed time per iteration (s): 4.22 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.921192E+00 | grad norm: 0.122 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.460 | TFLOPs: 56.61 |
0: steps: 44000 loss: 1.9235 iter time (s): 9.669 samples/sec: 52.952
7: -------------------------------------------------------------------------------------------
7: valid loss at iteration 44000 | lm loss value: 1.878090E+00 | lm loss PPL: 6.541000E+00 |
7: -------------------------------------------------------------------------------------------
0: saving checkpoint at iteration 44000 to checkpoints_2b2
0: [2022-11-28 07:56:38,510] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step44000 is begin to save!
0: [2022-11-28 07:56:38,606] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_01-model_00-model_states.pt...
0: [2022-11-28 07:56:39,120] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_01-model_00-model_states.pt. 0: [2022-11-28 07:56:39,120] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_03-model_00-model_states.pt... 0: [2022-11-28 07:56:39,264] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_03-model_00-model_states.pt. 0: [2022-11-28 07:56:39,265] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_04-model_00-model_states.pt... 0: [2022-11-28 07:56:39,413] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_04-model_00-model_states.pt. 0: [2022-11-28 07:56:39,413] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_05-model_00-model_states.pt... 0: [2022-11-28 07:56:39,551] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_05-model_00-model_states.pt. 0: [2022-11-28 07:56:39,552] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_06-model_00-model_states.pt... 0: [2022-11-28 07:56:39,690] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_06-model_00-model_states.pt. 0: [2022-11-28 07:56:39,691] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_07-model_00-model_states.pt... 0: [2022-11-28 07:56:39,829] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_07-model_00-model_states.pt. 0: [2022-11-28 07:56:39,830] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_08-model_00-model_states.pt... 
0: [2022-11-28 07:56:39,954] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_08-model_00-model_states.pt. 0: [2022-11-28 07:56:39,954] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_09-model_00-model_states.pt... 0: [2022-11-28 07:56:40,078] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_09-model_00-model_states.pt. 0: [2022-11-28 07:56:40,079] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_10-model_00-model_states.pt... 0: [2022-11-28 07:56:40,203] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_10-model_00-model_states.pt. 0: [2022-11-28 07:56:40,204] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_11-model_00-model_states.pt... 0: [2022-11-28 07:56:40,327] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_11-model_00-model_states.pt. 0: [2022-11-28 07:56:40,328] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_12-model_00-model_states.pt... 0: [2022-11-28 07:56:40,452] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_12-model_00-model_states.pt. 0: [2022-11-28 07:56:40,453] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_13-model_00-model_states.pt... 0: [2022-11-28 07:56:40,579] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_13-model_00-model_states.pt. 0: [2022-11-28 07:56:40,580] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_14-model_00-model_states.pt... 
0: [2022-11-28 07:56:40,705] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_14-model_00-model_states.pt. 0: [2022-11-28 07:56:40,705] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_15-model_00-model_states.pt... 0: [2022-11-28 07:56:40,832] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_15-model_00-model_states.pt. 0: [2022-11-28 07:56:40,832] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_16-model_00-model_states.pt... 0: [2022-11-28 07:56:40,956] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_16-model_00-model_states.pt. 0: [2022-11-28 07:56:40,957] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_17-model_00-model_states.pt... 0: [2022-11-28 07:56:41,081] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_17-model_00-model_states.pt. 0: [2022-11-28 07:56:41,082] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_18-model_00-model_states.pt... 0: [2022-11-28 07:56:41,205] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_18-model_00-model_states.pt. 0: [2022-11-28 07:56:41,206] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_19-model_00-model_states.pt... 0: [2022-11-28 07:56:41,330] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_19-model_00-model_states.pt. 0: [2022-11-28 07:56:41,330] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_20-model_00-model_states.pt... 
0: [2022-11-28 07:56:41,455] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_20-model_00-model_states.pt. 0: [2022-11-28 07:56:41,456] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_21-model_00-model_states.pt... 0: [2022-11-28 07:56:41,580] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_21-model_00-model_states.pt. 0: [2022-11-28 07:56:41,580] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_22-model_00-model_states.pt... 0: [2022-11-28 07:56:41,706] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_22-model_00-model_states.pt. 0: [2022-11-28 07:56:41,706] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_23-model_00-model_states.pt... 0: [2022-11-28 07:56:41,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_23-model_00-model_states.pt. 0: [2022-11-28 07:56:41,831] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_24-model_00-model_states.pt... 0: [2022-11-28 07:56:41,955] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_24-model_00-model_states.pt. 0: [2022-11-28 07:56:41,955] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_25-model_00-model_states.pt... 0: [2022-11-28 07:56:42,080] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_25-model_00-model_states.pt. 0: [2022-11-28 07:56:42,080] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_26-model_00-model_states.pt... 
0: [2022-11-28 07:56:42,204] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_26-model_00-model_states.pt. 0: [2022-11-28 07:56:42,205] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_27-model_00-model_states.pt... 0: [2022-11-28 07:56:42,331] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_27-model_00-model_states.pt. 0: [2022-11-28 07:56:42,331] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_28-model_00-model_states.pt... 0: [2022-11-28 07:56:42,456] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_28-model_00-model_states.pt. 0: [2022-11-28 07:56:42,456] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_29-model_00-model_states.pt... 0: [2022-11-28 07:56:42,581] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_29-model_00-model_states.pt. 0: [2022-11-28 07:56:42,581] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_30-model_00-model_states.pt... 0: [2022-11-28 07:56:42,706] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_30-model_00-model_states.pt. 0: [2022-11-28 07:56:42,706] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_31-model_00-model_states.pt... 0: [2022-11-28 07:56:42,830] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_31-model_00-model_states.pt. 0: [2022-11-28 07:56:42,831] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_32-model_00-model_states.pt... 
0: [2022-11-28 07:56:42,955] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_32-model_00-model_states.pt. 0: [2022-11-28 07:56:42,956] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_33-model_00-model_states.pt... 0: [2022-11-28 07:56:43,080] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_33-model_00-model_states.pt. 0: [2022-11-28 07:56:43,081] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_34-model_00-model_states.pt... 0: [2022-11-28 07:56:43,205] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_34-model_00-model_states.pt. 0: [2022-11-28 07:56:43,205] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/layer_36-model_00-model_states.pt... 0: [2022-11-28 07:56:43,209] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/layer_36-model_00-model_states.pt. 0: [2022-11-28 07:56:43,210] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step44000/mp_rank_00_model_states.pt 0: [2022-11-28 07:56:43,210] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/mp_rank_00_model_states.pt... 0: [2022-11-28 07:56:43,216] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/mp_rank_00_model_states.pt. 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 
0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 
6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 
3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 
2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 
7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 6: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 3: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 
2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 1: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 5: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 7: [2022-11-28 07:56:43,238] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 0: [2022-11-28 07:56:43,756] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,757] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,757] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,757] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
0: [2022-11-28 07:56:43,765] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,765] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,765] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 6: [2022-11-28 07:56:43,768] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,768] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,768] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 0: [2022-11-28 07:56:43,768] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,768] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,768] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 0: [2022-11-28 07:56:43,787] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,787] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,787] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
0: [2022-11-28 07:56:43,790] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,790] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,791] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 6: [2022-11-28 07:56:43,801] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,802] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,802] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 0: [2022-11-28 07:56:43,816] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,817] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,817] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 0: [2022-11-28 07:56:43,817] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-28 07:56:43,818] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:43,818] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
6: [2022-11-28 07:56:43,821] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,821] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,821] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 6: [2022-11-28 07:56:43,822] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,823] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,823] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 6: [2022-11-28 07:56:43,833] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,833] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,833] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,833] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,833] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 6: [2022-11-28 07:56:43,833] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
6: [2022-11-28 07:56:43,836] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,836] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,836] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 6: [2022-11-28 07:56:43,869] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-28 07:56:43,869] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-28 07:56:43,869] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,905] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,905] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,905] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,906] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,906] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,906] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
2: [2022-11-28 07:56:43,906] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,907] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,907] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,907] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,907] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,907] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:43,918] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 4: [2022-11-28 07:56:43,918] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-28 07:56:43,918] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 
4: [2022-11-28 07:56:43,918] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:43,918] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:43,918] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:43,918] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:43,918] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:43,918] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:43,921] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-28 07:56:43,921] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:43,921] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,926] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,926] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,926] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 1: [2022-11-28 07:56:43,926] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,927] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,927] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 1: [2022-11-28 07:56:43,927] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,927] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,927] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
1: [2022-11-28 07:56:43,947] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,947] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,947] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 1: [2022-11-28 07:56:43,947] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,947] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,948] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 1: [2022-11-28 07:56:43,948] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-28 07:56:43,948] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-28 07:56:43,948] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,983] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 
2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,983] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,983] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 2: [2022-11-28 07:56:43,983] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-28 07:56:43,983] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 0: [2022-11-28 07:56:44,074] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt 0: [2022-11-28 07:56:44,074] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 
5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 5: [2022-11-28 07:56:44,095] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,095] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,095] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,095] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,095] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,096] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 
5: [2022-11-28 07:56:44,096] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,096] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,111] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-28 07:56:44,111] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,111] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,111] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-28 07:56:44,111] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,111] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 5: [2022-11-28 07:56:44,111] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-28 07:56:44,111] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-28 07:56:44,111] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:44,135] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 
4: [2022-11-28 07:56:44,135] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:44,135] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:44,135] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-28 07:56:44,135] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:44,135] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:44,136] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-28 07:56:44,136] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:44,136] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 4: [2022-11-28 07:56:44,136] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 4: [2022-11-28 07:56:44,136] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-28 07:56:44,136] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 
3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 
3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 3: [2022-11-28 07:56:44,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 7: [2022-11-28 07:56:44,219] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,219] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,219] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 
7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44000/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now! 
7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now!
7: [2022-11-28 07:56:44,220] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44000 is ready now!
0: successfully saved checkpoint at iteration 44000 to checkpoints_2b2
7: time (ms) | save-checkpoint: 5801.81
7: iteration 44010/ 44073 | consumed samples: 22533120 | consumed tokens: 46147829760 | elapsed time per iteration (s): 4.89 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.905220E+00 | grad norm: 0.128 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 104.703 | TFLOPs: 48.80 |
7: iteration 44020/ 44073 | consumed samples: 22538240 | consumed tokens: 46158315520 | elapsed time per iteration (s): 4.22 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.902008E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.199 | TFLOPs: 56.48 |
7: iteration 44030/ 44073 | consumed samples: 22543360 | consumed tokens: 46168801280 | elapsed time per iteration (s): 4.24 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.910429E+00 | grad norm: 0.124 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.861 | TFLOPs: 56.33 |
7: iteration 44040/ 44073 | consumed samples: 22548480 | consumed tokens: 46179287040 | elapsed time per iteration (s): 4.23 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.915253E+00 | grad norm: 0.126 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 121.056 | TFLOPs: 56.42 |
7: iteration 44050/ 44073 | consumed samples: 22553600 | consumed tokens: 46189772800 | elapsed time per iteration (s): 4.24 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.910248E+00 | grad norm: 0.136 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.622 | TFLOPs: 56.22 |
7: iteration 44060/ 44073 | consumed samples: 22558720 | consumed tokens: 46200258560 | elapsed time per iteration (s): 4.24 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.903738E+00 | grad norm: 0.132 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.689 | TFLOPs: 56.25 |
7: iteration 44070/ 44073 | consumed samples: 22563840 | consumed tokens: 46210744320 | elapsed time per iteration (s): 4.26 | learning rate: 2.000E-05 | global batch size: 512 | lm loss: 1.907129E+00 | grad norm: 0.130 | num zeros: 0.0 | number of skipped iterations: 0 | number of nan iterations: 0 | samples per second: 120.163 | TFLOPs: 56.00 |
0: [after training is done] datetime: 2022-11-28 08:01:53
0: saving checkpoint at iteration 44073 to checkpoints_2b2
7: ------------------------------------------------------------------------------------------------------------
7: valid loss at the end of training for val data | lm loss value: 1.811146E+00 | lm loss PPL: 6.117451E+00 |
7: ------------------------------------------------------------------------------------------------------------
0: [2022-11-28 08:01:54,260] [INFO] [logging.py:68:log_dist] [Rank 0] [Torch] Checkpoint global_step44073 is begin to save!
0: [2022-11-28 08:01:54,264] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_01-model_00-model_states.pt...
0: [2022-11-28 08:01:54,521] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_01-model_00-model_states.pt.
0: [2022-11-28 08:01:54,521] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_03-model_00-model_states.pt...
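Note on the derived fields in the iteration and validation records above: the logged "consumed tokens" value appears to be "consumed samples" multiplied by the sequence length (2048, from --seq-length in the launch command), and the logged "lm loss PPL" appears to be exp(lm loss value). The sketch below checks both against values copied from this log. The regular expression and the helper name parse_iteration are illustrative assumptions, not part of Megatron-DeepSpeed.

```python
import math
import re

# Sequence length taken from --seq-length in the launch command of this run.
SEQ_LENGTH = 2048

# Hypothetical pattern for the start of a Megatron-style iteration record.
RECORD = re.compile(
    r"iteration\s+(?P<it>\d+)/\s*(?P<total>\d+)\s*\|"
    r"\s*consumed samples:\s*(?P<samples>\d+)\s*\|"
    r"\s*consumed tokens:\s*(?P<tokens>\d+)\s*\|"
)


def parse_iteration(line):
    """Return the integer fields of an iteration record, or None if no match."""
    match = RECORD.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


# Prefix of the iteration-44010 record, copied from the log.
line = (
    "7: iteration 44010/ 44073 | consumed samples: 22533120 | "
    "consumed tokens: 46147829760 | elapsed time per iteration (s): 4.89 |"
)
rec = parse_iteration(line)

# Tokens should equal samples times sequence length.
tokens_ok = rec["tokens"] == rec["samples"] * SEQ_LENGTH

# Perplexity should equal exp(loss); loss value copied from the validation record.
ppl = math.exp(1.811146)

print(rec, tokens_ok, round(ppl, 4))
```

The same parse can be applied to every iteration record to flag any line where the token count and sample count disagree.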
0: [2022-11-28 08:01:54,650] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_03-model_00-model_states.pt. 0: [2022-11-28 08:01:54,650] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_04-model_00-model_states.pt... 0: [2022-11-28 08:01:54,786] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_04-model_00-model_states.pt. 0: [2022-11-28 08:01:54,786] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_05-model_00-model_states.pt... 0: [2022-11-28 08:01:54,911] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_05-model_00-model_states.pt. 0: [2022-11-28 08:01:54,911] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_06-model_00-model_states.pt... 0: [2022-11-28 08:01:55,039] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_06-model_00-model_states.pt. 0: [2022-11-28 08:01:55,039] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_07-model_00-model_states.pt... 0: [2022-11-28 08:01:55,162] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_07-model_00-model_states.pt. 0: [2022-11-28 08:01:55,162] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_08-model_00-model_states.pt... 0: [2022-11-28 08:01:55,286] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_08-model_00-model_states.pt. 0: [2022-11-28 08:01:55,286] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_09-model_00-model_states.pt... 
0: [2022-11-28 08:01:55,411] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_09-model_00-model_states.pt. 0: [2022-11-28 08:01:55,411] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_10-model_00-model_states.pt... 0: [2022-11-28 08:01:55,536] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_10-model_00-model_states.pt. 0: [2022-11-28 08:01:55,536] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_11-model_00-model_states.pt... 0: [2022-11-28 08:01:55,661] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_11-model_00-model_states.pt. 0: [2022-11-28 08:01:55,662] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_12-model_00-model_states.pt... 0: [2022-11-28 08:01:55,786] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_12-model_00-model_states.pt. 0: [2022-11-28 08:01:55,787] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_13-model_00-model_states.pt... 0: [2022-11-28 08:01:55,915] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_13-model_00-model_states.pt. 0: [2022-11-28 08:01:55,915] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_14-model_00-model_states.pt... 0: [2022-11-28 08:01:56,043] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_14-model_00-model_states.pt. 0: [2022-11-28 08:01:56,044] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_15-model_00-model_states.pt... 
0: [2022-11-28 08:01:56,166] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_15-model_00-model_states.pt. 0: [2022-11-28 08:01:56,166] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_16-model_00-model_states.pt... 0: [2022-11-28 08:01:56,292] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_16-model_00-model_states.pt. 0: [2022-11-28 08:01:56,292] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_17-model_00-model_states.pt... 0: [2022-11-28 08:01:56,416] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_17-model_00-model_states.pt. 0: [2022-11-28 08:01:56,417] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_18-model_00-model_states.pt... 0: [2022-11-28 08:01:56,542] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_18-model_00-model_states.pt. 0: [2022-11-28 08:01:56,543] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_19-model_00-model_states.pt... 0: [2022-11-28 08:01:56,666] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_19-model_00-model_states.pt. 0: [2022-11-28 08:01:56,666] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_20-model_00-model_states.pt... 0: [2022-11-28 08:01:56,790] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_20-model_00-model_states.pt. 0: [2022-11-28 08:01:56,791] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_21-model_00-model_states.pt... 
0: [2022-11-28 08:01:56,917] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_21-model_00-model_states.pt. 0: [2022-11-28 08:01:56,917] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_22-model_00-model_states.pt... 0: [2022-11-28 08:01:57,044] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_22-model_00-model_states.pt. 0: [2022-11-28 08:01:57,045] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_23-model_00-model_states.pt... 0: [2022-11-28 08:01:57,172] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_23-model_00-model_states.pt. 0: [2022-11-28 08:01:57,172] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_24-model_00-model_states.pt... 0: [2022-11-28 08:01:57,300] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_24-model_00-model_states.pt. 0: [2022-11-28 08:01:57,300] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_25-model_00-model_states.pt... 0: [2022-11-28 08:01:57,425] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_25-model_00-model_states.pt. 0: [2022-11-28 08:01:57,425] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_26-model_00-model_states.pt... 0: [2022-11-28 08:01:57,551] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_26-model_00-model_states.pt. 0: [2022-11-28 08:01:57,551] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_27-model_00-model_states.pt... 
0: [2022-11-28 08:01:57,676] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_27-model_00-model_states.pt. 0: [2022-11-28 08:01:57,676] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_28-model_00-model_states.pt... 0: [2022-11-28 08:01:57,801] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_28-model_00-model_states.pt. 0: [2022-11-28 08:01:57,801] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_29-model_00-model_states.pt... 0: [2022-11-28 08:01:57,927] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_29-model_00-model_states.pt. 0: [2022-11-28 08:01:57,927] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_30-model_00-model_states.pt... 0: [2022-11-28 08:01:58,053] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_30-model_00-model_states.pt. 0: [2022-11-28 08:01:58,053] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_31-model_00-model_states.pt... 0: [2022-11-28 08:01:58,176] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_31-model_00-model_states.pt. 0: [2022-11-28 08:01:58,177] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_32-model_00-model_states.pt... 0: [2022-11-28 08:01:58,301] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_32-model_00-model_states.pt. 0: [2022-11-28 08:01:58,302] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_33-model_00-model_states.pt... 
0: [2022-11-28 08:01:58,427] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_33-model_00-model_states.pt. 0: [2022-11-28 08:01:58,428] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_34-model_00-model_states.pt... 0: [2022-11-28 08:01:58,552] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_34-model_00-model_states.pt. 0: [2022-11-28 08:01:58,552] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/layer_36-model_00-model_states.pt... 0: [2022-11-28 08:01:58,555] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/layer_36-model_00-model_states.pt. 0: [2022-11-28 08:01:58,557] [INFO] [logging.py:68:log_dist] [Rank 0] Saving model checkpoint: checkpoints_2b2/global_step44073/mp_rank_00_model_states.pt 0: [2022-11-28 08:01:58,557] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/mp_rank_00_model_states.pt... 0: [2022-11-28 08:01:58,563] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/mp_rank_00_model_states.pt. 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt... 
0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt... 4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt... 4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt... 4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt... 4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt... 4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt... 4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt... 
6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt... 
1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt... 
5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt... 
4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt... 6: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt... 3: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt... 5: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt... 
4: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt... 1: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt... 2: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt... 7: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:58,583] [INFO] [torch_checkpoint_engine.py:15:save] [Torch] Saving checkpoints_2b2/global_step44073/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt... 0: [2022-11-28 08:01:59,087] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt. 0: [2022-11-28 08:01:59,087] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,087] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 7: [2022-11-28 08:01:59,090] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt. 7: [2022-11-28 08:01:59,090] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_62_mp_rank_00_optim_states.pt 7: [2022-11-28 08:01:59,090] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,096] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt. 
4: [2022-11-28 08:01:59,096] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_38_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,096] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 7: [2022-11-28 08:01:59,102] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt. 7: [2022-11-28 08:01:59,102] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_59_mp_rank_00_optim_states.pt 7: [2022-11-28 08:01:59,102] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 0: [2022-11-28 08:01:59,112] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt. 0: [2022-11-28 08:01:59,112] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,112] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,116] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt. 1: [2022-11-28 08:01:59,116] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_10_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,116] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,117] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt. 
5: [2022-11-28 08:01:59,117] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_47_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,117] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,117] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,117] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_54_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,118] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 2: [2022-11-28 08:01:59,118] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt. 2: [2022-11-28 08:01:59,118] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_18_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,118] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,119] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt. 1: [2022-11-28 08:01:59,119] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_9_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,119] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,120] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt. 
4: [2022-11-28 08:01:59,120] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_32_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,120] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 0: [2022-11-28 08:01:59,120] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt. 0: [2022-11-28 08:01:59,120] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,120] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,120] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,121] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_52_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,121] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,124] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,124] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_44_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,124] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 7: [2022-11-28 08:01:59,126] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt. 
7: [2022-11-28 08:01:59,126] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_60_mp_rank_00_optim_states.pt 7: [2022-11-28 08:01:59,126] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,127] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,127] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_40_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,127] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,128] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,128] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_46_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,128] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,128] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt. 4: [2022-11-28 08:01:59,128] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_36_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,128] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,131] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt. 
5: [2022-11-28 08:01:59,131] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_41_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,132] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,133] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt. 1: [2022-11-28 08:01:59,133] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_11_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,133] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,134] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt. 0: [2022-11-28 08:01:59,134] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt. 0: [2022-11-28 08:01:59,134] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,134] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,136] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt. 3: [2022-11-28 08:01:59,136] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_29_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,136] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 
3: [2022-11-28 08:01:59,136] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt. 3: [2022-11-28 08:01:59,136] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt. 3: [2022-11-28 08:01:59,136] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_27_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,136] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_26_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,136] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,136] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,140] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt. 4: [2022-11-28 08:01:59,140] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_33_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,140] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 0: [2022-11-28 08:01:59,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt. 0: [2022-11-28 08:01:59,142] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt. 
0: [2022-11-28 08:01:59,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 0: [2022-11-28 08:01:59,142] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,142] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,143] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt. 4: [2022-11-28 08:01:59,143] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_35_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,143] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,134] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_14_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,134] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,145] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,149] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt. 
5: [2022-11-28 08:01:59,150] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_45_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,150] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,151] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,151] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_51_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,151] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,151] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt. 3: [2022-11-28 08:01:59,152] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_31_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,152] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,152] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt. 7: [2022-11-28 08:01:59,152] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt. 
7: [2022-11-28 08:01:59,152] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_57_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,152] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_25_mp_rank_00_optim_states.pt 7: [2022-11-28 08:01:59,152] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,152] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,145] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_13_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,146] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,147] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt. 1: [2022-11-28 08:01:59,147] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_15_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,147] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,153] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,153] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_42_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,153] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt. 
2: [2022-11-28 08:01:59,153] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_21_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,153] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 5: [2022-11-28 08:01:59,153] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,155] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,155] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_50_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,155] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,155] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,156] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt. 5: [2022-11-28 08:01:59,156] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_43_mp_rank_00_optim_states.pt 5: [2022-11-28 08:01:59,156] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 2: [2022-11-28 08:01:59,157] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt. 
2: [2022-11-28 08:01:59,157] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_19_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,157] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,158] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt. 4: [2022-11-28 08:01:59,158] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_37_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,159] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,160] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt. 4: [2022-11-28 08:01:59,160] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_39_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,160] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,155] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_12_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,155] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 1: [2022-11-28 08:01:59,157] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt. 
1: [2022-11-28 08:01:59,157] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_8_mp_rank_00_optim_states.pt 1: [2022-11-28 08:01:59,157] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 4: [2022-11-28 08:01:59,164] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt. 4: [2022-11-28 08:01:59,164] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_34_mp_rank_00_optim_states.pt 4: [2022-11-28 08:01:59,164] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,164] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,164] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_53_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,164] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,165] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,165] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_49_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,165] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 0: [2022-11-28 08:01:59,166] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt. 
0: [2022-11-28 08:01:59,166] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt 0: [2022-11-28 08:01:59,166] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 0: [2022-11-28 08:01:59,166] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt. 2: [2022-11-28 08:01:59,171] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt. 2: [2022-11-28 08:01:59,171] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_23_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,171] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 2: [2022-11-28 08:01:59,171] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt. 2: [2022-11-28 08:01:59,171] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_17_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,171] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 7: [2022-11-28 08:01:59,172] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt. 7: [2022-11-28 08:01:59,172] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_61_mp_rank_00_optim_states.pt 7: [2022-11-28 08:01:59,172] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 
2: [2022-11-28 08:01:59,178] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt. 2: [2022-11-28 08:01:59,179] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_20_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,179] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,193] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,194] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_55_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,193] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,194] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,194] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_30_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,194] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,194] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt. 3: [2022-11-28 08:01:59,194] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt. 
3: [2022-11-28 08:01:59,194] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_28_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,194] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_24_mp_rank_00_optim_states.pt 3: [2022-11-28 08:01:59,194] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 3: [2022-11-28 08:01:59,194] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 2: [2022-11-28 08:01:59,196] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt. 2: [2022-11-28 08:01:59,196] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_22_mp_rank_00_optim_states.pt 2: [2022-11-28 08:01:59,196] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 6: [2022-11-28 08:01:59,202] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt. 6: [2022-11-28 08:01:59,203] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_48_mp_rank_00_optim_states.pt 6: [2022-11-28 08:01:59,203] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now! 2: [2022-11-28 08:01:59,208] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt. 
2: [2022-11-28 08:01:59,208] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_16_mp_rank_00_optim_states.pt
2: [2022-11-28 08:01:59,208] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now!
0: [2022-11-28 08:01:59,233] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
0: [2022-11-28 08:01:59,233] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now!
7: [2022-11-28 08:01:59,249] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt.
7: [2022-11-28 08:01:59,249] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_56_mp_rank_00_optim_states.pt
7: [2022-11-28 08:01:59,249] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now!
7: [2022-11-28 08:01:59,249] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt.
7: [2022-11-28 08:01:59,249] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_63_mp_rank_00_optim_states.pt
7: [2022-11-28 08:01:59,250] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now!
7: [2022-11-28 08:01:59,249] [INFO] [torch_checkpoint_engine.py:17:save] [Torch] Saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt.
7: [2022-11-28 08:01:59,250] [INFO] [engine.py:3213:_save_zero_checkpoint] bf16_zero checkpoint saved checkpoints_2b2/global_step44073/bf16_zero_pp_rank_58_mp_rank_00_optim_states.pt
7: [2022-11-28 08:01:59,250] [INFO] [torch_checkpoint_engine.py:27:commit] [Torch] Checkpoint global_step44073 is ready now!
0: successfully saved checkpoint at iteration 44073 to checkpoints_2b2
7: ------------------------------------------------------------------------------------------------------------
7: test loss at the end of training for test data | lm loss value: 1.927669E+00 | lm loss PPL: 6.873471E+00 |
7: ------------------------------------------------------------------------------------------------------------
END 2076210: Mon Nov 28 08:02:18 EET 2022