2020-06-11 20:32:12,836 - crisis_transformers.trainer - INFO - Use pytorch device: cuda, with gpu_number=2
2020-06-11 20:32:14,855 - crisis_transformers.trainer - INFO - Warmup-steps: 55716
2020-06-11 20:32:14,856 - crisis_transformers.trainer - INFO - ***** Running training *****
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Num of training examples (actually iterations per epoch for Iterable Dataset) = 69642
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Steps per Epoch = 17411
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Num of Epochs = 16
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Best score (perplexity) = -inf
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Eval every 400 steps
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Early stop = 20
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Total optimization steps = 278576
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2, so the input batch size = 4
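The header numbers are internally consistent. A minimal sketch that reproduces them (variable names are mine, and the derivation rule for steps per epoch is inferred from the figures, not stated in the log):

```python
import math

# Figures from the trainer header above.
per_gpu_batch = 2
n_gpu = 2
train_iters_per_epoch = 69642   # "Num of training examples (actually iterations...)"
num_epochs = 16

input_batch = per_gpu_batch * n_gpu                                # 4
steps_per_epoch = math.ceil(train_iters_per_epoch / input_batch)   # 17411
total_steps = steps_per_epoch * num_epochs                         # 278576

assert (input_batch, steps_per_epoch, total_steps) == (4, 17411, 278576)
# Warmup-steps 55716 is ~20% of 278576 (0.2 * 278576 = 55715.2), which
# suggests a 20% warmup ratio, though the exact rule is not in the log.
```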
From here the trainer evaluates on the development set every 400 steps and prints a full report each time. The repeated per-report fields never change in this excerpt: Output path (short) tmp/gpt2_medium_for_source_code_code_generate; early stop on perplexity with patience 20; eval every 400 steps; Gradient Accumulation steps = 1; 69642 training and 7738 development iterations per epoch; Epoch = 1/16; input batch size 4 (2 per GPU on 2 GPUs). Only the varying fields are tabulated below. "Saved" means the evaluation improved the best score, so the trainer logged "Save model to tmp/gpt2_medium_for_source_code_code_generate" and "Save check-point at epoch=0 step=N"; Δt is the logged "Time spent since last evaluation".

step | Δt    | dev_loss  | dev perplexity   | train_loss | early stop | checkpoint
 400 | 7m13s | 32.203194 | 96754087231488.0 | 34.870220  | 0/20       | saved
 800 | 7m16s |  9.379961 | 11848.553711     | 22.251518  | 0/20       | saved
1200 | 7m15s |  4.410483 | 82.309235        | 15.844203  | 0/20       | saved
1600 | 7m11s |  4.528600 | 92.628769        | 12.387160  | 1/20       | not saved

At step 1600 the dev perplexity rose from 82.31 to 92.63, so the best score stayed at -82.309235, the early-stop counter ticked to 1/20, and no checkpoint was written.
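One relation worth making explicit: in every report the perplexity is exactly exp(dev_loss), which also explains the absurd-looking first value. A quick check:

```python
import math

# perplexity == exp(dev_loss) holds for every report above, e.g.:
print(math.exp(4.410483))    # ~82.309     (step 1200: perplexity 82.309235)
print(math.exp(32.203194))   # ~9.6754e13  (step 400:  perplexity 96754087231488.0)
```

So the step-400 perplexity of ~9.7e13 is just an initial dev loss of 32.2 nats per token pushed through the exponential; it collapses as soon as warmup brings the loss into single digits.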
Evaluations at steps 2000-3200:

step | Δt    | dev_loss | dev perplexity | train_loss | early stop | checkpoint
2000 | 7m14s | 2.607760 | 13.568625      | 10.256035  | 0/20       | saved
2400 | 7m15s | 2.179520 |  8.842060      |  8.810673  | 0/20       | saved
2800 | 7m16s | 1.835085 |  6.265664      |  7.770569  | 0/20       | saved
3200 | 7m12s | 1.864289 |  6.451349      |  6.985173  | 1/20       | not saved

Step 3200 is the second regression: perplexity moved from 6.265664 back up to 6.451349, the counter ticked to 1/20 again, and the step-2800 checkpoint was kept. The counter drops back to 0/20 as soon as an evaluation improves on the best score, which the "Best score (perplexity)" field tracks as a negated perplexity (it starts at -inf and climbs toward zero).
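That negation is the tell: the trainer appears to flip "lower is better" metrics so a single "higher is better" comparison drives both checkpointing and patience. A sketch under that assumption (names are illustrative, not the trainer's API):

```python
class EarlyStopper:
    """Tracks a negated 'lower is better' metric; mirrors the logged fields."""

    def __init__(self, patience=20):
        self.best_score = float("-inf")   # "Best score (perplexity) = -inf"
        self.count = 0                    # "Early stop count = 0/20"
        self.patience = patience

    def update(self, perplexity):
        score = -perplexity               # negate so bigger is better
        if score > self.best_score:
            self.best_score = score
            self.count = 0
            return "save_checkpoint"      # "Save model to tmp/..."
        self.count += 1                   # e.g. steps 1600 and 3200 above
        return "stop" if self.count >= self.patience else "continue"
```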
Evaluations at steps 3600-6000:

step | Δt    | dev_loss | dev perplexity | train_loss | early stop | checkpoint
3600 | 7m15s | 1.505670 | 4.507174       | 6.368423   | 0/20       | saved
4000 | 7m15s | 1.397803 | 4.046299       | 5.872023   | 0/20       | saved
4400 | 7m15s | 1.314076 | 3.721312       | 5.462299   | 0/20       | saved
4800 | 7m16s | 1.283650 | 3.609790       | 5.118622   | 0/20       | saved
5200 | 7m15s | 1.278208 | 3.590199       | 4.828341   | 0/20       | saved
5600 | 7m16s | 1.222204 | 3.394660       | 4.576790   | 0/20       | saved
6000 | 7m11s | 1.227896 | 3.414039       | 4.356290   | 1/20       | not saved

The Δt column barely moves: every 400-step window, evaluation and checkpoint writing included, takes between 7m11s and 7m16s.
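That steady cadence makes the cost of the full schedule easy to estimate. Rough numbers, assuming the ~7m15s per 400-step window holds for the whole run:

```python
sec_per_window = 7 * 60 + 15       # ~7m15s per 400 steps, eval included
sec_per_step = sec_per_window / 400
total_steps = 278576

hours = total_steps * sec_per_step / 3600
print(f"{sec_per_step:.2f} s/step -> ~{hours:.0f} h for all 16 epochs")
# ~1.09 s/step -> ~84 h, i.e. about 3.5 days on this 2-GPU box,
# unless early stopping (patience 20) ends the run sooner.
```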
INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Steps = 6400/278576 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - dev_loss = 1.182766 || dev_eval_scores = {'perplexity': 3.263387680053711} 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - train_loss = 4.162957668304443 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.263387680053711 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Steps = 6800/278576 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - dev_loss = 1.188739 || dev_eval_scores = {'perplexity': 3.2829389572143555} 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - train_loss = 3.991642713546753 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:42:37,188 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=7200 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.1454687118530273 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - Steps = 7200/278576 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - dev_loss = 1.145963 || dev_eval_scores = {'perplexity': 3.1454687118530273} 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - train_loss = 3.8384640216827393 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:49:53,052 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=7600 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.0853919982910156 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Steps = 7600/278576 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - dev_loss = 1.126679 || dev_eval_scores = {'perplexity': 3.0853919982910156} 2020-06-11 22:49:56,892 - crisis_transformers.trainer - INFO - train_loss = 3.701063871383667 2020-06-11 22:49:56,892 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:57:08,463 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:57:12,524 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8000 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.04101824760437 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Steps = 8000/278576 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - dev_loss = 1.112192 || dev_eval_scores = {'perplexity': 3.04101824760437} 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - train_loss = 3.575878143310547 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:04:23,771 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8400 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.996488571166992 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Steps = 8400/278576 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - dev_loss = 1.097441 || dev_eval_scores = {'perplexity': 2.996488571166992} 2020-06-11 23:04:27,584 - crisis_transformers.trainer - INFO - train_loss = 3.4622840881347656 2020-06-11 23:04:27,584 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:11:39,153 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:11:43,019 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8800 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.9609262943267822 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Steps = 8800/278576 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - dev_loss = 1.085502 || dev_eval_scores = {'perplexity': 2.9609262943267822} 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - train_loss = 3.3580634593963623 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:18:54,591 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=9200 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.9230592250823975 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Steps = 9200/278576 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - dev_loss = 1.072631 || dev_eval_scores = {'perplexity': 2.9230592250823975} 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - train_loss = 3.2615966796875 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:26:10,441 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=9600 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.886868715286255 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Steps = 9600/278576 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - dev_loss = 1.060172 || dev_eval_scores = {'perplexity': 2.886868715286255} 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - train_loss = 3.1728854179382324 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:33:25,705 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10000 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.836120128631592 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Steps = 10000/278576 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - dev_loss = 1.042437 || dev_eval_scores = {'perplexity': 2.836120128631592} 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - train_loss = 3.091641664505005 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:40:41,837 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10400 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.8059732913970947 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Steps = 10400/278576 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - dev_loss = 1.031750 || dev_eval_scores = {'perplexity': 2.8059732913970947} 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - train_loss = 3.0152587890625 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:47:57,612 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10800 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.772104263305664 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Steps = 10800/278576 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - dev_loss = 1.019607 || dev_eval_scores = {'perplexity': 2.772104263305664} 2020-06-11 23:48:02,019 - crisis_transformers.trainer - INFO - train_loss = 2.9444949626922607 2020-06-11 23:48:02,019 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:55:13,323 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11200 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7492218017578125 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
2020-06-11 23:55:13,323 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11200
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7492218017578125
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Steps = 11200/278576
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - dev_loss = 1.011318 || dev_eval_scores = {'perplexity': 2.7492218017578125}
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - train_loss = 2.8781516551971436
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:02:29,325 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11600
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7139976024627686
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Steps = 11600/278576
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - dev_loss = 0.998423 || dev_eval_scores = {'perplexity': 2.7139976024627686}
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - train_loss = 2.816199541091919
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:09:45,484 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12000
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.68788480758667
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Steps = 12000/278576
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - dev_loss = 0.988755 || dev_eval_scores = {'perplexity': 2.68788480758667}
2020-06-12 00:09:49,445 - crisis_transformers.trainer - INFO - train_loss = 2.7574315071105957
2020-06-12 00:09:49,445 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:17:00,535 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12400
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.654526710510254
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Steps = 12400/278576
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - dev_loss = 0.976266 || dev_eval_scores = {'perplexity': 2.654526710510254}
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - train_loss = 2.7026596069335938
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:24:16,075 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12800
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.6282081604003906
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - Steps = 12800/278576
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - dev_loss = 0.966302 || dev_eval_scores = {'perplexity': 2.6282081604003906}
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - train_loss = 2.650250196456909
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - ********************************************
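Note that "Best score (perplexity)" is always the negative of the dev perplexity reported in the same block (-2.6282... against a dev perplexity of 2.6282...). Negating a lower-is-better metric is a common trick that lets early stopping uniformly maximize a score. A sketch of the bookkeeping this log is consistent with (class and method names are hypothetical, not this trainer's API):

class EarlyStopping:
    # Track a higher-is-better score; stop after `patience` evals without improvement.
    def __init__(self, patience=20):
        self.patience = patience
        self.best_score = float("-inf")  # before the first evaluation
        self.count = 0

    def step(self, perplexity):
        score = -perplexity  # lower perplexity => higher score
        if score > self.best_score:
            self.best_score = score
            self.count = 0   # matches the constant "Early stop count = 0/20" here
        else:
            self.count += 1
        return self.count >= self.patience  # True => stop training

Because dev perplexity improves at every one of these evaluations, the counter never leaves 0/20 and a new checkpoint is saved each time.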
2020-06-12 00:31:31,505 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=13200
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.6085095405578613
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Steps = 13200/278576
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - dev_loss = 0.958779 || dev_eval_scores = {'perplexity': 2.6085095405578613}
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - train_loss = 2.600867986679077
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:38:46,576 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:38:50,406 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=13600
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.572343587875366
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Steps = 13600/278576
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - dev_loss = 0.944817 || dev_eval_scores = {'perplexity': 2.572343587875366}
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - train_loss = 2.55434513092041
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:46:01,097 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14000
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.5420730113983154
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Steps = 14000/278576
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - dev_loss = 0.932980 || dev_eval_scores = {'perplexity': 2.5420730113983154}
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - train_loss = 2.509788751602173
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:53:16,943 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14400
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.508371114730835
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Steps = 14400/278576
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - dev_loss = 0.919634 || dev_eval_scores = {'perplexity': 2.508371114730835}
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - train_loss = 2.4678986072540283
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:00:31,540 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14800
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4876623153686523
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Steps = 14800/278576
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - dev_loss = 0.911343 || dev_eval_scores = {'perplexity': 2.4876623153686523}
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - train_loss = 2.428267002105713
2020-06-12 01:00:35,780 - crisis_transformers.trainer - INFO - ********************************************
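The fixed denominators in these reports follow from the logged sizes: with 69642 training iterations per epoch and an input batch size of 4 (2 per GPU x 2 GPUs, gradient accumulation 1), one epoch is ceil(69642 / 4) = 17411 optimizer steps, and 16 epochs give the 278576 in every "Steps = N/278576" line. A quick check, assuming the trainer rounds the last partial batch up (plain arithmetic, no trainer internals):

import math

iters_per_epoch = 69642          # "Num of training examples" line above
input_batch = 2 * 2              # per-GPU batch x n_gpu, as logged
steps_per_epoch = math.ceil(iters_per_epoch / input_batch)
total_steps = steps_per_epoch * 16
print(steps_per_epoch, total_steps)  # 17411 278576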
2020-06-12 01:07:47,187 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=15200
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.46309757232666
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Steps = 15200/278576
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - dev_loss = 0.901420 || dev_eval_scores = {'perplexity': 2.46309757232666}
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - train_loss = 2.3902571201324463
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:15:02,762 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=15600
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4311938285827637
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Steps = 15600/278576
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - dev_loss = 0.888382 || dev_eval_scores = {'perplexity': 2.4311938285827637}
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - train_loss = 2.354424238204956
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:22:18,392 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16000
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4099924564361572
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Steps = 16000/278576
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - dev_loss = 0.879624 || dev_eval_scores = {'perplexity': 2.4099924564361572}
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - train_loss = 2.319091796875
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:29:33,829 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:29:37,786 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16400
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3866090774536133
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Steps = 16400/278576
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - dev_loss = 0.869874 || dev_eval_scores = {'perplexity': 2.3866090774536133}
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - train_loss = 2.285768747329712
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:36:48,893 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16800
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3575546741485596
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Steps = 16800/278576
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - dev_loss = 0.857625 || dev_eval_scores = {'perplexity': 2.3575546741485596}
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - train_loss = 2.253655433654785
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:44:04,158 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=17200
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3273518085479736
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Steps = 17200/278576
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - dev_loss = 0.844731 || dev_eval_scores = {'perplexity': 2.3273518085479736}
2020-06-12 01:44:08,001 - crisis_transformers.trainer - INFO - train_loss = 2.22310733795166
2020-06-12 01:44:08,001 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:46:08,626 - crisis_transformers.trainer - INFO - epoch 1 ends, 15 epoches left
2020-06-12 01:46:08,628 - crisis_transformers.trainer - INFO - global_average_loss=2.207577705383301,global_steps=17411 on training set
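Two bookkeeping details at this epoch boundary are easy to misread. First, the "epoch=..." in checkpoint names counts from 0 while the "Epoch = 2/16" display counts from 1, and the per-epoch step counter restarts: "Save check-point at epoch=1 step=189" and "Steps = 17600/278576" describe the same moment, 17411 steps in the first epoch plus 189 steps into the second. Second, train_loss falls from ~2.22 to ~0.89 between the surrounding reports because it is a running average over the current epoch that has just been reset, not a sudden jump in model quality. The step arithmetic, as a quick check:

steps_per_epoch = 17411            # global_steps logged at the end of the first epoch
epoch, step_in_epoch = 1, 189      # "Save check-point at epoch=1 step=189"
assert epoch * steps_per_epoch + step_in_epoch == 17600  # "Steps = 17600/278576"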
2020-06-12 01:51:18,826 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=189
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.306220054626465
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Steps = 17600/278576
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - dev_loss = 0.835610 || dev_eval_scores = {'perplexity': 2.306220054626465}
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - train_loss = 0.8919339776039124
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:58:34,239 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=589
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.282672166824341
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Steps = 18000/278576
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - dev_loss = 0.825347 || dev_eval_scores = {'perplexity': 2.282672166824341}
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - train_loss = 0.9102271199226379
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:05:49,927 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:05:53,796 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=989
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.261589288711548
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Steps = 18400/278576
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - dev_loss = 0.816068 || dev_eval_scores = {'perplexity': 2.261589288711548}
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - train_loss = 0.9082818031311035
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:13:04,838 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=1389
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.2358696460723877
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Steps = 18800/278576
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - dev_loss = 0.804630 || dev_eval_scores = {'perplexity': 2.2358696460723877}
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - train_loss = 0.9062302708625793
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:20:19,901 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=1789
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.210675001144409
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Steps = 19200/278576
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - dev_loss = 0.793298 || dev_eval_scores = {'perplexity': 2.210675001144409}
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - train_loss = 0.8982809782028198
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - ********************************************
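The wall-clock cadence is also very stable: each 400-step evaluation interval takes about 7m15s, i.e. roughly 1.09 s per optimizer step, so the full 278576-step schedule would need on the order of 84 hours (about 3.5 days) on this 2-GPU setup if the early stop never fires. A back-of-the-envelope estimate:

seconds_per_eval = 7 * 60 + 15               # "Time spent since last evaluation"
sec_per_step = seconds_per_eval / 400        # eval every 400 steps => ~1.09 s/step
est_hours = 278576 * sec_per_step / 3600     # ~84 h, about 3.5 days
print(round(sec_per_step, 3), round(est_hours))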
2020-06-12 02:27:35,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2189
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.1900641918182373
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Steps = 19600/278576
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - dev_loss = 0.783931 || dev_eval_scores = {'perplexity': 2.1900641918182373}
2020-06-12 02:27:39,230 - crisis_transformers.trainer - INFO - train_loss = 0.8898405432701111
2020-06-12 02:27:39,230 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:34:50,458 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2589
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.169043779373169
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Steps = 20000/278576
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - dev_loss = 0.774286 || dev_eval_scores = {'perplexity': 2.169043779373169}
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - train_loss = 0.8851494789123535
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:42:06,145 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2989
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.140289783477783
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Steps = 20400/278576
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - dev_loss = 0.760941 || dev_eval_scores = {'perplexity': 2.140289783477783}
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - train_loss = 0.8793467283248901
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:49:21,789 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=3389
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.121040105819702
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Steps = 20800/278576
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - dev_loss = 0.751907 || dev_eval_scores = {'perplexity': 2.121040105819702}
2020-06-12 02:49:25,275 - crisis_transformers.trainer - INFO - train_loss = 0.871684730052948
2020-06-12 02:49:25,275 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:56:37,107 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:56:41,332 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=3789
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0992023944854736
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Steps = 21200/278576
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - dev_loss = 0.741557 || dev_eval_scores = {'perplexity': 2.0992023944854736}
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - train_loss = 0.8670705556869507
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:03:52,858 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4189
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0718014240264893
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Steps = 21600/278576
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - dev_loss = 0.728418 || dev_eval_scores = {'perplexity': 2.0718014240264893}
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - train_loss = 0.8624909520149231
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - ********************************************
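To turn a log like this into a learning curve, the evaluation reports can be scraped with a couple of regular expressions; a minimal sketch that assumes exactly the record format shown here (one record per line; the file name is hypothetical):

import re

STEPS = re.compile(r"Steps = (\d+)/\d+")
DEV = re.compile(r"dev_loss = ([\d.]+) \|\| dev_eval_scores = \{'perplexity': ([\d.]+)\}")

def eval_points(path):
    # Yield (global_step, dev_loss, dev_perplexity) per evaluation report.
    step = None
    with open(path) as f:
        for line in f:
            m = STEPS.search(line)
            if m:
                step = int(m.group(1))
                continue
            m = DEV.search(line)
            if m and step is not None:
                yield step, float(m.group(1)), float(m.group(2))

for point in eval_points("gpt2_medium_training.log"):
    print(point)  # e.g. (21600, 0.728418, 2.0718014240264893)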
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Epoch = 2/16 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Steps = 22000/278576 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - dev_loss = 0.719342 || dev_eval_scores = {'perplexity': 2.0530827045440674} 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - train_loss = 0.8574584126472473 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 03:18:23,725 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:18:27,499 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4989 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0336155891418457 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Epoch = 2/16 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Steps = 22400/278576 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - dev_loss = 0.709815 || dev_eval_scores = {'perplexity': 2.0336155891418457} 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - train_loss = 0.8504697680473328 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 03:25:39,057 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=5389 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0098047256469727 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Steps = 22800/278576
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - dev_loss = 0.698038 || dev_eval_scores = {'perplexity': 2.0098047256469727}
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - train_loss = 0.8460281491279602
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:32:54,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:32:58,726 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=5789
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9868273735046387
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Steps = 23200/278576
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - dev_loss = 0.686539 || dev_eval_scores = {'perplexity': 1.9868273735046387}
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - train_loss = 0.8407670855522156
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:40:09,514 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6189
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9708765745162964
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Steps = 23600/278576
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - dev_loss = 0.678478 || dev_eval_scores = {'perplexity': 1.9708765745162964}
2020-06-12 03:40:13,426 - crisis_transformers.trainer - INFO - train_loss = 0.8364319205284119
2020-06-12 03:40:13,426 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:47:25,312 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6589
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.950257420539856
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Steps = 24000/278576
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - dev_loss = 0.667961 || dev_eval_scores = {'perplexity': 1.950257420539856}
2020-06-12 03:47:28,666 - crisis_transformers.trainer - INFO - train_loss = 0.8307074308395386
2020-06-12 03:47:28,666 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:54:40,261 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:54:44,220 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6989
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9251375198364258
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Steps = 24400/278576
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - dev_loss = 0.654997 || dev_eval_scores = {'perplexity': 1.9251375198364258}
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - train_loss = 0.8253822922706604
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:01:55,402 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:01:59,672 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=7389
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9091284275054932
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Steps = 24800/278576
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - dev_loss = 0.646647 || dev_eval_scores = {'perplexity': 1.9091284275054932}
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - train_loss = 0.8205176591873169
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:09:10,549 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=7789
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8884336948394775
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
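Throughout this stretch "Early stop count" stays at 0/20 because the dev perplexity improves at every 400-step evaluation, so each evaluation saves a new checkpoint and resets the counter; training would only halt early after 20 consecutive evaluations without improvement. A rough sketch of that bookkeeping (illustrative pseudocode for the pattern visible in the log, not the trainer's actual implementation):

patience = 20                 # "Early stop = 20"
best_score = float("-inf")    # "Best score (perplexity) = -inf" before the first eval
early_stop_count = 0

def on_evaluation(dev_perplexity):
    """Called every 400 optimization steps."""
    global best_score, early_stop_count
    score = -dev_perplexity                 # negate: higher score == lower perplexity
    if score > best_score:
        best_score = score                  # new best: save model + checkpoint
        early_stop_count = 0                # reset "Early stop count = 0/20"
    else:
        early_stop_count += 1               # no improvement this interval
    return early_stop_count >= patience     # True -> stop training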
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Steps = 25200/278576
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - dev_loss = 0.635748 || dev_eval_scores = {'perplexity': 1.8884336948394775}
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - train_loss = 0.8162734508514404
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:16:26,145 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8189
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8698856830596924
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Steps = 25600/278576
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - dev_loss = 0.625877 || dev_eval_scores = {'perplexity': 1.8698856830596924}
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - train_loss = 0.811458170413971
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:23:41,295 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8589
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8487507104873657
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Steps = 26000/278576
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - dev_loss = 0.614510 || dev_eval_scores = {'perplexity': 1.8487507104873657}
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - train_loss = 0.8066449761390686
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:30:56,930 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8989
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8316272497177124
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Steps = 26400/278576
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - dev_loss = 0.605205 || dev_eval_scores = {'perplexity': 1.8316272497177124}
2020-06-12 04:31:01,240 - crisis_transformers.trainer - INFO - train_loss = 0.802518904209137
2020-06-12 04:31:01,240 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:38:12,636 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=9389
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8143984079360962
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Steps = 26800/278576
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - dev_loss = 0.595754 || dev_eval_scores = {'perplexity': 1.8143984079360962}
2020-06-12 04:38:16,326 - crisis_transformers.trainer - INFO - train_loss = 0.7970281839370728
2020-06-12 04:38:16,326 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:45:27,487 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=9789
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7961244583129883
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
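With no gradient accumulation and a batch of 2 per GPU on 2 GPUs, every optimization step consumes 4 sequences, so each 400-step interval here processes 1600 training sequences in roughly 7m15s. A back-of-the-envelope throughput check (numbers read off the log; the wall-clock interval also includes the dev pass and checkpointing):

per_gpu_batch, n_gpu, grad_accum = 2, 2, 1
effective_batch = per_gpu_batch * n_gpu * grad_accum   # 4, as logged
seqs_per_interval = 400 * effective_batch              # 1600 sequences per interval
interval_secs = 7 * 60 + 15                            # ~435 s between evaluations
print(seqs_per_interval / interval_secs)               # ~3.7 training sequences/s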
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Steps = 27200/278576
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - dev_loss = 0.585631 || dev_eval_scores = {'perplexity': 1.7961244583129883}
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - train_loss = 0.7928464412689209
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:52:42,555 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10189
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7843670845031738
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - Steps = 27600/278576
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - dev_loss = 0.579064 || dev_eval_scores = {'perplexity': 1.7843670845031738}
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - train_loss = 0.7886567711830139
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:59:57,976 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10589
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.764768123626709
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Steps = 28000/278576
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - dev_loss = 0.568019 || dev_eval_scores = {'perplexity': 1.764768123626709}
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - train_loss = 0.7840659618377686
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:07:12,938 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10989
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7483080625534058
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Steps = 28400/278576
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - dev_loss = 0.558648 || dev_eval_scores = {'perplexity': 1.7483080625534058}
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - train_loss = 0.7798082828521729
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:14:27,901 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=11389
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7391985654830933
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Steps = 28800/278576
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - dev_loss = 0.553424 || dev_eval_scores = {'perplexity': 1.7391985654830933}
2020-06-12 05:14:32,043 - crisis_transformers.trainer - INFO - train_loss = 0.7753340601921082
2020-06-12 05:14:32,043 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:21:43,000 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=11789
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7198398113250732
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Steps = 29200/278576
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - dev_loss = 0.542231 || dev_eval_scores = {'perplexity': 1.7198398113250732}
2020-06-12 05:21:46,791 - crisis_transformers.trainer - INFO - train_loss = 0.7709231972694397
2020-06-12 05:21:46,791 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:28:58,330 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:29:02,113 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12189
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6953392028808594
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Steps = 29600/278576
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - dev_loss = 0.527883 || dev_eval_scores = {'perplexity': 1.6953392028808594}
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - train_loss = 0.7663020491600037
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:36:13,699 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12589
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.684487223625183
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Steps = 30000/278576
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - dev_loss = 0.521461 || dev_eval_scores = {'perplexity': 1.684487223625183}
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - train_loss = 0.7620025277137756
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:43:29,168 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12989
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6656123399734497
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
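The checkpoint names use a 0-based epoch index and an epoch-local step, while the "Steps" line is global: with 17411 optimization steps per epoch, the checkpoint at epoch=1 step=12989 above sits at global step 1 * 17411 + 12989 = 30400, exactly the "Steps = 30400/278576" in the same report. A quick self-check with the constants from the log:

steps_per_epoch = 17411                   # iterations per epoch, from the log
total_steps = steps_per_epoch * 16        # 16 epochs -> 278576 optimization steps

def to_global_step(epoch_idx, step_in_epoch):
    # Checkpoints are tagged with a 0-based epoch and an epoch-local step.
    return epoch_idx * steps_per_epoch + step_in_epoch

assert to_global_step(1, 12989) == 30400  # matches "Steps = 30400/278576"
assert total_steps == 278576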
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - Steps = 30400/278576
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - dev_loss = 0.510193 || dev_eval_scores = {'perplexity': 1.6656123399734497}
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - train_loss = 0.7573661208152771
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:50:44,231 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=13389
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.651535987854004
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Steps = 30800/278576
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - dev_loss = 0.501706 || dev_eval_scores = {'perplexity': 1.651535987854004}
2020-06-12 05:50:48,565 - crisis_transformers.trainer - INFO - train_loss = 0.7526668906211853
2020-06-12 05:50:48,565 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:57:59,864 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:58:03,697 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=13789
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6361221075057983
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Steps = 31200/278576
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - dev_loss = 0.492329 || dev_eval_scores = {'perplexity': 1.6361221075057983}
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - train_loss = 0.7481159567832947
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:05:14,807 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:05:18,556 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14189
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6220632791519165
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Steps = 31600/278576
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - dev_loss = 0.483699 || dev_eval_scores = {'perplexity': 1.6220632791519165}
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - train_loss = 0.7438127994537354
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:12:30,611 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14589
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.610316276550293
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Steps = 32000/278576
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - dev_loss = 0.476431 || dev_eval_scores = {'perplexity': 1.610316276550293}
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - train_loss = 0.7390771508216858
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:19:45,562 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14989
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5992158651351929
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Steps = 32400/278576
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - dev_loss = 0.469513 || dev_eval_scores = {'perplexity': 1.5992158651351929}
2020-06-12 06:19:49,381 - crisis_transformers.trainer - INFO - train_loss = 0.7345605492591858
2020-06-12 06:19:49,381 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:27:00,549 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=15389
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.581007480621338
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Steps = 32800/278576
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - dev_loss = 0.458062 || dev_eval_scores = {'perplexity': 1.581007480621338}
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - train_loss = 0.7301263809204102
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:34:16,032 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=15789
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.571539044380188
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Steps = 33200/278576
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - dev_loss = 0.452055 || dev_eval_scores = {'perplexity': 1.571539044380188}
2020-06-12 06:34:19,880 - crisis_transformers.trainer - INFO - train_loss = 0.7254016399383545
2020-06-12 06:34:19,880 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:41:30,980 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:41:34,818 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16189
2020-06-12 06:41:34,818 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5561637878417969
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Steps = 33600/278576
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - dev_loss = 0.442224 || dev_eval_scores = {'perplexity': 1.5561637878417969}
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - train_loss = 0.7213504314422607
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:48:46,476 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16589
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5447299480438232
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Steps = 34000/278576
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - dev_loss = 0.434849 || dev_eval_scores = {'perplexity': 1.5447299480438232}
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - train_loss = 0.7172350883483887
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:56:02,163 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16989
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5317991971969604
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Steps = 34400/278576
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - dev_loss = 0.426443 || dev_eval_scores = {'perplexity': 1.5317991971969604}
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - train_loss = 0.7126625180244446
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:03:17,948 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=17389
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5181857347488403
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Steps = 34800/278576
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - dev_loss = 0.417516 || dev_eval_scores = {'perplexity': 1.5181857347488403}
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - train_loss = 0.7084499001502991
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:03:34,765 - crisis_transformers.trainer - INFO - epoch 2 ends, 14 epochs left
2020-06-12 07:03:34,767 - crisis_transformers.trainer - INFO - global_average_loss=1.457897663116455,global_steps=34822 on training set
2020-06-12 07:10:33,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=378
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5065593719482422
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Epoch = 3/16 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Steps = 35200/278576 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - dev_loss = 0.409828 || dev_eval_scores = {'perplexity': 1.5065593719482422} 2020-06-12 07:10:37,320 - crisis_transformers.trainer - INFO - train_loss = 0.5077053904533386 2020-06-12 07:10:37,320 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 07:17:48,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 07:17:52,866 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=778 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4995406866073608 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
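The step counters above are internally consistent: 278576 total optimization steps is 16 epochs of 17411 steps each, where 17411 = ceil(69642 iterations / input batch size 4), and the per-epoch counter in "Save check-point at epoch=2 step=378" resets at each epoch boundary while the global counter ("global_steps=34822" when epoch 2 ends) keeps running. A quick check of that arithmetic, with all constants copied from the records above (the ceil derivation is an assumption consistent with the logged totals):

import math

iters_per_epoch = 69642   # "Num of training examples (actually no. of iterations per epoch ...)"
input_batch_size = 4      # 2 per GPU x 2 GPUs
num_epochs = 16

steps_per_epoch = math.ceil(iters_per_epoch / input_batch_size)  # 17411
assert steps_per_epoch * num_epochs == 278576  # matches "Steps = .../278576"
assert 2 * steps_per_epoch == 34822            # "global_steps=34822 on training set" after epoch 2
assert 34822 + 378 == 35200                    # check-point "epoch=2 step=378" lands at "Steps = 35200"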
2020-06-12 07:17:48,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:17:52,866 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=778
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4995406866073608
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Steps = 35600/278576
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - dev_loss = 0.405159 || dev_eval_scores = {'perplexity': 1.4995406866073608}
2020-06-12 07:17:52,868 - crisis_transformers.trainer - INFO - train_loss = 0.5084229111671448
2020-06-12 07:17:52,868 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:25:04,279 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1178
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4872630834579468
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Steps = 36000/278576
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - dev_loss = 0.396938 || dev_eval_scores = {'perplexity': 1.4872630834579468}
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - train_loss = 0.5008477568626404
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:32:19,554 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1578
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4724860191345215
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Steps = 36400/278576
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - dev_loss = 0.386952 || dev_eval_scores = {'perplexity': 1.4724860191345215}
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - train_loss = 0.49597886204719543
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:39:35,241 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1978
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4704307317733765
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Steps = 36800/278576
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - dev_loss = 0.385555 || dev_eval_scores = {'perplexity': 1.4704307317733765}
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - train_loss = 0.4925973117351532
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - ********************************************
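Note that "Best score (perplexity)" is always the negative of the latest dev perplexity: the trainer evidently negates the metric so that larger is uniformly better, and "Early stop count = 0/20" stays at zero because the score improves on every evaluation. A sketch of the bookkeeping these reports imply (an illustrative class, not the actual crisis_transformers source):

class EarlyStopper:
    # Illustrative sketch of the early-stop bookkeeping implied by the
    # reports above; names and structure are assumptions.
    def __init__(self, patience=20):           # cf. "Early stop count = 0/20"
        self.patience = patience
        self.best_score = float("-inf")
        self.count = 0

    def update(self, perplexity):
        score = -perplexity                    # negate: lower perplexity -> higher score
        if score > self.best_score:
            self.best_score = score            # logged as "Best score (perplexity)"
            self.count = 0
        else:
            self.count += 1                    # would advance "Early stop count"
        return self.count >= self.patience     # True would end training early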
2020-06-12 07:46:50,212 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:46:54,091 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=2378
2020-06-12 07:46:54,091 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4530060291290283
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Steps = 37200/278576
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - dev_loss = 0.373635 || dev_eval_scores = {'perplexity': 1.4530060291290283}
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - train_loss = 0.4874938130378723
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:54:05,635 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=2778
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4396711587905884
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Steps = 37600/278576
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - dev_loss = 0.364415 || dev_eval_scores = {'perplexity': 1.4396711587905884}
2020-06-12 07:54:09,865 - crisis_transformers.trainer - INFO - train_loss = 0.4848006069660187
2020-06-12 07:54:09,865 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:01:21,053 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3178
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4313539266586304
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Steps = 38000/278576
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - dev_loss = 0.358621 || dev_eval_scores = {'perplexity': 1.4313539266586304}
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - train_loss = 0.4819473922252655
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:08:36,819 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3578
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4217482805252075
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Steps = 38400/278576
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - dev_loss = 0.351887 || dev_eval_scores = {'perplexity': 1.4217482805252075}
2020-06-12 08:08:40,630 - crisis_transformers.trainer - INFO - train_loss = 0.47801968455314636
2020-06-12 08:08:40,630 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:15:51,675 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3978
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4130357503890991
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Steps = 38800/278576
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - dev_loss = 0.345740 || dev_eval_scores = {'perplexity': 1.4130357503890991}
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - train_loss = 0.4750809073448181
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:23:07,471 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=4378
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4030141830444336
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Steps = 39200/278576
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - dev_loss = 0.338623 || dev_eval_scores = {'perplexity': 1.4030141830444336}
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - train_loss = 0.4713905453681946
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:30:22,512 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=4778
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.391721487045288
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Steps = 39600/278576
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - dev_loss = 0.330542 || dev_eval_scores = {'perplexity': 1.391721487045288}
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - train_loss = 0.4688350260257721
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:37:37,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5178
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3863202333450317
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Steps = 40000/278576
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - dev_loss = 0.326653 || dev_eval_scores = {'perplexity': 1.3863202333450317}
2020-06-12 08:37:41,642 - crisis_transformers.trainer - INFO - train_loss = 0.4653063714504242
2020-06-12 08:37:41,642 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:44:52,891 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:44:56,707 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5578
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3764472007751465
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Steps = 40400/278576
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - dev_loss = 0.319506 || dev_eval_scores = {'perplexity': 1.3764472007751465}
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - train_loss = 0.4616211950778961
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:52:07,568 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5978
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3701869249343872
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Steps = 40800/278576
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - dev_loss = 0.314947 || dev_eval_scores = {'perplexity': 1.3701869249343872}
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - train_loss = 0.4588777422904968
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - ********************************************
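At this cadence, each 400-step interval plus its evaluation pass takes a little over seven minutes, which gives a rough projection for the rest of the run (an illustrative estimate derived from the records above, not a logged figure):

interval_seconds = 7 * 60 + 15   # "Time spent since last evaluation = 0h 7m 15s"
steps_per_interval = 400
remaining_steps = 278576 - 40800                  # from "Steps = 40800/278576"
eta_hours = remaining_steps * interval_seconds / steps_per_interval / 3600
print(f"~{eta_hours:.0f} hours left of 278576 total steps")  # ~72 hours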
2020-06-12 08:59:22,571 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=6378
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3665746450424194
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Steps = 41200/278576
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - dev_loss = 0.312307 || dev_eval_scores = {'perplexity': 1.3665746450424194}
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - train_loss = 0.4553696811199188
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:06:37,730 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=6778
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.35618257522583
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Steps = 41600/278576
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - dev_loss = 0.304674 || dev_eval_scores = {'perplexity': 1.35618257522583}
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - train_loss = 0.4522298574447632
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:13:52,744 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:13:56,458 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7178
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.347740888595581
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Steps = 42000/278576
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - dev_loss = 0.298430 || dev_eval_scores = {'perplexity': 1.347740888595581}
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - train_loss = 0.4496159553527832
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:21:07,688 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7578
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3386039733886719
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Steps = 42400/278576
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - dev_loss = 0.291627 || dev_eval_scores = {'perplexity': 1.3386039733886719}
2020-06-12 09:21:11,510 - crisis_transformers.trainer - INFO - train_loss = 0.44657158851623535
2020-06-12 09:21:11,510 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:28:22,370 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7978
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3325072526931763
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Steps = 42800/278576
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - dev_loss = 0.287062 || dev_eval_scores = {'perplexity': 1.3325072526931763}
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - train_loss = 0.4432190954685211
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:35:37,349 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=8378
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3274871110916138
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Steps = 43200/278576
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - dev_loss = 0.283288 || dev_eval_scores = {'perplexity': 1.3274871110916138}
2020-06-12 09:35:41,211 - crisis_transformers.trainer - INFO - train_loss = 0.4402396082878113
2020-06-12 09:35:41,211 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:42:52,444 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=8778
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.315727710723877
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Steps = 43600/278576
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - dev_loss = 0.274390 || dev_eval_scores = {'perplexity': 1.315727710723877}
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - train_loss = 0.43725401163101196
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:50:07,902 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9178
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3152892589569092
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Steps = 44000/278576
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - dev_loss = 0.274057 || dev_eval_scores = {'perplexity': 1.3152892589569092}
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - train_loss = 0.43421682715415955
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:57:23,789 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9578
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.306022047996521
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Steps = 44400/278576
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - dev_loss = 0.266986 || dev_eval_scores = {'perplexity': 1.306022047996521}
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - train_loss = 0.43124547600746155
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 10:04:38,908 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:04:42,860 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9978
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3020490407943726
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Steps = 44800/278576
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - dev_loss = 0.263939 || dev_eval_scores = {'perplexity': 1.3020490407943726}
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - train_loss = 0.42827215790748596
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - ********************************************
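The (step, perplexity) pairs buried in these reports are easy to scrape into a learning curve. A minimal sketch, assuming the log has been saved to a file (train.log is a hypothetical path, and the regexes assume exactly the record format shown here):

import re

step_re = re.compile(r"Steps = (\d+)/278576")
ppl_re = re.compile(r"dev_eval_scores = \{'perplexity': ([0-9.]+)\}")

points, step = [], None
with open("train.log") as f:          # hypothetical file holding this log
    for line in f:
        if (m := step_re.search(line)):
            step = int(m.group(1))
        elif (m := ppl_re.search(line)) and step is not None:
            points.append((step, float(m.group(1))))

print(points[:2])  # e.g. [(34000, 1.5447299480438232), (34400, 1.5317991971969604)]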
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Epoch = 3/16 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Steps = 45200/278576 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - dev_loss = 0.255730 || dev_eval_scores = {'perplexity': 1.2914035320281982} 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - train_loss = 0.4252185821533203 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 10:19:08,790 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=10778 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2862516641616821 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - Steps = 45600/278576
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - dev_loss = 0.251732 || dev_eval_scores = {'perplexity': 1.2862516641616821}
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - train_loss = 0.4227924942970276
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - ********************************************

Evaluation reports, 2020-06-12 10:26 to 14:25, one every 400 steps. Settings identical in every report, listed once: output path (short) tmp/gpt2_medium_for_source_code_code_generate; early stop on perplexity with patience 20; eval every 400 steps (iterations = 400); gradient accumulation steps = 1; 69642 training and 7738 development iterations per epoch (Iterable Dataset); instantaneous batch size per GPU = 2 with n_gpu = 2, so the input batch size = 4; Steps are out of 278576 total. "Time spent since last evaluation" stays between 0h 7m 10s and 0h 7m 15s throughout and equals the gap between successive rows. ckpt = yes marks evaluations where the model and a check-point were saved just before the report; these are exactly the evaluations that improved the best perplexity and reset the early-stop count to 0. Check-point tags use a 0-based epoch and the step within that epoch, e.g. "Save check-point at epoch=2 step=11178" belongs to global step 2 x 17411 + 11178 = 46000. Loss and perplexity values below are rounded to six decimal places.

Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
10:26:27  46000  3/16   0.249009  1.282754    0.419818    0/20        yes
10:33:41  46400  3/16   0.242728  1.274721    0.417098    0/20        yes
10:40:56  46800  3/16   0.239781  1.270970    0.414388    0/20        yes
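Each row's dev_loss and perplexity are two views of the same quantity: the reported perplexity is, up to rounding, the exponential of the mean per-token cross-entropy on the dev set (exp(0.249009) is about 1.282754 for the step-46000 row). A minimal Python sketch of that standard relation, not taken from the crisis_transformers source:

    import math

    def perplexity(mean_nll: float) -> float:
        # Perplexity = exp(mean negative log-likelihood per token).
        return math.exp(mean_nll)

    # Step-46000 evaluation: dev_loss = 0.249009 -> perplexity = 1.282754
    assert abs(perplexity(0.249009) - 1.282754) < 1e-4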
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
10:48:11  47200  3/16   0.237533  1.268117    0.411606    0/20        yes
10:55:21  47600  3/16   0.237603  1.268206    0.409048    1/20        no
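The 1/20 at step 47600 is the early-stopping bookkeeping at work: dev perplexity rose from 1.268117 to 1.268206, so no model was saved and the counter incremented; the improvement at step 48000 below resets it to 0. The "Best score (perplexity)" lines are negative because the trainer maximizes a score and therefore stores -perplexity. A sketch of that logic under those assumptions (hypothetical class, not the trainer's actual code):

    class EarlyStopper:
        """Track best score = -perplexity with a patience counter."""

        def __init__(self, patience: int = 20):
            self.patience = patience
            self.best_score = float("-inf")
            self.count = 0

        def step(self, dev_perplexity: float) -> bool:
            """Return True when training should stop early."""
            score = -dev_perplexity      # lower perplexity -> higher score
            if score > self.best_score:
                self.best_score = score  # improved: save model, reset counter
                self.count = 0
            else:
                self.count += 1          # logged as "Early stop count = n/20"
            return self.count >= self.patience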
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
11:02:37  48000  3/16   0.228786  1.257072    0.406322    0/20        yes
11:09:51  48400  3/16   0.224828  1.252107    0.403542    0/20        yes
11:17:07  48800  3/16   0.220629  1.246860    0.400880    0/20        yes
11:24:21  49200  3/16   0.216453  1.241665    0.398479    0/20        yes
11:31:36  49600  3/16   0.212978  1.237358    0.396072    0/20        yes
11:38:51  50000  3/16   0.210213  1.233941    0.393527    0/20        yes
11:46:07  50400  3/16   0.208062  1.231290    0.390969    0/20        yes
11:53:21  50800  3/16   0.204457  1.226859    0.388460    0/20        yes
12:00:36  51200  3/16   0.202162  1.224046    0.385950    0/20        yes
12:07:51  51600  3/16   0.197307  1.218118    0.383424    0/20        yes
12:15:05  52000  3/16   0.195471  1.215883    0.380974    0/20        yes

2020-06-12 12:17:19,046 - crisis_transformers.trainer - INFO - epoch 3 ends, 13 epoches left
2020-06-12 12:17:19,049 - crisis_transformers.trainer - INFO - global_average_loss=1.0984210968017578,global_steps=52233 on training set
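The end-of-epoch lines confirm the step bookkeeping: 17411 optimizer steps per epoch (69642 iterations at input batch size 4, rounded up), so epoch 3 ends at global step 3 x 17411 = 52233, and the next check-point tag "epoch=3 step=167" corresponds to global step 52233 + 167 = 52400. A few arithmetic checks; the ceiling division is an assumption that happens to match the logged numbers:

    import math

    iters_per_epoch = 69642  # iterable-dataset iterations per epoch
    input_batch = 4          # 2 per GPU x 2 GPUs
    steps_per_epoch = math.ceil(iters_per_epoch / input_batch)
    assert steps_per_epoch == 17411

    assert 3 * steps_per_epoch == 52233        # "epoch 3 ends ... global_steps=52233"
    assert 3 * steps_per_epoch + 167 == 52400  # check-point "epoch=3 step=167"
    assert 16 * steps_per_epoch == 278576      # total optimization steps for 16 epochs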
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
12:22:19  52400  4/16   0.193134  1.213046    0.256737    0/20        yes
12:29:34  52800  4/16   0.191371  1.210909    0.259557    0/20        yes
12:36:48  53200  4/16   0.188325  1.207225    0.259544    0/20        yes
12:44:03  53600  4/16   0.184953  1.203161    0.259464    0/20        yes
12:51:18  54000  4/16   0.184105  1.202142    0.258346    0/20        yes
12:58:30  54400  4/16   0.184208  1.202266    0.256291    1/20        no
13:05:42  54800  4/16   0.187595  1.206345    0.254695    2/20        no
13:12:53  55200  4/16   0.184251  1.202318    0.253385    3/20        no
13:20:07  55600  4/16   0.175182  1.191464    0.252160    0/20        yes
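The batch-size line repeated in every report decomposes as per-GPU batch times number of GPUs, with gradient accumulation of 1; this is consistent with (though the log does not confirm) torch.nn.DataParallel splitting each input batch of 4 across the 2 GPUs. Illustrative names:

    per_gpu_batch = 2
    n_gpu = 2
    grad_accum_steps = 1
    input_batch = per_gpu_batch * n_gpu                # = 4, as logged at every eval
    effective_batch = input_batch * grad_accum_steps   # = 4 examples per optimizer step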
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
13:27:22  56000  4/16   0.173272  1.189190    0.251168    0/20        yes
13:34:34  56400  4/16   0.174895  1.191122    0.249526    1/20        no
13:41:49  56800  4/16   0.168055  1.183002    0.248204    0/20        yes
13:49:04  57200  4/16   0.167469  1.182309    0.246763    0/20        yes
13:56:16  57600  4/16   0.168122  1.183081    0.245616    1/20        no
14:03:27  58000  4/16   0.168360  1.183363    0.244472    2/20        no
14:10:42  58400  4/16   0.161927  1.175775    0.243264    0/20        yes
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Steps = 58800/278576 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - dev_loss = 0.159098 || dev_eval_scores = {'perplexity': 1.1724531650543213} 2020-06-12 14:17:57,710 - crisis_transformers.trainer - INFO - train_loss = 0.24194355309009552 2020-06-12 14:17:57,710 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:25:09,648 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:25:13,392 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=6967 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1723699569702148 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Steps = 59200/278576 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - dev_loss = 0.159027 || dev_eval_scores = {'perplexity': 1.1723699569702148} 2020-06-12 14:25:13,426 - crisis_transformers.trainer - INFO - train_loss = 0.24068011343479156 2020-06-12 14:25:13,426 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:32:25,725 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:32:28,988 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=7367 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1677402257919312 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Steps = 59600/278576 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - dev_loss = 0.155071 || dev_eval_scores = {'perplexity': 1.1677402257919312} 2020-06-12 14:32:29,024 - crisis_transformers.trainer - INFO - train_loss = 0.23944209516048431 2020-06-12 14:32:29,024 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1677402257919312 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Steps = 60000/278576 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - dev_loss = 0.160010 || dev_eval_scores = {'perplexity': 1.1735223531723022} 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - train_loss = 0.23837868869304657 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:46:52,988 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:46:55,997 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8167 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1659613847732544 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:46:56,013 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:46:56,014 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Steps = 60400/278576 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - dev_loss = 0.153546 || dev_eval_scores = {'perplexity': 1.1659613847732544} 2020-06-12 14:46:56,033 - crisis_transformers.trainer - INFO - train_loss = 0.23715026676654816 2020-06-12 14:46:56,033 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:54:07,918 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:54:10,976 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8567 2020-06-12 14:54:10,989 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.165794014930725 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Steps = 60800/278576 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:54:10,991 - crisis_transformers.trainer - INFO - dev_loss = 0.153402 || dev_eval_scores = {'perplexity': 1.165794014930725} 2020-06-12 14:54:11,007 - crisis_transformers.trainer - INFO - train_loss = 0.23593905568122864 2020-06-12 14:54:11,007 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:01:22,795 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:01:25,973 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8967 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.161121129989624 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Steps = 61200/278576 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - dev_loss = 0.149386 || dev_eval_scores = {'perplexity': 1.161121129989624} 2020-06-12 15:01:26,009 - crisis_transformers.trainer - INFO - train_loss = 0.2347162663936615 2020-06-12 15:01:26,009 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:08:37,941 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:08:41,095 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=9367 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.159262776374817 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Steps = 61600/278576 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - dev_loss = 0.147784 || dev_eval_scores = {'perplexity': 1.159262776374817} 2020-06-12 15:08:41,131 - crisis_transformers.trainer - INFO - train_loss = 0.2334117591381073 2020-06-12 15:08:41,131 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:15:53,144 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.159262776374817 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Steps = 62000/278576 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - dev_loss = 0.149117 || dev_eval_scores = {'perplexity': 1.1608085632324219} 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - train_loss = 0.23231545090675354 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:23:04,214 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:23:07,484 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=10167 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1586897373199463 2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:23:07,501 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Steps = 62400/278576 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - dev_loss = 0.147290 || dev_eval_scores = {'perplexity': 1.1586897373199463} 2020-06-12 15:23:07,520 - crisis_transformers.trainer - INFO - train_loss = 0.23095978796482086 2020-06-12 15:23:07,520 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:30:19,155 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:30:22,774 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=10567 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.155853509902954 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Steps = 62800/278576 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - dev_loss = 0.144839 || dev_eval_scores = {'perplexity': 1.155853509902954} 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - train_loss = 0.22986294329166412 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:37:34,325 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:37:34,325 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.155853509902954 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Steps = 63200/278576 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - dev_loss = 0.146531 || dev_eval_scores = {'perplexity': 1.1578106880187988} 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - train_loss = 0.2286159247159958 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:44:45,801 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:44:48,841 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=11367 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1531232595443726 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:44:48,857 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Steps = 63600/278576 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - dev_loss = 0.142474 || dev_eval_scores = {'perplexity': 1.1531232595443726} 2020-06-12 15:44:48,865 - crisis_transformers.trainer - INFO - train_loss = 0.22760188579559326 2020-06-12 15:44:48,865 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:52:00,452 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:52:04,180 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=11767 2020-06-12 15:52:04,195 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1524842977523804 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Steps = 64000/278576 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - dev_loss = 0.141920 || dev_eval_scores = {'perplexity': 1.1524842977523804} 2020-06-12 15:52:04,203 - crisis_transformers.trainer - INFO - train_loss = 0.22645704448223114 2020-06-12 15:52:04,203 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:59:15,356 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:59:18,493 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12167 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1517306566238403 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Steps = 64400/278576 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - dev_loss = 0.141266 || dev_eval_scores = {'perplexity': 1.1517306566238403} 2020-06-12 15:59:18,511 - crisis_transformers.trainer - INFO - train_loss = 0.22518599033355713 2020-06-12 15:59:18,511 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:06:30,098 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:06:33,674 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12567 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1506377458572388 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Steps = 64800/278576 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - dev_loss = 0.140316 || dev_eval_scores = {'perplexity': 1.1506377458572388} 2020-06-12 16:06:33,705 - crisis_transformers.trainer - INFO - train_loss = 0.2241048812866211 2020-06-12 16:06:33,705 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:13:44,939 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:13:48,305 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12967 2020-06-12 16:13:48,319 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:13:48,319 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1486502885818481 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Steps = 65200/278576 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - dev_loss = 0.138588 || dev_eval_scores = {'perplexity': 1.1486502885818481} 2020-06-12 16:13:48,348 - crisis_transformers.trainer - INFO - train_loss = 0.22296540439128876 2020-06-12 16:13:48,348 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:21:01,926 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:21:05,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=13367 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1468722820281982 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 17s 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Steps = 65600/278576 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - dev_loss = 0.137039 || dev_eval_scores = {'perplexity': 1.1468722820281982} 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - train_loss = 0.2220388650894165 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:28:16,835 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:28:20,007 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=13767 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1454410552978516 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Steps = 66000/278576 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - dev_loss = 0.135790 || dev_eval_scores = {'perplexity': 1.1454410552978516} 2020-06-12 16:28:20,010 - crisis_transformers.trainer - INFO - train_loss = 0.22106683254241943 2020-06-12 16:28:20,010 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:35:31,636 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:35:35,306 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=14167 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1435331106185913 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Steps = 66400/278576 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - dev_loss = 0.134123 || dev_eval_scores = {'perplexity': 1.1435331106185913} 2020-06-12 16:35:35,324 - crisis_transformers.trainer - INFO - train_loss = 0.2200937420129776 2020-06-12 16:35:35,324 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1435331106185913 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Steps = 66800/278576 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - dev_loss = 0.139419 || dev_eval_scores = {'perplexity': 1.1496059894561768} 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - train_loss = 0.21907271444797516 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:49:57,796 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:50:01,478 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=14967 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1425509452819824 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:50:01,491 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Steps = 67200/278576 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - dev_loss = 0.133263 || dev_eval_scores = {'perplexity': 1.1425509452819824} 2020-06-12 16:50:01,500 - crisis_transformers.trainer - INFO - train_loss = 0.2180401235818863 2020-06-12 16:50:01,500 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:57:12,820 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:57:15,939 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=15367 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1421887874603271 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Steps = 67600/278576 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - dev_loss = 0.132946 || dev_eval_scores = {'perplexity': 1.1421887874603271} 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - train_loss = 0.2170739322900772 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1421887874603271 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Steps = 68000/278576 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - dev_loss = 0.133038 || dev_eval_scores = {'perplexity': 1.1422935724258423} 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - train_loss = 0.21610437333583832 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:11:38,816 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:11:41,945 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=16167 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:11:41,958 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Steps = 68400/278576 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - dev_loss = 0.130158 || dev_eval_scores = {'perplexity': 1.1390084028244019} 2020-06-12 17:11:41,975 - crisis_transformers.trainer - INFO - train_loss = 0.21520239114761353 2020-06-12 17:11:41,975 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:18:54,648 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:18:54,648 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Steps = 68800/278576 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - dev_loss = 0.130209 || dev_eval_scores = {'perplexity': 1.1390659809112549} 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - train_loss = 0.21429920196533203 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:26:06,245 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Early stop count = 2/20 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Steps = 69200/278576 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - dev_loss = 0.131858 || dev_eval_scores = {'perplexity': 1.1409462690353394} 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - train_loss = 0.2133215069770813 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:33:18,361 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:33:21,478 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=17367 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:33:21,481 - 
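The "count" and "best perplexity" columns follow a standard patience-based early-stopping scheme: perplexity is negated so that a single higher-is-better comparison applies, the counter resets (and the model plus a check-point are saved) whenever the best score improves, and training would stop once the counter reached 20. A minimal sketch of that bookkeeping, with illustrative names (not the trainer's actual implementation):

    class EarlyStopper:
        def __init__(self, patience: int = 20):
            self.patience = patience
            self.best = float("-inf")  # no score seen yet
            self.count = 0

        def update(self, perplexity: float) -> bool:
            """Return True if this eval improved the best score
            (the trainer then saves the model and a check-point)."""
            score = -perplexity        # negate so higher is better
            if score > self.best:
                self.best = score
                self.count = 0         # "count = 0/20" in the table above
                return True
            self.count += 1            # "count = 1/20", "2/20", ...
            return False

        @property
        def should_stop(self) -> bool:
            return self.count >= self.patience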
2020-06-12 17:33:46,659 - crisis_transformers.trainer - INFO - epoch 4 ends, 12 epochs left
2020-06-12 17:33:46,661 - crisis_transformers.trainer - INFO - global_average_loss=0.8768848776817322, global_steps=69644 on training set
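The step counts at this boundary are internally consistent: 69642 training iterations per epoch at an input batch size of 4 (2 per GPU x 2 GPUs) give ceil(69642 / 4) = 17411 optimization steps per epoch, 16 x 17411 = 278576 total steps, and 4 x 17411 = 69644 global steps at the end of epoch 4, exactly the global_steps logged above. The drop in train_loss from about 0.212 to about 0.162 across the boundary is consistent with the running training average being reset each epoch. A quick check (variable names are illustrative):

    import math

    train_iterations_per_epoch = 69642   # as reported for the Iterable Dataset
    input_batch_size = 2 * 2             # 2 per GPU x 2 GPUs
    epochs = 16

    steps_per_epoch = math.ceil(train_iterations_per_epoch / input_batch_size)
    print(steps_per_epoch)               # 17411
    print(epochs * steps_per_epoch)      # 278576, the "Steps = .../278576" denominator
    print(4 * steps_per_epoch)           # 69644, matching "global_steps=69644"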
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Steps = 70000/278576
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - dev_loss = 0.129323 || dev_eval_scores = {'perplexity': 1.138058066368103}
2020-06-12 17:40:33,523 - crisis_transformers.trainer - INFO - train_loss = 0.16184912621974945
2020-06-12 17:40:33,523 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Steps = 70400/278576
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - dev_loss = 0.127871 || dev_eval_scores = {'perplexity': 1.1364060640335083}
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - train_loss = 0.16288244724273682
2020-06-12 17:47:45,700 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 17:54:57,738 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:55:00,860 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1156
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1350326538085938
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Steps = 70800/278576
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - dev_loss = 0.126661 || dev_eval_scores = {'perplexity': 1.1350326538085938}
2020-06-12 17:55:00,894 - crisis_transformers.trainer - INFO - train_loss = 0.16178201138973236
2020-06-12 17:55:00,894 - crisis_transformers.trainer - INFO - ********************************************
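
This report is also a convenient place to check the metric itself: the dev_loss/perplexity pairs are consistent with perplexity = exp(dev_loss), and "Best score (perplexity)" is stored negated so that a larger score always means a better model, which is why every best score in this log is negative. A short sketch of that convention, not the trainer's actual code:

    import math

    dev_loss = 0.126661                # from the report at step 70800 above
    perplexity = math.exp(dev_loss)    # -> 1.1350326..., matching dev_eval_scores
    best_score = -perplexity           # -> -1.1350326..., matching "Best score (perplexity)"
    print(perplexity, best_score)
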
2020-06-12 18:02:13,264 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:02:16,346 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1556
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1342060565948486
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - Steps = 71200/278576
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - dev_loss = 0.125933 || dev_eval_scores = {'perplexity': 1.1342060565948486}
2020-06-12 18:02:16,378 - crisis_transformers.trainer - INFO - train_loss = 0.16210666298866272
2020-06-12 18:02:16,378 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:09:28,443 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:09:31,521 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1956
2020-06-12 18:09:31,536 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:09:31,536 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.133441686630249
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Steps = 71600/278576
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:09:31,539 - crisis_transformers.trainer - INFO - dev_loss = 0.125259 || dev_eval_scores = {'perplexity': 1.133441686630249}
2020-06-12 18:09:31,556 - crisis_transformers.trainer - INFO - train_loss = 0.16150160133838654
2020-06-12 18:09:31,556 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:16:44,613 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:16:47,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=2356
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1324461698532104
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Steps = 72000/278576
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - dev_loss = 0.124380 || dev_eval_scores = {'perplexity': 1.1324461698532104}
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - train_loss = 0.160808727145195
2020-06-12 18:16:47,598 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:23:59,557 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:24:02,688 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=2756
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1319714784622192
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Steps = 72400/278576
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - dev_loss = 0.123961 || dev_eval_scores = {'perplexity': 1.1319714784622192}
2020-06-12 18:24:02,723 - crisis_transformers.trainer - INFO - train_loss = 0.16037367284297943
2020-06-12 18:24:02,723 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:31:14,824 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:31:17,933 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=3156
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.131042718887329
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Steps = 72800/278576
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:31:17,951 - crisis_transformers.trainer - INFO - dev_loss = 0.123140 || dev_eval_scores = {'perplexity': 1.131042718887329}
2020-06-12 18:31:17,969 - crisis_transformers.trainer - INFO - train_loss = 0.16030311584472656
2020-06-12 18:31:17,970 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:38:30,048 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:38:33,201 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=3556
2020-06-12 18:38:33,201 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - Steps = 73200/278576
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - dev_loss = 0.122653 || dev_eval_scores = {'perplexity': 1.130492091178894}
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - train_loss = 0.16008983552455902
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Steps = 73600/278576
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - dev_loss = 0.126517 || dev_eval_scores = {'perplexity': 1.1348693370819092}
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - train_loss = 0.15986071527004242
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - ********************************************
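
The "Early stop count = k/20" lines around here show the patience counter at work: it resets to 0 whenever the negated perplexity improves (a check-point is saved at the same time) and increments otherwise; training would stop once it reached 20. A minimal sketch of that bookkeeping under those assumptions; the class and its names are illustrative, not taken from the trainer source:

    class EarlyStopper:
        def __init__(self, patience: int = 20):
            self.patience = patience
            self.best = float("-inf")   # the first evaluation always becomes the best
            self.count = 0

        def update(self, score: float) -> bool:
            """Feed one evaluation score; return True when training should stop."""
            if score > self.best:
                self.best = score       # improvement: remember it, reset the counter
                self.count = 0
            else:
                self.count += 1         # no improvement: burn one unit of patience
            return self.count >= self.patience

    stopper = EarlyStopper()
    for ppl in (1.130492, 1.134869, 1.131366):  # the reports at steps 73200, 73600, 74000
        stopper.update(-ppl)
    print(stopper.count)                        # -> 2, cf. "Early stop count = 2/20" below
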
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Steps = 74000/278576
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - dev_loss = 0.123426 || dev_eval_scores = {'perplexity': 1.1313656568527222}
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - train_loss = 0.15936923027038574
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:00:08,504 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:00:11,867 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=4756
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1283477544784546
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Steps = 74400/278576
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - dev_loss = 0.120754 || dev_eval_scores = {'perplexity': 1.1283477544784546}
2020-06-12 19:00:11,898 - crisis_transformers.trainer - INFO - train_loss = 0.15900550782680511
2020-06-12 19:00:11,898 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1283477544784546
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Steps = 74800/278576
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - dev_loss = 0.122012 || dev_eval_scores = {'perplexity': 1.1297677755355835}
2020-06-12 19:07:24,082 - crisis_transformers.trainer - INFO - train_loss = 0.15857523679733276
2020-06-12 19:07:24,082 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:14:35,509 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:14:38,680 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=5556
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1280550956726074
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Steps = 75200/278576
2020-06-12 19:14:38,692 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:14:38,692 - crisis_transformers.trainer - INFO - dev_loss = 0.120495 || dev_eval_scores = {'perplexity': 1.1280550956726074}
2020-06-12 19:14:38,709 - crisis_transformers.trainer - INFO - train_loss = 0.15801657736301422
2020-06-12 19:14:38,709 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:21:51,621 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:21:51,621 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1280550956726074
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Steps = 75600/278576
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - dev_loss = 0.120551 || dev_eval_scores = {'perplexity': 1.1281187534332275}
2020-06-12 19:21:51,623 - crisis_transformers.trainer - INFO - train_loss = 0.15759296715259552
2020-06-12 19:21:51,623 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:29:02,897 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:29:06,248 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=6356
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.126711368560791
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Steps = 76000/278576
2020-06-12 19:29:06,259 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:29:06,259 - crisis_transformers.trainer - INFO - dev_loss = 0.119303 || dev_eval_scores = {'perplexity': 1.126711368560791}
2020-06-12 19:29:06,271 - crisis_transformers.trainer - INFO - train_loss = 0.15713545680046082
2020-06-12 19:29:06,271 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:36:18,058 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:36:21,263 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=6756
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1253490447998047
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Steps = 76400/278576
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - dev_loss = 0.118093 || dev_eval_scores = {'perplexity': 1.1253490447998047}
2020-06-12 19:36:21,285 - crisis_transformers.trainer - INFO - train_loss = 0.15685245394706726
2020-06-12 19:36:21,285 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:43:33,123 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:43:36,558 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=7156
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1248881816864014
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Steps = 76800/278576
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - dev_loss = 0.117684 || dev_eval_scores = {'perplexity': 1.1248881816864014}
2020-06-12 19:43:36,592 - crisis_transformers.trainer - INFO - train_loss = 0.15634950995445251
2020-06-12 19:43:36,592 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1248881816864014
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Steps = 77200/278576
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - dev_loss = 0.117770 || dev_eval_scores = {'perplexity': 1.1249850988388062}
2020-06-12 19:50:48,592 - crisis_transformers.trainer - INFO - train_loss = 0.1558857262134552
2020-06-12 19:50:48,592 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:58:00,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:58:03,711 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=7956
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1242866516113281
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Steps = 77600/278576
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - dev_loss = 0.117149 || dev_eval_scores = {'perplexity': 1.1242866516113281}
2020-06-12 19:58:03,742 - crisis_transformers.trainer - INFO - train_loss = 0.15545228123664856
2020-06-12 19:58:03,742 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:05:15,952 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:05:19,295 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=8356
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1240224838256836
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Steps = 78000/278576
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - dev_loss = 0.116914 || dev_eval_scores = {'perplexity': 1.1240224838256836}
2020-06-12 20:05:19,322 - crisis_transformers.trainer - INFO - train_loss = 0.15514682233333588
2020-06-12 20:05:19,322 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:12:30,593 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:12:33,718 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=8756
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1232761144638062
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Steps = 78400/278576
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - dev_loss = 0.116250 || dev_eval_scores = {'perplexity': 1.1232761144638062}
2020-06-12 20:12:33,744 - crisis_transformers.trainer - INFO - train_loss = 0.15461046993732452
2020-06-12 20:12:33,744 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:19:45,605 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1232761144638062
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Steps = 78800/278576
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - dev_loss = 0.117339 || dev_eval_scores = {'perplexity': 1.12450110912323}
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - train_loss = 0.15414145588874817
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - ********************************************
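
Reports this regular are easy to post-process; pulling the (global step, dev_loss, perplexity) triples out of the raw log is enough to plot the training curve. A sketch that assumes the record format stays exactly as above (the file name is a placeholder):

    import re

    STEP = re.compile(r"Steps = (\d+)/\d+")
    LOSS = re.compile(r"dev_loss = ([\d.]+) \|\| dev_eval_scores = \{'perplexity': ([\d.]+)\}")

    def eval_points(lines):
        step = None
        for line in lines:
            if m := STEP.search(line):
                step = int(m.group(1))               # "Steps = X/..." precedes dev_loss in each report
            elif (m := LOSS.search(line)) and step is not None:
                yield step, float(m.group(1)), float(m.group(2))

    # e.g. list(eval_points(open("training.log")))
    # -> [(69600, 0.127342, 1.135805368423462), (70000, 0.129323, 1.138058066368103), ...]
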
2020-06-12 20:26:57,592 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:27:00,603 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=9556
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1222569942474365
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Steps = 79200/278576
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - dev_loss = 0.115342 || dev_eval_scores = {'perplexity': 1.1222569942474365}
2020-06-12 20:27:00,635 - crisis_transformers.trainer - INFO - train_loss = 0.15376870334148407
2020-06-12 20:27:00,635 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:34:13,086 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=9956
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Steps = 79600/278576
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - dev_loss = 0.114636 || dev_eval_scores = {'perplexity': 1.1214654445648193}
2020-06-12 20:34:16,745 - crisis_transformers.trainer - INFO - train_loss = 0.1534395068883896
2020-06-12 20:34:16,745 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Steps = 80000/278576
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - dev_loss = 0.114919 || dev_eval_scores = {'perplexity': 1.1217821836471558}
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - train_loss = 0.15301528573036194
2020-06-12 20:41:28,721 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:48:40,263 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Steps = 80400/278576
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - dev_loss = 0.116439 || dev_eval_scores = {'perplexity': 1.123489499092102}
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - train_loss = 0.1526806503534317
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:55:51,280 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:55:54,387 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=11156
2020-06-12 20:55:54,402 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1208057403564453
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Steps = 80800/278576
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - dev_loss = 0.114048 || dev_eval_scores = {'perplexity': 1.1208057403564453}
2020-06-12 20:55:54,422 - crisis_transformers.trainer - INFO - train_loss = 0.15231111645698547
2020-06-12 20:55:54,422 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1208057403564453
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - Steps = 81200/278576
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - dev_loss = 0.114643 || dev_eval_scores = {'perplexity': 1.1214734315872192}
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - train_loss = 0.15194369852542877
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:10:17,767 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:10:20,890 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=11956
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1198288202285767
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Steps = 81600/278576
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - dev_loss = 0.113176 || dev_eval_scores = {'perplexity': 1.1198288202285767}
2020-06-12 21:10:20,900 - crisis_transformers.trainer - INFO - train_loss = 0.15155275166034698
2020-06-12 21:10:20,900 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:17:33,024 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:17:36,142 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=12356
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Steps = 82000/278576
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - dev_loss = 0.112563 || dev_eval_scores = {'perplexity': 1.1191425323486328}
2020-06-12 21:17:36,151 - crisis_transformers.trainer - INFO - train_loss = 0.15120427310466766
2020-06-12 21:17:36,151 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Steps = 82400/278576
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - dev_loss = 0.113121 || dev_eval_scores = {'perplexity': 1.1197669506072998}
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - train_loss = 0.1508565992116928
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - ********************************************
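
The cadence is equally steady: 400 optimization steps at an input batch size of 4 arrive every roughly 7m 11s to 7m 16s, and each interval also contains a full development pass (7738 iterations) plus the occasional check-point save, so the figures below are lower bounds on the pure training rate. Back-of-the-envelope only:

    interval_s = 7 * 60 + 12            # ~432 s between evaluation reports
    steps, batch = 400, 4
    print(steps / interval_s)           # ~0.93 optimization steps per second
    print(steps * batch / interval_s)   # ~3.7 training sequences per second
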
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Steps = 82800/278576 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - dev_loss = 0.114633 || dev_eval_scores = {'perplexity': 1.1214618682861328} 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - train_loss = 0.15052178502082825 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Early stop count = 3/20 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Steps = 83200/278576 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - dev_loss = 0.112780 || dev_eval_scores = {'perplexity': 1.1193852424621582} 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - train_loss = 0.15016387403011322 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 21:46:23,357 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:46:26,472 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=13956 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1182266473770142 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 21:46:26,487 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Steps = 83600/278576 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - dev_loss = 0.111744 || dev_eval_scores = {'perplexity': 1.1182266473770142} 2020-06-12 21:46:26,507 - crisis_transformers.trainer - INFO - train_loss = 0.14978548884391785 2020-06-12 21:46:26,507 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 21:53:38,893 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:53:42,051 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=14356 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1176480054855347 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Steps = 84000/278576 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - dev_loss = 0.111226 || dev_eval_scores = {'perplexity': 1.1176480054855347} 2020-06-12 21:53:42,056 - crisis_transformers.trainer - INFO - train_loss = 0.1494358777999878 2020-06-12 21:53:42,056 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:00:52,990 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:00:56,106 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=14756 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1175894737243652 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Steps = 84400/278576 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - dev_loss = 0.111174 || dev_eval_scores = {'perplexity': 1.1175894737243652} 2020-06-12 22:00:56,136 - crisis_transformers.trainer - INFO - train_loss = 0.14908182621002197 2020-06-12 22:00:56,136 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1175894737243652 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 13s 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Steps = 84800/278576 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - dev_loss = 0.111978 || dev_eval_scores = {'perplexity': 1.1184886693954468} 2020-06-12 22:08:09,604 - crisis_transformers.trainer - INFO - train_loss = 0.1487855762243271 2020-06-12 22:08:09,604 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:15:29,618 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:15:33,237 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=15556 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1161783933639526 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:15:33,238 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 23s 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Steps = 85200/278576 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - dev_loss = 0.109911 || dev_eval_scores = {'perplexity': 1.1161783933639526} 2020-06-12 22:15:33,242 - crisis_transformers.trainer - INFO - train_loss = 0.14848528802394867 2020-06-12 22:15:33,242 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:22:49,305 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1161783933639526 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Steps = 85600/278576 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - dev_loss = 0.110636 || dev_eval_scores = {'perplexity': 1.1169886589050293} 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - train_loss = 0.14812599122524261 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:30:01,038 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:30:04,724 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=16356 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1152490377426147 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Steps = 86000/278576 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - dev_loss = 0.109078 || dev_eval_scores = {'perplexity': 1.1152490377426147} 2020-06-12 22:30:04,747 - crisis_transformers.trainer - INFO - train_loss = 0.14784550666809082 2020-06-12 22:30:04,747 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:37:16,018 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=16756 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1139332056045532 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Steps = 86400/278576 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - dev_loss = 0.107897 || dev_eval_scores = {'perplexity': 1.1139332056045532} 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - train_loss = 0.14752434194087982 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - ********************************************
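A note on the numbers above: the logged perplexity is just the exponential of the logged dev loss, so each `dev_loss || dev_eval_scores` pair is internally consistent. A quick sanity check in plain Python, using values copied from the log (dev_loss is rounded to six decimals in the log, so the match is good to roughly that precision):

```python
import math

# dev_loss values copied from the evaluation reports above; exp(loss)
# should reproduce the logged {'perplexity': ...} values.
for dev_loss, logged_ppl in [
    (0.113176, 1.1198288202285767),   # step 81600
    (0.111744, 1.1182266473770142),   # step 83600
    (0.107897, 1.1139332056045532),   # step 86400
]:
    print(f"exp({dev_loss}) = {math.exp(dev_loss):.6f} (logged: {logged_ppl})")
```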
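The bookkeeping also reads consistently once the lines are grouped: "Best score (perplexity)" is stored negated so the trainer can uniformly maximize its early-stop metric, "Save model" / "Save check-point" lines appear only on an improvement, and "Early stop count" otherwise ticks toward 20 and resets to 0 whenever a new best arrives (the zero-based "epoch=4" in the checkpoint lines corresponds to the one-based "Epoch = 5/16" in the reports). A minimal sketch of that loop, assuming nothing about the actual crisis_transformers internals; the class and all names below are hypothetical:

```python
class EarlyStopper:
    """Hypothetical sketch of the save/early-stop pattern visible in the log."""

    def __init__(self, patience: int = 20):
        self.best_score = float("-inf")  # "Best score (perplexity) = -inf" before the first eval
        self.count = 0                   # "Early stop count = k/20"
        self.patience = patience

    def step(self, perplexity: float, save_checkpoint) -> bool:
        score = -perplexity                # negate: lower perplexity -> higher score
        if score > self.best_score:        # new best, hence the negative "Best score" lines
            self.best_score = score
            self.count = 0                 # count resets after every improvement
            save_checkpoint()              # "Save model ..." / "Save check-point ..." only here
        else:
            self.count += 1                # no save; 1/20, 2/20, 3/20 as at steps 82400-83200
        return self.count < self.patience  # False would end training early
```

Feeding the dev perplexities above through `step()` reproduces the logged pattern: saves at steps 82000, 83600, 84000, 84400, 85200, 86000 and 86400, and counts of 1-3 at the evaluations in between.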
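The cadence is steady enough for a back-of-envelope finish time: each 400-step block takes roughly 7m 15s, and the last report above is at step 86400 of 278576. A rough estimate (hypothetical arithmetic, not a logged figure):

```python
# Rough ETA from the logged cadence; all inputs are read off the log above.
steps_done, steps_total = 86400, 278576
seconds_per_400_steps = 7 * 60 + 15          # "Time spent since last evaluation" hovers around 7m 15s
remaining_blocks = (steps_total - steps_done) / 400
hours_left = remaining_blocks * seconds_per_400_steps / 3600
print(f"~{hours_left:.0f} hours of training remain")  # ~58 hours at this pace
```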
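Finally, with one log entry per line the dev curve is easy to scrape for plotting. A hypothetical helper (`read_dev_curve` is not part of the trainer; the regexes simply match the report format above, where each "Steps = ..." line precedes its "dev_eval_scores" line):

```python
import re

STEP_RE = re.compile(r"Steps = (\d+)/\d+")                             # e.g. "Steps = 86400/278576"
PPL_RE = re.compile(r"dev_eval_scores = \{'perplexity': ([0-9.]+)\}")  # the report's perplexity line

def read_dev_curve(path):
    """Return [(global_step, dev_perplexity), ...] scraped from a trainer log file."""
    steps, ppls = [], []
    with open(path) as f:
        for line in f:
            if (m := STEP_RE.search(line)):
                steps.append(int(m.group(1)))
            elif (m := PPL_RE.search(line)):
                ppls.append(float(m.group(1)))
    return list(zip(steps, ppls))  # e.g. [(81600, 1.1198...), (82000, 1.1191...), ...]
```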