2020-06-11 20:32:12,836 - crisis_transformers.trainer - INFO - Use pytorch device: cuda, with gpu_number=2
2020-06-11 20:32:14,855 - crisis_transformers.trainer - INFO - Warmup-steps: 55716
2020-06-11 20:32:14,856 - crisis_transformers.trainer - INFO - ***** Running training *****
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Num of training examples (actually iterations per epoch for Iterable Dataset) = 69642
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Steps per Epoch = 17411
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Num of Epochs = 16
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Best score (perplexity) = -inf
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Eval every 400 steps
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Early stop = 20
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Total optimization steps = 278576
2020-06-11 20:32:14,857 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2, so the input batch size = 4
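The header numbers are internally consistent. A minimal sketch that reproduces them (variable names are mine, and the derivation rule for steps per epoch is inferred from the figures, not stated in the log):

```python
import math

# Figures from the trainer header above.
per_gpu_batch = 2
n_gpu = 2
train_iters_per_epoch = 69642   # "Num of training examples (actually iterations...)"
num_epochs = 16

input_batch = per_gpu_batch * n_gpu                                # 4
steps_per_epoch = math.ceil(train_iters_per_epoch / input_batch)   # 17411
total_steps = steps_per_epoch * num_epochs                         # 278576

assert (input_batch, steps_per_epoch, total_steps) == (4, 17411, 278576)
# Warmup-steps 55716 is ~20% of 278576 (0.2 * 278576 = 55715.2), which
# suggests a 20% warmup ratio, though the exact rule is not in the log.
```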
From here the trainer evaluates on the development set every 400 steps and prints a full report each time. The repeated per-report fields never change in this excerpt: Output path (short) tmp/gpt2_medium_for_source_code_code_generate; early stop on perplexity with patience 20; eval every 400 steps; Gradient Accumulation steps = 1; 69642 training and 7738 development iterations per epoch; Epoch = 1/16; input batch size 4 (2 per GPU on 2 GPUs). Only the varying fields are tabulated below. "Saved" means the evaluation improved the best score, so the trainer logged "Save model to tmp/gpt2_medium_for_source_code_code_generate" and "Save check-point at epoch=0 step=N"; Δt is the logged "Time spent since last evaluation".

step | Δt    | dev_loss  | dev perplexity   | train_loss | early stop | checkpoint
 400 | 7m13s | 32.203194 | 96754087231488.0 | 34.870220  | 0/20       | saved
 800 | 7m16s |  9.379961 | 11848.553711     | 22.251518  | 0/20       | saved
1200 | 7m15s |  4.410483 | 82.309235        | 15.844203  | 0/20       | saved
1600 | 7m11s |  4.528600 | 92.628769        | 12.387160  | 1/20       | not saved

At step 1600 the dev perplexity rose from 82.31 to 92.63, so the best score stayed at -82.309235, the early-stop counter ticked to 1/20, and no checkpoint was written.
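One relation worth making explicit: in every report the perplexity is exactly exp(dev_loss), which also explains the absurd-looking first value. A quick check:

```python
import math

# perplexity == exp(dev_loss) holds for every report above, e.g.:
print(math.exp(4.410483))    # ~82.309     (step 1200: perplexity 82.309235)
print(math.exp(32.203194))   # ~9.6754e13  (step 400:  perplexity 96754087231488.0)
```

So the step-400 perplexity of ~9.7e13 is just an initial dev loss of 32.2 nats per token pushed through the exponential; it collapses as soon as warmup brings the loss into single digits.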
Evaluations at steps 2000-3200:

step | Δt    | dev_loss | dev perplexity | train_loss | early stop | checkpoint
2000 | 7m14s | 2.607760 | 13.568625      | 10.256035  | 0/20       | saved
2400 | 7m15s | 2.179520 |  8.842060      |  8.810673  | 0/20       | saved
2800 | 7m16s | 1.835085 |  6.265664      |  7.770569  | 0/20       | saved
3200 | 7m12s | 1.864289 |  6.451349      |  6.985173  | 1/20       | not saved

Step 3200 is the second regression: perplexity moved from 6.265664 back up to 6.451349, the counter ticked to 1/20 again, and the step-2800 checkpoint was kept. The counter drops back to 0/20 as soon as an evaluation improves on the best score, which the "Best score (perplexity)" field tracks as a negated perplexity (it starts at -inf and climbs toward zero).
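That negation is the tell: the trainer appears to flip "lower is better" metrics so a single "higher is better" comparison drives both checkpointing and patience. A sketch under that assumption (names are illustrative, not the trainer's API):

```python
class EarlyStopper:
    """Tracks a negated 'lower is better' metric; mirrors the logged fields."""

    def __init__(self, patience=20):
        self.best_score = float("-inf")   # "Best score (perplexity) = -inf"
        self.count = 0                    # "Early stop count = 0/20"
        self.patience = patience

    def update(self, perplexity):
        score = -perplexity               # negate so bigger is better
        if score > self.best_score:
            self.best_score = score
            self.count = 0
            return "save_checkpoint"      # "Save model to tmp/..."
        self.count += 1                   # e.g. steps 1600 and 3200 above
        return "stop" if self.count >= self.patience else "continue"
```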
Evaluations at steps 3600-6000:

step | Δt    | dev_loss | dev perplexity | train_loss | early stop | checkpoint
3600 | 7m15s | 1.505670 | 4.507174       | 6.368423   | 0/20       | saved
4000 | 7m15s | 1.397803 | 4.046299       | 5.872023   | 0/20       | saved
4400 | 7m15s | 1.314076 | 3.721312       | 5.462299   | 0/20       | saved
4800 | 7m16s | 1.283650 | 3.609790       | 5.118622   | 0/20       | saved
5200 | 7m15s | 1.278208 | 3.590199       | 4.828341   | 0/20       | saved
5600 | 7m16s | 1.222204 | 3.394660       | 4.576790   | 0/20       | saved
6000 | 7m11s | 1.227896 | 3.414039       | 4.356290   | 1/20       | not saved

The Δt column barely moves: every 400-step window, evaluation and checkpoint writing included, takes between 7m11s and 7m16s.
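That steady cadence makes the cost of the full schedule easy to estimate. Rough numbers, assuming the ~7m15s per 400-step window holds for the whole run:

```python
sec_per_window = 7 * 60 + 15       # ~7m15s per 400 steps, eval included
sec_per_step = sec_per_window / 400
total_steps = 278576

hours = total_steps * sec_per_step / 3600
print(f"{sec_per_step:.2f} s/step -> ~{hours:.0f} h for all 16 epochs")
# ~1.09 s/step -> ~84 h, i.e. about 3.5 days on this 2-GPU box,
# unless early stopping (patience 20) ends the run sooner.
```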
INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Steps = 6400/278576 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - dev_loss = 1.182766 || dev_eval_scores = {'perplexity': 3.263387680053711} 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - train_loss = 4.162957668304443 2020-06-11 22:28:13,002 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.263387680053711 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:35:25,098 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Steps = 6800/278576 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - dev_loss = 1.188739 || dev_eval_scores = {'perplexity': 3.2829389572143555} 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - train_loss = 3.991642713546753 2020-06-11 22:35:25,099 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:42:37,188 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=7200 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.1454687118530273 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-11 22:42:41,101 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - Steps = 7200/278576 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - dev_loss = 1.145963 || dev_eval_scores = {'perplexity': 3.1454687118530273} 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - train_loss = 3.8384640216827393 2020-06-11 22:42:41,102 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:49:53,052 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=7600 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.0853919982910156 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Steps = 7600/278576 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:49:56,891 - crisis_transformers.trainer - INFO - dev_loss = 1.126679 || dev_eval_scores = {'perplexity': 3.0853919982910156} 2020-06-11 22:49:56,892 - crisis_transformers.trainer - INFO - train_loss = 3.701063871383667 2020-06-11 22:49:56,892 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 22:57:08,463 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:57:12,524 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8000 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Best score (perplexity) = -3.04101824760437 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Steps = 8000/278576 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - dev_loss = 1.112192 || dev_eval_scores = {'perplexity': 3.04101824760437} 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - train_loss = 3.575878143310547 2020-06-11 22:57:12,525 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:04:23,771 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8400 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.996488571166992 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Steps = 8400/278576 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:04:27,583 - crisis_transformers.trainer - INFO - dev_loss = 1.097441 || dev_eval_scores = {'perplexity': 2.996488571166992} 2020-06-11 23:04:27,584 - crisis_transformers.trainer - INFO - train_loss = 3.4622840881347656 2020-06-11 23:04:27,584 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:11:39,153 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:11:43,019 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=8800 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.9609262943267822 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Steps = 8800/278576 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - dev_loss = 1.085502 || dev_eval_scores = {'perplexity': 2.9609262943267822} 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - train_loss = 3.3580634593963623 2020-06-11 23:11:43,020 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:18:54,591 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=9200 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:18:58,463 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.9230592250823975 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Steps = 9200/278576 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - dev_loss = 1.072631 || dev_eval_scores = {'perplexity': 2.9230592250823975} 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - train_loss = 3.2615966796875 2020-06-11 23:18:58,464 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:26:10,441 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=9600 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.886868715286255 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:26:13,940 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Steps = 9600/278576 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - dev_loss = 1.060172 || dev_eval_scores = {'perplexity': 2.886868715286255} 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - train_loss = 3.1728854179382324 2020-06-11 23:26:13,941 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:33:25,705 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10000 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.836120128631592 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:33:30,017 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Steps = 10000/278576 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - dev_loss = 1.042437 || dev_eval_scores = {'perplexity': 2.836120128631592} 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - train_loss = 3.091641664505005 2020-06-11 23:33:30,018 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:40:41,837 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10400 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.8059732913970947 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:40:45,836 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Steps = 10400/278576 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - dev_loss = 1.031750 || dev_eval_scores = {'perplexity': 2.8059732913970947} 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - train_loss = 3.0152587890625 2020-06-11 23:40:45,837 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:47:57,612 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=10800 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.772104263305664 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Epoch = 1/16 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Steps = 10800/278576 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-11 23:48:02,018 - crisis_transformers.trainer - INFO - dev_loss = 1.019607 || dev_eval_scores = {'perplexity': 2.772104263305664} 2020-06-11 23:48:02,019 - crisis_transformers.trainer - INFO - train_loss = 2.9444949626922607 2020-06-11 23:48:02,019 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-11 23:55:13,323 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11200 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7492218017578125 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
2020-06-11 23:55:13,323 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11200
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7492218017578125
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Steps = 11200/278576
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - dev_loss = 1.011318 || dev_eval_scores = {'perplexity': 2.7492218017578125}
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - train_loss = 2.8781516551971436
2020-06-11 23:55:17,272 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:02:29,325 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=11600
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.7139976024627686
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:02:33,443 - crisis_transformers.trainer - INFO - Steps = 11600/278576
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - dev_loss = 0.998423 || dev_eval_scores = {'perplexity': 2.7139976024627686}
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - train_loss = 2.816199541091919
2020-06-12 00:02:33,444 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:09:45,484 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12000
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.68788480758667
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Steps = 12000/278576
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:09:49,444 - crisis_transformers.trainer - INFO - dev_loss = 0.988755 || dev_eval_scores = {'perplexity': 2.68788480758667}
2020-06-12 00:09:49,445 - crisis_transformers.trainer - INFO - train_loss = 2.7574315071105957
2020-06-12 00:09:49,445 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:17:00,535 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12400
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.654526710510254
2020-06-12 00:17:04,295 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Steps = 12400/278576
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - dev_loss = 0.976266 || dev_eval_scores = {'perplexity': 2.654526710510254}
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - train_loss = 2.7026596069335938
2020-06-12 00:17:04,296 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:24:16,075 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=12800
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.6282081604003906
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:24:19,961 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - Steps = 12800/278576
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - dev_loss = 0.966302 || dev_eval_scores = {'perplexity': 2.6282081604003906}
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - train_loss = 2.650250196456909
2020-06-12 00:24:19,962 - crisis_transformers.trainer - INFO - ********************************************
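Note that "Best score (perplexity)" is always the negative of the dev perplexity reported in the same block (-2.6282... against a dev perplexity of 2.6282...). Negating a lower-is-better metric is a common trick that lets early stopping uniformly maximize a score. A sketch of the bookkeeping this log is consistent with (class and method names are hypothetical, not this trainer's API):

class EarlyStopping:
    # Track a higher-is-better score; stop after `patience` evals without improvement.
    def __init__(self, patience=20):
        self.patience = patience
        self.best_score = float("-inf")  # before the first evaluation
        self.count = 0

    def step(self, perplexity):
        score = -perplexity  # lower perplexity => higher score
        if score > self.best_score:
            self.best_score = score
            self.count = 0   # matches the constant "Early stop count = 0/20" here
        else:
            self.count += 1
        return self.count >= self.patience  # True => stop training

Because dev perplexity improves at every one of these evaluations, the counter never leaves 0/20 and a new checkpoint is saved each time.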
2020-06-12 00:31:31,505 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=13200
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.6085095405578613
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Steps = 13200/278576
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - dev_loss = 0.958779 || dev_eval_scores = {'perplexity': 2.6085095405578613}
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - train_loss = 2.600867986679077
2020-06-12 00:31:35,203 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:38:46,576 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:38:50,406 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=13600
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.572343587875366
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Steps = 13600/278576
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - dev_loss = 0.944817 || dev_eval_scores = {'perplexity': 2.572343587875366}
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - train_loss = 2.55434513092041
2020-06-12 00:38:50,407 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:46:01,097 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14000
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.5420730113983154
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Steps = 14000/278576
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - dev_loss = 0.932980 || dev_eval_scores = {'perplexity': 2.5420730113983154}
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - train_loss = 2.509788751602173
2020-06-12 00:46:05,257 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 00:53:16,943 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14400
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 00:53:20,794 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.508371114730835
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Steps = 14400/278576
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - dev_loss = 0.919634 || dev_eval_scores = {'perplexity': 2.508371114730835}
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - train_loss = 2.4678986072540283
2020-06-12 00:53:20,795 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:00:31,540 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=14800
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4876623153686523
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Steps = 14800/278576
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - dev_loss = 0.911343 || dev_eval_scores = {'perplexity': 2.4876623153686523}
2020-06-12 01:00:35,779 - crisis_transformers.trainer - INFO - train_loss = 2.428267002105713
2020-06-12 01:00:35,780 - crisis_transformers.trainer - INFO - ********************************************
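The fixed denominators in these reports follow from the logged sizes: with 69642 training iterations per epoch and an input batch size of 4 (2 per GPU x 2 GPUs, gradient accumulation 1), one epoch is ceil(69642 / 4) = 17411 optimizer steps, and 16 epochs give the 278576 in every "Steps = N/278576" line. A quick check, assuming the trainer rounds the last partial batch up (plain arithmetic, no trainer internals):

import math

iters_per_epoch = 69642          # "Num of training examples" line above
input_batch = 2 * 2              # per-GPU batch x n_gpu, as logged
steps_per_epoch = math.ceil(iters_per_epoch / input_batch)
total_steps = steps_per_epoch * 16
print(steps_per_epoch, total_steps)  # 17411 278576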
2020-06-12 01:07:47,187 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=15200
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:07:51,019 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.46309757232666
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Steps = 15200/278576
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - dev_loss = 0.901420 || dev_eval_scores = {'perplexity': 2.46309757232666}
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - train_loss = 2.3902571201324463
2020-06-12 01:07:51,020 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:15:02,762 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=15600
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4311938285827637
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:15:06,595 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Steps = 15600/278576
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - dev_loss = 0.888382 || dev_eval_scores = {'perplexity': 2.4311938285827637}
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - train_loss = 2.354424238204956
2020-06-12 01:15:06,596 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:22:18,392 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16000
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.4099924564361572
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:22:22,372 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Steps = 16000/278576
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - dev_loss = 0.879624 || dev_eval_scores = {'perplexity': 2.4099924564361572}
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - train_loss = 2.319091796875
2020-06-12 01:22:22,373 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:29:33,829 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:29:37,786 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16400
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3866090774536133
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Steps = 16400/278576
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - dev_loss = 0.869874 || dev_eval_scores = {'perplexity': 2.3866090774536133}
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - train_loss = 2.285768747329712
2020-06-12 01:29:37,787 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:36:48,893 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=16800
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3575546741485596
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Steps = 16800/278576
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - dev_loss = 0.857625 || dev_eval_scores = {'perplexity': 2.3575546741485596}
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - train_loss = 2.253655433654785
2020-06-12 01:36:52,733 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:44:04,158 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Save check-point at epoch=0 step=17200
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.3273518085479736
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Epoch = 1/16
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Steps = 17200/278576
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:44:08,000 - crisis_transformers.trainer - INFO - dev_loss = 0.844731 || dev_eval_scores = {'perplexity': 2.3273518085479736}
2020-06-12 01:44:08,001 - crisis_transformers.trainer - INFO - train_loss = 2.22310733795166
2020-06-12 01:44:08,001 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:46:08,626 - crisis_transformers.trainer - INFO - epoch 1 ends, 15 epoches left
2020-06-12 01:46:08,628 - crisis_transformers.trainer - INFO - global_average_loss=2.207577705383301,global_steps=17411 on training set
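Two bookkeeping details at this epoch boundary are easy to misread. First, the "epoch=..." in checkpoint names counts from 0 while the "Epoch = 2/16" display counts from 1, and the per-epoch step counter restarts: "Save check-point at epoch=1 step=189" and "Steps = 17600/278576" describe the same moment, 17411 steps in the first epoch plus 189 steps into the second. Second, train_loss falls from ~2.22 to ~0.89 between the surrounding reports because it is a running average over the current epoch that has just been reset, not a sudden jump in model quality. The step arithmetic, as a quick check:

steps_per_epoch = 17411            # global_steps logged at the end of the first epoch
epoch, step_in_epoch = 1, 189      # "Save check-point at epoch=1 step=189"
assert epoch * steps_per_epoch + step_in_epoch == 17600  # "Steps = 17600/278576"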
2020-06-12 01:51:18,826 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=189
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.306220054626465
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Steps = 17600/278576
2020-06-12 01:51:22,899 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - dev_loss = 0.835610 || dev_eval_scores = {'perplexity': 2.306220054626465}
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - train_loss = 0.8919339776039124
2020-06-12 01:51:22,900 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 01:58:34,239 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=589
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.282672166824341
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 01:58:38,515 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Steps = 18000/278576
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - dev_loss = 0.825347 || dev_eval_scores = {'perplexity': 2.282672166824341}
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - train_loss = 0.9102271199226379
2020-06-12 01:58:38,516 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:05:49,927 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:05:53,796 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=989
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.261589288711548
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Steps = 18400/278576
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - dev_loss = 0.816068 || dev_eval_scores = {'perplexity': 2.261589288711548}
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - train_loss = 0.9082818031311035
2020-06-12 02:05:53,797 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:13:04,838 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=1389
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.2358696460723877
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Steps = 18800/278576
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - dev_loss = 0.804630 || dev_eval_scores = {'perplexity': 2.2358696460723877}
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - train_loss = 0.9062302708625793
2020-06-12 02:13:08,577 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:20:19,901 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=1789
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:20:23,756 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.210675001144409
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Steps = 19200/278576
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - dev_loss = 0.793298 || dev_eval_scores = {'perplexity': 2.210675001144409}
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - train_loss = 0.8982809782028198
2020-06-12 02:20:23,757 - crisis_transformers.trainer - INFO - ********************************************
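The wall-clock cadence is also very stable: each 400-step evaluation interval takes about 7m15s, i.e. roughly 1.09 s per optimizer step, so the full 278576-step schedule would need on the order of 84 hours (about 3.5 days) on this 2-GPU setup if the early stop never fires. A back-of-the-envelope estimate:

seconds_per_eval = 7 * 60 + 15               # "Time spent since last evaluation"
sec_per_step = seconds_per_eval / 400        # eval every 400 steps => ~1.09 s/step
est_hours = 278576 * sec_per_step / 3600     # ~84 h, about 3.5 days
print(round(sec_per_step, 3), round(est_hours))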
2020-06-12 02:27:35,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2189
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.1900641918182373
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Steps = 19600/278576
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:27:39,229 - crisis_transformers.trainer - INFO - dev_loss = 0.783931 || dev_eval_scores = {'perplexity': 2.1900641918182373}
2020-06-12 02:27:39,230 - crisis_transformers.trainer - INFO - train_loss = 0.8898405432701111
2020-06-12 02:27:39,230 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:34:50,458 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2589
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:34:54,307 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.169043779373169
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Steps = 20000/278576
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - dev_loss = 0.774286 || dev_eval_scores = {'perplexity': 2.169043779373169}
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - train_loss = 0.8851494789123535
2020-06-12 02:34:54,308 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:42:06,145 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=2989
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.140289783477783
2020-06-12 02:42:09,896 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Steps = 20400/278576
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - dev_loss = 0.760941 || dev_eval_scores = {'perplexity': 2.140289783477783}
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - train_loss = 0.8793467283248901
2020-06-12 02:42:09,897 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:49:21,789 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=3389
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.121040105819702
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Steps = 20800/278576
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:49:25,274 - crisis_transformers.trainer - INFO - dev_loss = 0.751907 || dev_eval_scores = {'perplexity': 2.121040105819702}
2020-06-12 02:49:25,275 - crisis_transformers.trainer - INFO - train_loss = 0.871684730052948
2020-06-12 02:49:25,275 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 02:56:37,107 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:56:41,332 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=3789
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0992023944854736
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Steps = 21200/278576
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - dev_loss = 0.741557 || dev_eval_scores = {'perplexity': 2.0992023944854736}
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - train_loss = 0.8670705556869507
2020-06-12 02:56:41,333 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:03:52,858 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4189
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0718014240264893
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:03:56,767 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Steps = 21600/278576
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - dev_loss = 0.728418 || dev_eval_scores = {'perplexity': 2.0718014240264893}
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - train_loss = 0.8624909520149231
2020-06-12 03:03:56,768 - crisis_transformers.trainer - INFO - ********************************************
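To turn a log like this into a learning curve, the evaluation reports can be scraped with a couple of regular expressions; a minimal sketch that assumes exactly the record format shown here (one record per line; the file name is hypothetical):

import re

STEPS = re.compile(r"Steps = (\d+)/\d+")
DEV = re.compile(r"dev_loss = ([\d.]+) \|\| dev_eval_scores = \{'perplexity': ([\d.]+)\}")

def eval_points(path):
    # Yield (global_step, dev_loss, dev_perplexity) per evaluation report.
    step = None
    with open(path) as f:
        for line in f:
            m = STEPS.search(line)
            if m:
                step = int(m.group(1))
                continue
            m = DEV.search(line)
            if m and step is not None:
                yield step, float(m.group(1)), float(m.group(2))

for point in eval_points("gpt2_medium_training.log"):
    print(point)  # e.g. (21600, 0.728418, 2.0718014240264893)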
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Epoch = 2/16 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Steps = 22000/278576 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - dev_loss = 0.719342 || dev_eval_scores = {'perplexity': 2.0530827045440674} 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - train_loss = 0.8574584126472473 2020-06-12 03:11:12,545 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 03:18:23,725 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:18:27,499 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=4989 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0336155891418457 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Epoch = 2/16 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Steps = 22400/278576 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - dev_loss = 0.709815 || dev_eval_scores = {'perplexity': 2.0336155891418457} 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - train_loss = 0.8504697680473328 2020-06-12 03:18:27,500 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 03:25:39,057 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=5389 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Best score (perplexity) = -2.0098047256469727 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Steps = 22800/278576
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - dev_loss = 0.698038 || dev_eval_scores = {'perplexity': 2.0098047256469727}
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - train_loss = 0.8460281491279602
2020-06-12 03:25:43,393 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:32:54,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:32:58,726 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=5789
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9868273735046387
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Steps = 23200/278576
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - dev_loss = 0.686539 || dev_eval_scores = {'perplexity': 1.9868273735046387}
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - train_loss = 0.8407670855522156
2020-06-12 03:32:58,727 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:40:09,514 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6189
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9708765745162964
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Steps = 23600/278576
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:40:13,425 - crisis_transformers.trainer - INFO - dev_loss = 0.678478 || dev_eval_scores = {'perplexity': 1.9708765745162964}
2020-06-12 03:40:13,426 - crisis_transformers.trainer - INFO - train_loss = 0.8364319205284119
2020-06-12 03:40:13,426 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:47:25,312 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6589
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.950257420539856
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Steps = 24000/278576
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:47:28,665 - crisis_transformers.trainer - INFO - dev_loss = 0.667961 || dev_eval_scores = {'perplexity': 1.950257420539856}
2020-06-12 03:47:28,666 - crisis_transformers.trainer - INFO - train_loss = 0.8307074308395386
2020-06-12 03:47:28,666 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 03:54:40,261 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:54:44,220 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=6989
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9251375198364258
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Steps = 24400/278576
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - dev_loss = 0.654997 || dev_eval_scores = {'perplexity': 1.9251375198364258}
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - train_loss = 0.8253822922706604
2020-06-12 03:54:44,221 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:01:55,402 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:01:59,672 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=7389
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.9091284275054932
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Steps = 24800/278576
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - dev_loss = 0.646647 || dev_eval_scores = {'perplexity': 1.9091284275054932}
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - train_loss = 0.8205176591873169
2020-06-12 04:01:59,673 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:09:10,549 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=7789
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8884336948394775
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
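Throughout this stretch "Early stop count" stays at 0/20 because the dev perplexity improves at every 400-step evaluation, so each evaluation saves a new checkpoint and resets the counter; training would only halt early after 20 consecutive evaluations without improvement. A rough sketch of that bookkeeping (illustrative pseudocode for the pattern visible in the log, not the trainer's actual implementation):

patience = 20                 # "Early stop = 20"
best_score = float("-inf")    # "Best score (perplexity) = -inf" before the first eval
early_stop_count = 0

def on_evaluation(dev_perplexity):
    """Called every 400 optimization steps."""
    global best_score, early_stop_count
    score = -dev_perplexity                 # negate: higher score == lower perplexity
    if score > best_score:
        best_score = score                  # new best: save model + checkpoint
        early_stop_count = 0                # reset "Early stop count = 0/20"
    else:
        early_stop_count += 1               # no improvement this interval
    return early_stop_count >= patience     # True -> stop training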
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Steps = 25200/278576
2020-06-12 04:09:14,287 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - dev_loss = 0.635748 || dev_eval_scores = {'perplexity': 1.8884336948394775}
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - train_loss = 0.8162734508514404
2020-06-12 04:09:14,288 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:16:26,145 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8189
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8698856830596924
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:16:30,187 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Steps = 25600/278576
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - dev_loss = 0.625877 || dev_eval_scores = {'perplexity': 1.8698856830596924}
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - train_loss = 0.811458170413971
2020-06-12 04:16:30,188 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:23:41,295 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8589
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:23:45,346 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8487507104873657
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Steps = 26000/278576
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - dev_loss = 0.614510 || dev_eval_scores = {'perplexity': 1.8487507104873657}
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - train_loss = 0.8066449761390686
2020-06-12 04:23:45,347 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:30:56,930 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=8989
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8316272497177124
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Steps = 26400/278576
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:31:01,239 - crisis_transformers.trainer - INFO - dev_loss = 0.605205 || dev_eval_scores = {'perplexity': 1.8316272497177124}
2020-06-12 04:31:01,240 - crisis_transformers.trainer - INFO - train_loss = 0.802518904209137
2020-06-12 04:31:01,240 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:38:12,636 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=9389
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.8143984079360962
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Steps = 26800/278576
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:38:16,325 - crisis_transformers.trainer - INFO - dev_loss = 0.595754 || dev_eval_scores = {'perplexity': 1.8143984079360962}
2020-06-12 04:38:16,326 - crisis_transformers.trainer - INFO - train_loss = 0.7970281839370728
2020-06-12 04:38:16,326 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:45:27,487 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=9789
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:45:31,784 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7961244583129883
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
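With no gradient accumulation and a batch of 2 per GPU on 2 GPUs, every optimization step consumes 4 sequences, so each 400-step interval here processes 1600 training sequences in roughly 7m15s. A back-of-the-envelope throughput check (numbers read off the log; the wall-clock interval also includes the dev pass and checkpointing):

per_gpu_batch, n_gpu, grad_accum = 2, 2, 1
effective_batch = per_gpu_batch * n_gpu * grad_accum   # 4, as logged
seqs_per_interval = 400 * effective_batch              # 1600 sequences per interval
interval_secs = 7 * 60 + 15                            # ~435 s between evaluations
print(seqs_per_interval / interval_secs)               # ~3.7 training sequences/s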
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Steps = 27200/278576
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - dev_loss = 0.585631 || dev_eval_scores = {'perplexity': 1.7961244583129883}
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - train_loss = 0.7928464412689209
2020-06-12 04:45:31,785 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:52:42,555 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10189
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7843670845031738
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 04:52:46,377 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - Steps = 27600/278576
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - dev_loss = 0.579064 || dev_eval_scores = {'perplexity': 1.7843670845031738}
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - train_loss = 0.7886567711830139
2020-06-12 04:52:46,378 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 04:59:57,976 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10589
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:00:01,818 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.764768123626709
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Steps = 28000/278576
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - dev_loss = 0.568019 || dev_eval_scores = {'perplexity': 1.764768123626709}
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - train_loss = 0.7840659618377686
2020-06-12 05:00:01,819 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:07:12,938 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=10989
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7483080625534058
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Steps = 28400/278576
2020-06-12 05:07:16,789 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - dev_loss = 0.558648 || dev_eval_scores = {'perplexity': 1.7483080625534058}
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - train_loss = 0.7798082828521729
2020-06-12 05:07:16,790 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:14:27,901 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=11389
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7391985654830933
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Steps = 28800/278576
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:14:32,042 - crisis_transformers.trainer - INFO - dev_loss = 0.553424 || dev_eval_scores = {'perplexity': 1.7391985654830933}
2020-06-12 05:14:32,043 - crisis_transformers.trainer - INFO - train_loss = 0.7753340601921082
2020-06-12 05:14:32,043 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:21:43,000 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=11789
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.7198398113250732
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Steps = 29200/278576
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:21:46,790 - crisis_transformers.trainer - INFO - dev_loss = 0.542231 || dev_eval_scores = {'perplexity': 1.7198398113250732}
2020-06-12 05:21:46,791 - crisis_transformers.trainer - INFO - train_loss = 0.7709231972694397
2020-06-12 05:21:46,791 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:28:58,330 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:29:02,113 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12189
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6953392028808594
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Steps = 29600/278576
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - dev_loss = 0.527883 || dev_eval_scores = {'perplexity': 1.6953392028808594}
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - train_loss = 0.7663020491600037
2020-06-12 05:29:02,114 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:36:13,699 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12589
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.684487223625183
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Steps = 30000/278576
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - dev_loss = 0.521461 || dev_eval_scores = {'perplexity': 1.684487223625183}
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - train_loss = 0.7620025277137756
2020-06-12 05:36:17,973 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:43:29,168 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=12989
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6656123399734497
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
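The checkpoint names use a 0-based epoch index and an epoch-local step, while the "Steps" line is global: with 17411 optimization steps per epoch, the checkpoint at epoch=1 step=12989 above sits at global step 1 * 17411 + 12989 = 30400, exactly the "Steps = 30400/278576" in the same report. A quick self-check with the constants from the log:

steps_per_epoch = 17411                   # iterations per epoch, from the log
total_steps = steps_per_epoch * 16        # 16 epochs -> 278576 optimization steps

def to_global_step(epoch_idx, step_in_epoch):
    # Checkpoints are tagged with a 0-based epoch and an epoch-local step.
    return epoch_idx * steps_per_epoch + step_in_epoch

assert to_global_step(1, 12989) == 30400  # matches "Steps = 30400/278576"
assert total_steps == 278576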
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:43:33,180 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - Steps = 30400/278576
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - dev_loss = 0.510193 || dev_eval_scores = {'perplexity': 1.6656123399734497}
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - train_loss = 0.7573661208152771
2020-06-12 05:43:33,181 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:50:44,231 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=13389
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.651535987854004
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Steps = 30800/278576
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:50:48,564 - crisis_transformers.trainer - INFO - dev_loss = 0.501706 || dev_eval_scores = {'perplexity': 1.651535987854004}
2020-06-12 05:50:48,565 - crisis_transformers.trainer - INFO - train_loss = 0.7526668906211853
2020-06-12 05:50:48,565 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 05:57:59,864 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:58:03,697 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=13789
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6361221075057983
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Steps = 31200/278576
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - dev_loss = 0.492329 || dev_eval_scores = {'perplexity': 1.6361221075057983}
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - train_loss = 0.7481159567832947
2020-06-12 05:58:03,698 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:05:14,807 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:05:18,556 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14189
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.6220632791519165
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Steps = 31600/278576
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - dev_loss = 0.483699 || dev_eval_scores = {'perplexity': 1.6220632791519165}
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - train_loss = 0.7438127994537354
2020-06-12 06:05:18,557 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:12:30,611 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14589
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:12:34,360 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.610316276550293
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Steps = 32000/278576
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - dev_loss = 0.476431 || dev_eval_scores = {'perplexity': 1.610316276550293}
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - train_loss = 0.7390771508216858
2020-06-12 06:12:34,361 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:19:45,562 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=14989
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5992158651351929
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Steps = 32400/278576
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:19:49,380 - crisis_transformers.trainer - INFO - dev_loss = 0.469513 || dev_eval_scores = {'perplexity': 1.5992158651351929}
2020-06-12 06:19:49,381 - crisis_transformers.trainer - INFO - train_loss = 0.7345605492591858
2020-06-12 06:19:49,381 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:27:00,549 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=15389
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:27:04,344 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.581007480621338
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Steps = 32800/278576
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - dev_loss = 0.458062 || dev_eval_scores = {'perplexity': 1.581007480621338}
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - train_loss = 0.7301263809204102
2020-06-12 06:27:04,345 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:34:16,032 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=15789
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.571539044380188
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Steps = 33200/278576
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:34:19,879 - crisis_transformers.trainer - INFO - dev_loss = 0.452055 || dev_eval_scores = {'perplexity': 1.571539044380188}
2020-06-12 06:34:19,880 - crisis_transformers.trainer - INFO - train_loss = 0.7254016399383545
2020-06-12 06:34:19,880 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:41:30,980 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:41:34,818 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16189
2020-06-12 06:41:34,818 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5561637878417969
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Steps = 33600/278576
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - dev_loss = 0.442224 || dev_eval_scores = {'perplexity': 1.5561637878417969}
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - train_loss = 0.7213504314422607
2020-06-12 06:41:34,819 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:48:46,476 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16589
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5447299480438232
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:48:50,324 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Steps = 34000/278576
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - dev_loss = 0.434849 || dev_eval_scores = {'perplexity': 1.5447299480438232}
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - train_loss = 0.7172350883483887
2020-06-12 06:48:50,325 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 06:56:02,163 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=16989
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5317991971969604
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Steps = 34400/278576
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - dev_loss = 0.426443 || dev_eval_scores = {'perplexity': 1.5317991971969604}
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - train_loss = 0.7126625180244446
2020-06-12 06:56:05,999 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:03:17,948 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Save check-point at epoch=1 step=17389
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5181857347488403
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Epoch = 2/16
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Steps = 34800/278576
2020-06-12 07:03:22,232 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - dev_loss = 0.417516 || dev_eval_scores = {'perplexity': 1.5181857347488403}
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - train_loss = 0.7084499001502991
2020-06-12 07:03:22,233 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:03:34,765 - crisis_transformers.trainer - INFO - epoch 2 ends, 14 epochs left
2020-06-12 07:03:34,767 - crisis_transformers.trainer - INFO - global_average_loss=1.457897663116455,global_steps=34822 on training set
2020-06-12 07:10:33,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=378
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.5065593719482422
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Epoch = 3/16 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Steps = 35200/278576 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 07:10:37,319 - crisis_transformers.trainer - INFO - dev_loss = 0.409828 || dev_eval_scores = {'perplexity': 1.5065593719482422} 2020-06-12 07:10:37,320 - crisis_transformers.trainer - INFO - train_loss = 0.5077053904533386 2020-06-12 07:10:37,320 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 07:17:48,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 07:17:52,866 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=778 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4995406866073608 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
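The step counters above are internally consistent: 278576 total optimization steps is 16 epochs of 17411 steps each, where 17411 = ceil(69642 iterations / input batch size 4), and the per-epoch counter in "Save check-point at epoch=2 step=378" resets at each epoch boundary while the global counter ("global_steps=34822" when epoch 2 ends) keeps running. A quick check of that arithmetic, with all constants copied from the records above (the ceil derivation is an assumption consistent with the logged totals):

import math

iters_per_epoch = 69642   # "Num of training examples (actually no. of iterations per epoch ...)"
input_batch_size = 4      # 2 per GPU x 2 GPUs
num_epochs = 16

steps_per_epoch = math.ceil(iters_per_epoch / input_batch_size)  # 17411
assert steps_per_epoch * num_epochs == 278576  # matches "Steps = .../278576"
assert 2 * steps_per_epoch == 34822            # "global_steps=34822 on training set" after epoch 2
assert 34822 + 378 == 35200                    # check-point "epoch=2 step=378" lands at "Steps = 35200"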
2020-06-12 07:17:48,578 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:17:52,866 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=778
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4995406866073608
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Steps = 35600/278576
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:17:52,867 - crisis_transformers.trainer - INFO - dev_loss = 0.405159 || dev_eval_scores = {'perplexity': 1.4995406866073608}
2020-06-12 07:17:52,868 - crisis_transformers.trainer - INFO - train_loss = 0.5084229111671448
2020-06-12 07:17:52,868 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:25:04,279 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1178
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4872630834579468
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Steps = 36000/278576
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - dev_loss = 0.396938 || dev_eval_scores = {'perplexity': 1.4872630834579468}
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - train_loss = 0.5008477568626404
2020-06-12 07:25:08,441 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:32:19,554 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1578
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4724860191345215
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Steps = 36400/278576
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - dev_loss = 0.386952 || dev_eval_scores = {'perplexity': 1.4724860191345215}
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - train_loss = 0.49597886204719543
2020-06-12 07:32:23,792 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:39:35,241 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=1978
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:39:39,063 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4704307317733765
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Steps = 36800/278576
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - dev_loss = 0.385555 || dev_eval_scores = {'perplexity': 1.4704307317733765}
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - train_loss = 0.4925973117351532
2020-06-12 07:39:39,064 - crisis_transformers.trainer - INFO - ********************************************
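Note that "Best score (perplexity)" is always the negative of the latest dev perplexity: the trainer evidently negates the metric so that larger is uniformly better, and "Early stop count = 0/20" stays at zero because the score improves on every evaluation. A sketch of the bookkeeping these reports imply (an illustrative class, not the actual crisis_transformers source):

class EarlyStopper:
    # Illustrative sketch of the early-stop bookkeeping implied by the
    # reports above; names and structure are assumptions.
    def __init__(self, patience=20):           # cf. "Early stop count = 0/20"
        self.patience = patience
        self.best_score = float("-inf")
        self.count = 0

    def update(self, perplexity):
        score = -perplexity                    # negate: lower perplexity -> higher score
        if score > self.best_score:
            self.best_score = score            # logged as "Best score (perplexity)"
            self.count = 0
        else:
            self.count += 1                    # would advance "Early stop count"
        return self.count >= self.patience     # True would end training early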
2020-06-12 07:46:50,212 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:46:54,091 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=2378
2020-06-12 07:46:54,091 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4530060291290283
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Steps = 37200/278576
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - dev_loss = 0.373635 || dev_eval_scores = {'perplexity': 1.4530060291290283}
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - train_loss = 0.4874938130378723
2020-06-12 07:46:54,092 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 07:54:05,635 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=2778
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4396711587905884
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Steps = 37600/278576
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 07:54:09,864 - crisis_transformers.trainer - INFO - dev_loss = 0.364415 || dev_eval_scores = {'perplexity': 1.4396711587905884}
2020-06-12 07:54:09,865 - crisis_transformers.trainer - INFO - train_loss = 0.4848006069660187
2020-06-12 07:54:09,865 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:01:21,053 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3178
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:01:25,371 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4313539266586304
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Steps = 38000/278576
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - dev_loss = 0.358621 || dev_eval_scores = {'perplexity': 1.4313539266586304}
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - train_loss = 0.4819473922252655
2020-06-12 08:01:25,372 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:08:36,819 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3578
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4217482805252075
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Steps = 38400/278576
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:08:40,629 - crisis_transformers.trainer - INFO - dev_loss = 0.351887 || dev_eval_scores = {'perplexity': 1.4217482805252075}
2020-06-12 08:08:40,630 - crisis_transformers.trainer - INFO - train_loss = 0.47801968455314636
2020-06-12 08:08:40,630 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:15:51,675 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=3978
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:15:55,394 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4130357503890991
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Steps = 38800/278576
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - dev_loss = 0.345740 || dev_eval_scores = {'perplexity': 1.4130357503890991}
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - train_loss = 0.4750809073448181
2020-06-12 08:15:55,395 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:23:07,471 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=4378
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.4030141830444336
2020-06-12 08:23:11,318 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Steps = 39200/278576
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - dev_loss = 0.338623 || dev_eval_scores = {'perplexity': 1.4030141830444336}
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - train_loss = 0.4713905453681946
2020-06-12 08:23:11,319 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:30:22,512 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=4778
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.391721487045288
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:30:26,227 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Steps = 39600/278576
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - dev_loss = 0.330542 || dev_eval_scores = {'perplexity': 1.391721487045288}
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - train_loss = 0.4688350260257721
2020-06-12 08:30:26,228 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:37:37,836 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5178
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3863202333450317
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Steps = 40000/278576
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:37:41,641 - crisis_transformers.trainer - INFO - dev_loss = 0.326653 || dev_eval_scores = {'perplexity': 1.3863202333450317}
2020-06-12 08:37:41,642 - crisis_transformers.trainer - INFO - train_loss = 0.4653063714504242
2020-06-12 08:37:41,642 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:44:52,891 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:44:56,707 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5578
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3764472007751465
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Steps = 40400/278576
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - dev_loss = 0.319506 || dev_eval_scores = {'perplexity': 1.3764472007751465}
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - train_loss = 0.4616211950778961
2020-06-12 08:44:56,708 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 08:52:07,568 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=5978
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3701869249343872
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Steps = 40800/278576
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - dev_loss = 0.314947 || dev_eval_scores = {'perplexity': 1.3701869249343872}
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - train_loss = 0.4588777422904968
2020-06-12 08:52:11,277 - crisis_transformers.trainer - INFO - ********************************************
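At this cadence, each 400-step interval plus its evaluation pass takes a little over seven minutes, which gives a rough projection for the rest of the run (an illustrative estimate derived from the records above, not a logged figure):

interval_seconds = 7 * 60 + 15   # "Time spent since last evaluation = 0h 7m 15s"
steps_per_interval = 400
remaining_steps = 278576 - 40800                  # from "Steps = 40800/278576"
eta_hours = remaining_steps * interval_seconds / steps_per_interval / 3600
print(f"~{eta_hours:.0f} hours left of 278576 total steps")  # ~72 hours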
2020-06-12 08:59:22,571 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=6378
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3665746450424194
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Steps = 41200/278576
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - dev_loss = 0.312307 || dev_eval_scores = {'perplexity': 1.3665746450424194}
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - train_loss = 0.4553696811199188
2020-06-12 08:59:26,446 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:06:37,730 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=6778
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.35618257522583
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:06:41,569 - crisis_transformers.trainer - INFO - Steps = 41600/278576
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - dev_loss = 0.304674 || dev_eval_scores = {'perplexity': 1.35618257522583}
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - train_loss = 0.4522298574447632
2020-06-12 09:06:41,570 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:13:52,744 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:13:56,458 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7178
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.347740888595581
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Steps = 42000/278576
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - dev_loss = 0.298430 || dev_eval_scores = {'perplexity': 1.347740888595581}
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - train_loss = 0.4496159553527832
2020-06-12 09:13:56,459 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:21:07,688 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7578
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3386039733886719
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Steps = 42400/278576
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:21:11,509 - crisis_transformers.trainer - INFO - dev_loss = 0.291627 || dev_eval_scores = {'perplexity': 1.3386039733886719}
2020-06-12 09:21:11,510 - crisis_transformers.trainer - INFO - train_loss = 0.44657158851623535
2020-06-12 09:21:11,510 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:28:22,370 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=7978
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3325072526931763
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Steps = 42800/278576
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - dev_loss = 0.287062 || dev_eval_scores = {'perplexity': 1.3325072526931763}
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - train_loss = 0.4432190954685211
2020-06-12 09:28:26,197 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:35:37,349 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=8378
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3274871110916138
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Steps = 43200/278576
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:35:41,210 - crisis_transformers.trainer - INFO - dev_loss = 0.283288 || dev_eval_scores = {'perplexity': 1.3274871110916138}
2020-06-12 09:35:41,211 - crisis_transformers.trainer - INFO - train_loss = 0.4402396082878113
2020-06-12 09:35:41,211 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:42:52,444 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=8778
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.315727710723877
2020-06-12 09:42:56,294 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Steps = 43600/278576
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - dev_loss = 0.274390 || dev_eval_scores = {'perplexity': 1.315727710723877}
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - train_loss = 0.43725401163101196
2020-06-12 09:42:56,295 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:50:07,902 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9178
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3152892589569092
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Steps = 44000/278576
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - dev_loss = 0.274057 || dev_eval_scores = {'perplexity': 1.3152892589569092}
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - train_loss = 0.43421682715415955
2020-06-12 09:50:12,219 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 09:57:23,789 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9578
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 09:57:27,692 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.306022047996521
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Steps = 44400/278576
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - dev_loss = 0.266986 || dev_eval_scores = {'perplexity': 1.306022047996521}
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - train_loss = 0.43124547600746155
2020-06-12 09:57:27,693 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 10:04:38,908 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:04:42,860 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=9978
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.3020490407943726
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Steps = 44800/278576
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - dev_loss = 0.263939 || dev_eval_scores = {'perplexity': 1.3020490407943726}
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - train_loss = 0.42827215790748596
2020-06-12 10:04:42,861 - crisis_transformers.trainer - INFO - ********************************************
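The (step, perplexity) pairs buried in these reports are easy to scrape into a learning curve. A minimal sketch, assuming the log has been saved to a file (train.log is a hypothetical path, and the regexes assume exactly the record format shown here):

import re

step_re = re.compile(r"Steps = (\d+)/278576")
ppl_re = re.compile(r"dev_eval_scores = \{'perplexity': ([0-9.]+)\}")

points, step = [], None
with open("train.log") as f:          # hypothetical file holding this log
    for line in f:
        if (m := step_re.search(line)):
            step = int(m.group(1))
        elif (m := ppl_re.search(line)) and step is not None:
            points.append((step, float(m.group(1))))

print(points[:2])  # e.g. [(34000, 1.5447299480438232), (34400, 1.5317991971969604)]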
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Epoch = 3/16 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Steps = 45200/278576 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - dev_loss = 0.255730 || dev_eval_scores = {'perplexity': 1.2914035320281982} 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - train_loss = 0.4252185821533203 2020-06-12 10:11:57,557 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 10:19:08,790 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Save check-point at epoch=2 step=10778 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.2862516641616821 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 10:19:12,161 - crisis_transformers.trainer - INFO - Epoch = 3/16
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - Steps = 45600/278576
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - dev_loss = 0.251732 || dev_eval_scores = {'perplexity': 1.2862516641616821}
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - train_loss = 0.4227924942970276
2020-06-12 10:19:12,162 - crisis_transformers.trainer - INFO - ********************************************

Evaluation reports, 2020-06-12 10:26 to 14:25, one every 400 steps. Settings identical in every report, listed once: output path (short) tmp/gpt2_medium_for_source_code_code_generate; early stop on perplexity with patience 20; eval every 400 steps (iterations = 400); gradient accumulation steps = 1; 69642 training and 7738 development iterations per epoch (Iterable Dataset); instantaneous batch size per GPU = 2 with n_gpu = 2, so the input batch size = 4; Steps are out of 278576 total. "Time spent since last evaluation" stays between 0h 7m 10s and 0h 7m 15s throughout and equals the gap between successive rows. ckpt = yes marks evaluations where the model and a check-point were saved just before the report; these are exactly the evaluations that improved the best perplexity and reset the early-stop count to 0. Check-point tags use a 0-based epoch and the step within that epoch, e.g. "Save check-point at epoch=2 step=11178" belongs to global step 2 x 17411 + 11178 = 46000. Loss and perplexity values below are rounded to six decimal places.

Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
10:26:27  46000  3/16   0.249009  1.282754    0.419818    0/20        yes
10:33:41  46400  3/16   0.242728  1.274721    0.417098    0/20        yes
10:40:56  46800  3/16   0.239781  1.270970    0.414388    0/20        yes
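Each row's dev_loss and perplexity are two views of the same quantity: the reported perplexity is, up to rounding, the exponential of the mean per-token cross-entropy on the dev set (exp(0.249009) is about 1.282754 for the step-46000 row). A minimal Python sketch of that standard relation, not taken from the crisis_transformers source:

    import math

    def perplexity(mean_nll: float) -> float:
        # Perplexity = exp(mean negative log-likelihood per token).
        return math.exp(mean_nll)

    # Step-46000 evaluation: dev_loss = 0.249009 -> perplexity = 1.282754
    assert abs(perplexity(0.249009) - 1.282754) < 1e-4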
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
10:48:11  47200  3/16   0.237533  1.268117    0.411606    0/20        yes
10:55:21  47600  3/16   0.237603  1.268206    0.409048    1/20        no
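The 1/20 at step 47600 is the early-stopping bookkeeping at work: dev perplexity rose from 1.268117 to 1.268206, so no model was saved and the counter incremented; the improvement at step 48000 below resets it to 0. The "Best score (perplexity)" lines are negative because the trainer maximizes a score and therefore stores -perplexity. A sketch of that logic under those assumptions (hypothetical class, not the trainer's actual code):

    class EarlyStopper:
        """Track best score = -perplexity with a patience counter."""

        def __init__(self, patience: int = 20):
            self.patience = patience
            self.best_score = float("-inf")
            self.count = 0

        def step(self, dev_perplexity: float) -> bool:
            """Return True when training should stop early."""
            score = -dev_perplexity      # lower perplexity -> higher score
            if score > self.best_score:
                self.best_score = score  # improved: save model, reset counter
                self.count = 0
            else:
                self.count += 1          # logged as "Early stop count = n/20"
            return self.count >= self.patience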
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
11:02:37  48000  3/16   0.228786  1.257072    0.406322    0/20        yes
11:09:51  48400  3/16   0.224828  1.252107    0.403542    0/20        yes
11:17:07  48800  3/16   0.220629  1.246860    0.400880    0/20        yes
11:24:21  49200  3/16   0.216453  1.241665    0.398479    0/20        yes
11:31:36  49600  3/16   0.212978  1.237358    0.396072    0/20        yes
11:38:51  50000  3/16   0.210213  1.233941    0.393527    0/20        yes
11:46:07  50400  3/16   0.208062  1.231290    0.390969    0/20        yes
11:53:21  50800  3/16   0.204457  1.226859    0.388460    0/20        yes
12:00:36  51200  3/16   0.202162  1.224046    0.385950    0/20        yes
12:07:51  51600  3/16   0.197307  1.218118    0.383424    0/20        yes
12:15:05  52000  3/16   0.195471  1.215883    0.380974    0/20        yes

2020-06-12 12:17:19,046 - crisis_transformers.trainer - INFO - epoch 3 ends, 13 epoches left
2020-06-12 12:17:19,049 - crisis_transformers.trainer - INFO - global_average_loss=1.0984210968017578,global_steps=52233 on training set
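The end-of-epoch lines confirm the step bookkeeping: 17411 optimizer steps per epoch (69642 iterations at input batch size 4, rounded up), so epoch 3 ends at global step 3 x 17411 = 52233, and the next check-point tag "epoch=3 step=167" corresponds to global step 52233 + 167 = 52400. A few arithmetic checks; the ceiling division is an assumption that happens to match the logged numbers:

    import math

    iters_per_epoch = 69642  # iterable-dataset iterations per epoch
    input_batch = 4          # 2 per GPU x 2 GPUs
    steps_per_epoch = math.ceil(iters_per_epoch / input_batch)
    assert steps_per_epoch == 17411

    assert 3 * steps_per_epoch == 52233        # "epoch 3 ends ... global_steps=52233"
    assert 3 * steps_per_epoch + 167 == 52400  # check-point "epoch=3 step=167"
    assert 16 * steps_per_epoch == 278576      # total optimization steps for 16 epochs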
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
12:22:19  52400  4/16   0.193134  1.213046    0.256737    0/20        yes
12:29:34  52800  4/16   0.191371  1.210909    0.259557    0/20        yes
12:36:48  53200  4/16   0.188325  1.207225    0.259544    0/20        yes
12:44:03  53600  4/16   0.184953  1.203161    0.259464    0/20        yes
12:51:18  54000  4/16   0.184105  1.202142    0.258346    0/20        yes
12:58:30  54400  4/16   0.184208  1.202266    0.256291    1/20        no
13:05:42  54800  4/16   0.187595  1.206345    0.254695    2/20        no
13:12:53  55200  4/16   0.184251  1.202318    0.253385    3/20        no
13:20:07  55600  4/16   0.175182  1.191464    0.252160    0/20        yes
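The batch-size line repeated in every report decomposes as per-GPU batch times number of GPUs, with gradient accumulation of 1; this is consistent with (though the log does not confirm) torch.nn.DataParallel splitting each input batch of 4 across the 2 GPUs. Illustrative names:

    per_gpu_batch = 2
    n_gpu = 2
    grad_accum_steps = 1
    input_batch = per_gpu_batch * n_gpu                # = 4, as logged at every eval
    effective_batch = input_batch * grad_accum_steps   # = 4 examples per optimizer step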
(table continued)
Time      Steps  Epoch  dev_loss  perplexity  train_loss  early-stop  ckpt
13:27:22  56000  4/16   0.173272  1.189190    0.251168    0/20        yes
13:34:34  56400  4/16   0.174895  1.191122    0.249526    1/20        no
13:41:49  56800  4/16   0.168055  1.183002    0.248204    0/20        yes
13:49:04  57200  4/16   0.167469  1.182309    0.246763    0/20        yes
13:56:16  57600  4/16   0.168122  1.183081    0.245616    1/20        no
14:03:27  58000  4/16   0.168360  1.183363    0.244472    2/20        no
14:10:42  58400  4/16   0.161927  1.175775    0.243264    0/20        yes
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Steps = 58800/278576 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:17:57,692 - crisis_transformers.trainer - INFO - dev_loss = 0.159098 || dev_eval_scores = {'perplexity': 1.1724531650543213} 2020-06-12 14:17:57,710 - crisis_transformers.trainer - INFO - train_loss = 0.24194355309009552 2020-06-12 14:17:57,710 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:25:09,648 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:25:13,392 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=6967 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1723699569702148 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:25:13,407 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Steps = 59200/278576 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:25:13,408 - crisis_transformers.trainer - INFO - dev_loss = 0.159027 || dev_eval_scores = {'perplexity': 1.1723699569702148} 2020-06-12 14:25:13,426 - crisis_transformers.trainer - INFO - train_loss = 0.24068011343479156 2020-06-12 14:25:13,426 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:32:25,725 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:32:28,988 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=7367 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1677402257919312 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:32:29,004 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Steps = 59600/278576 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:32:29,005 - crisis_transformers.trainer - INFO - dev_loss = 0.155071 || dev_eval_scores = {'perplexity': 1.1677402257919312} 2020-06-12 14:32:29,024 - crisis_transformers.trainer - INFO - train_loss = 0.23944209516048431 2020-06-12 14:32:29,024 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1677402257919312 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Steps = 60000/278576 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - dev_loss = 0.160010 || dev_eval_scores = {'perplexity': 1.1735223531723022} 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - train_loss = 0.23837868869304657 2020-06-12 14:39:41,605 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:46:52,988 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:46:55,997 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8167 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1659613847732544 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:46:56,013 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:46:56,013 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:46:56,014 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Steps = 60400/278576 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:46:56,015 - crisis_transformers.trainer - INFO - dev_loss = 0.153546 || dev_eval_scores = {'perplexity': 1.1659613847732544} 2020-06-12 14:46:56,033 - crisis_transformers.trainer - INFO - train_loss = 0.23715026676654816 2020-06-12 14:46:56,033 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 14:54:07,918 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:54:10,976 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8567 2020-06-12 14:54:10,989 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.165794014930725 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Steps = 60800/278576 2020-06-12 14:54:10,990 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 14:54:10,991 - crisis_transformers.trainer - INFO - dev_loss = 0.153402 || dev_eval_scores = {'perplexity': 1.165794014930725} 2020-06-12 14:54:11,007 - crisis_transformers.trainer - INFO - train_loss = 0.23593905568122864 2020-06-12 14:54:11,007 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:01:22,795 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:01:25,973 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=8967 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.161121129989624 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:01:25,989 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Steps = 61200/278576 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:01:25,990 - crisis_transformers.trainer - INFO - dev_loss = 0.149386 || dev_eval_scores = {'perplexity': 1.161121129989624} 2020-06-12 15:01:26,009 - crisis_transformers.trainer - INFO - train_loss = 0.2347162663936615 2020-06-12 15:01:26,009 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:08:37,941 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:08:41,095 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=9367 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:08:41,110 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.159262776374817 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:08:41,111 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Steps = 61600/278576 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:08:41,112 - crisis_transformers.trainer - INFO - dev_loss = 0.147784 || dev_eval_scores = {'perplexity': 1.159262776374817} 2020-06-12 15:08:41,131 - crisis_transformers.trainer - INFO - train_loss = 0.2334117591381073 2020-06-12 15:08:41,131 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:15:53,144 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.159262776374817 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Steps = 62000/278576 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - dev_loss = 0.149117 || dev_eval_scores = {'perplexity': 1.1608085632324219} 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - train_loss = 0.23231545090675354 2020-06-12 15:15:53,145 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:23:04,214 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:23:07,484 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=10167 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:23:07,500 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1586897373199463 2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:23:07,501 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:23:07,501 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Steps = 62400/278576 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:23:07,502 - crisis_transformers.trainer - INFO - dev_loss = 0.147290 || dev_eval_scores = {'perplexity': 1.1586897373199463} 2020-06-12 15:23:07,520 - crisis_transformers.trainer - INFO - train_loss = 0.23095978796482086 2020-06-12 15:23:07,520 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:30:19,155 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:30:22,774 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=10567 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:30:22,775 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.155853509902954 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:30:22,776 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Steps = 62800/278576 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - dev_loss = 0.144839 || dev_eval_scores = {'perplexity': 1.155853509902954} 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - train_loss = 0.22986294329166412 2020-06-12 15:30:22,777 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:37:34,325 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:37:34,325 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.155853509902954 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Steps = 63200/278576 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - dev_loss = 0.146531 || dev_eval_scores = {'perplexity': 1.1578106880187988} 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - train_loss = 0.2286159247159958 2020-06-12 15:37:34,326 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:44:45,801 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:44:48,841 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=11367 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1531232595443726 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:44:48,857 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:44:48,857 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Steps = 63600/278576 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:44:48,858 - crisis_transformers.trainer - INFO - dev_loss = 0.142474 || dev_eval_scores = {'perplexity': 1.1531232595443726} 2020-06-12 15:44:48,865 - crisis_transformers.trainer - INFO - train_loss = 0.22760188579559326 2020-06-12 15:44:48,865 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:52:00,452 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:52:04,180 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=11767 2020-06-12 15:52:04,195 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1524842977523804 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:52:04,196 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Steps = 64000/278576 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:52:04,197 - crisis_transformers.trainer - INFO - dev_loss = 0.141920 || dev_eval_scores = {'perplexity': 1.1524842977523804} 2020-06-12 15:52:04,203 - crisis_transformers.trainer - INFO - train_loss = 0.22645704448223114 2020-06-12 15:52:04,203 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 15:59:15,356 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:59:18,493 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12167 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 15:59:18,496 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1517306566238403 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 15:59:18,497 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Steps = 64400/278576 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 15:59:18,498 - crisis_transformers.trainer - INFO - dev_loss = 0.141266 || dev_eval_scores = {'perplexity': 1.1517306566238403} 2020-06-12 15:59:18,511 - crisis_transformers.trainer - INFO - train_loss = 0.22518599033355713 2020-06-12 15:59:18,511 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:06:30,098 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:06:33,674 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12567 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1506377458572388 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:06:33,689 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Steps = 64800/278576 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:06:33,690 - crisis_transformers.trainer - INFO - dev_loss = 0.140316 || dev_eval_scores = {'perplexity': 1.1506377458572388} 2020-06-12 16:06:33,705 - crisis_transformers.trainer - INFO - train_loss = 0.2241048812866211 2020-06-12 16:06:33,705 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:13:44,939 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:13:48,305 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=12967 2020-06-12 16:13:48,319 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:13:48,319 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1486502885818481 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:13:48,320 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Steps = 65200/278576 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:13:48,321 - crisis_transformers.trainer - INFO - dev_loss = 0.138588 || dev_eval_scores = {'perplexity': 1.1486502885818481} 2020-06-12 16:13:48,348 - crisis_transformers.trainer - INFO - train_loss = 0.22296540439128876 2020-06-12 16:13:48,348 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:21:01,926 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:21:05,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=13367 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1468722820281982 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:21:05,596 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 17s 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Steps = 65600/278576 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - dev_loss = 0.137039 || dev_eval_scores = {'perplexity': 1.1468722820281982} 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - train_loss = 0.2220388650894165 2020-06-12 16:21:05,597 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:28:16,835 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:28:20,007 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=13767 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1454410552978516 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:28:20,008 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Steps = 66000/278576 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:28:20,009 - crisis_transformers.trainer - INFO - dev_loss = 0.135790 || dev_eval_scores = {'perplexity': 1.1454410552978516} 2020-06-12 16:28:20,010 - crisis_transformers.trainer - INFO - train_loss = 0.22106683254241943 2020-06-12 16:28:20,010 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:35:31,636 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:35:35,306 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=14167 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1435331106185913 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:35:35,322 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Steps = 66400/278576 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:35:35,323 - crisis_transformers.trainer - INFO - dev_loss = 0.134123 || dev_eval_scores = {'perplexity': 1.1435331106185913} 2020-06-12 16:35:35,324 - crisis_transformers.trainer - INFO - train_loss = 0.2200937420129776 2020-06-12 16:35:35,324 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:42:46,589 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1435331106185913 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Steps = 66800/278576 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - dev_loss = 0.139419 || dev_eval_scores = {'perplexity': 1.1496059894561768} 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - train_loss = 0.21907271444797516 2020-06-12 16:42:46,590 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:49:57,796 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:50:01,478 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=14967 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1425509452819824 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:50:01,491 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:50:01,491 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Steps = 67200/278576 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:50:01,492 - crisis_transformers.trainer - INFO - dev_loss = 0.133263 || dev_eval_scores = {'perplexity': 1.1425509452819824} 2020-06-12 16:50:01,500 - crisis_transformers.trainer - INFO - train_loss = 0.2180401235818863 2020-06-12 16:50:01,500 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 16:57:12,820 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:57:15,939 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=15367 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1421887874603271 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 16:57:15,940 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Steps = 67600/278576 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - dev_loss = 0.132946 || dev_eval_scores = {'perplexity': 1.1421887874603271} 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - train_loss = 0.2170739322900772 2020-06-12 16:57:15,941 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:04:27,191 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1421887874603271 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Steps = 68000/278576 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - dev_loss = 0.133038 || dev_eval_scores = {'perplexity': 1.1422935724258423} 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - train_loss = 0.21610437333583832 2020-06-12 17:04:27,192 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:11:38,816 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:11:41,945 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=16167 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:11:41,958 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:11:41,958 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Steps = 68400/278576 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:11:41,959 - crisis_transformers.trainer - INFO - dev_loss = 0.130158 || dev_eval_scores = {'perplexity': 1.1390084028244019} 2020-06-12 17:11:41,975 - crisis_transformers.trainer - INFO - train_loss = 0.21520239114761353 2020-06-12 17:11:41,975 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:18:54,648 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:18:54,648 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:18:54,649 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Steps = 68800/278576 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - dev_loss = 0.130209 || dev_eval_scores = {'perplexity': 1.1390659809112549} 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - train_loss = 0.21429920196533203 2020-06-12 17:18:54,650 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:26:06,245 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Early stop count = 2/20 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1390084028244019 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Epoch = 4/16 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Steps = 69200/278576 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - dev_loss = 0.131858 || dev_eval_scores = {'perplexity': 1.1409462690353394} 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - train_loss = 0.2133215069770813 2020-06-12 17:26:06,248 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 17:33:18,361 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:33:21,478 - crisis_transformers.trainer - INFO - Save check-point at epoch=3 step=17367 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462 2020-06-12 17:33:21,481 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 17:33:21,481 - 
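The "count" and "best perplexity" columns follow a standard patience-based early-stopping scheme: perplexity is negated so that a single higher-is-better comparison applies, the counter resets (and the model plus a check-point are saved) whenever the best score improves, and training would stop once the counter reached 20. A minimal sketch of that bookkeeping, with illustrative names (not the trainer's actual implementation):

    class EarlyStopper:
        def __init__(self, patience: int = 20):
            self.patience = patience
            self.best = float("-inf")  # no score seen yet
            self.count = 0

        def update(self, perplexity: float) -> bool:
            """Return True if this eval improved the best score
            (the trainer then saves the model and a check-point)."""
            score = -perplexity        # negate so higher is better
            if score > self.best:
                self.best = score
                self.count = 0         # "count = 0/20" in the table above
                return True
            self.count += 1            # "count = 1/20", "2/20", ...
            return False

        @property
        def should_stop(self) -> bool:
            return self.count >= self.patience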
2020-06-12 17:33:46,659 - crisis_transformers.trainer - INFO - epoch 4 ends, 12 epochs left
2020-06-12 17:33:46,661 - crisis_transformers.trainer - INFO - global_average_loss=0.8768848776817322, global_steps=69644 on training set
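The step counts at this boundary are internally consistent: 69642 training iterations per epoch at an input batch size of 4 (2 per GPU x 2 GPUs) give ceil(69642 / 4) = 17411 optimization steps per epoch, 16 x 17411 = 278576 total steps, and 4 x 17411 = 69644 global steps at the end of epoch 4, exactly the global_steps logged above. The drop in train_loss from about 0.212 to about 0.162 across the boundary is consistent with the running training average being reset each epoch. A quick check (variable names are illustrative):

    import math

    train_iterations_per_epoch = 69642   # as reported for the Iterable Dataset
    input_batch_size = 2 * 2             # 2 per GPU x 2 GPUs
    epochs = 16

    steps_per_epoch = math.ceil(train_iterations_per_epoch / input_batch_size)
    print(steps_per_epoch)               # 17411
    print(epochs * steps_per_epoch)      # 278576, the "Steps = .../278576" denominator
    print(4 * steps_per_epoch)           # 69644, matching "global_steps=69644"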
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Steps = 70000/278576
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:40:33,522 - crisis_transformers.trainer - INFO - dev_loss = 0.129323 || dev_eval_scores = {'perplexity': 1.138058066368103}
2020-06-12 17:40:33,523 - crisis_transformers.trainer - INFO - train_loss = 0.16184912621974945
2020-06-12 17:40:33,523 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.135805368423462
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Steps = 70400/278576
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - dev_loss = 0.127871 || dev_eval_scores = {'perplexity': 1.1364060640335083}
2020-06-12 17:47:45,699 - crisis_transformers.trainer - INFO - train_loss = 0.16288244724273682
2020-06-12 17:47:45,700 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 17:54:57,738 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:55:00,860 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1156
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 17:55:00,875 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1350326538085938
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 17:55:00,876 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Steps = 70800/278576
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 17:55:00,877 - crisis_transformers.trainer - INFO - dev_loss = 0.126661 || dev_eval_scores = {'perplexity': 1.1350326538085938}
2020-06-12 17:55:00,894 - crisis_transformers.trainer - INFO - train_loss = 0.16178201138973236
2020-06-12 17:55:00,894 - crisis_transformers.trainer - INFO - ********************************************
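
This report is also a convenient place to check the metric itself: the dev_loss/perplexity pairs are consistent with perplexity = exp(dev_loss), and "Best score (perplexity)" is stored negated so that a larger score always means a better model, which is why every best score in this log is negative. A short sketch of that convention, not the trainer's actual code:

    import math

    dev_loss = 0.126661                # from the report at step 70800 above
    perplexity = math.exp(dev_loss)    # -> 1.1350326..., matching dev_eval_scores
    best_score = -perplexity           # -> -1.1350326..., matching "Best score (perplexity)"
    print(perplexity, best_score)
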
2020-06-12 18:02:13,264 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:02:16,346 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1556
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:02:16,360 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1342060565948486
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:02:16,361 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - Steps = 71200/278576
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:02:16,362 - crisis_transformers.trainer - INFO - dev_loss = 0.125933 || dev_eval_scores = {'perplexity': 1.1342060565948486}
2020-06-12 18:02:16,378 - crisis_transformers.trainer - INFO - train_loss = 0.16210666298866272
2020-06-12 18:02:16,378 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:09:28,443 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:09:31,521 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=1956
2020-06-12 18:09:31,536 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:09:31,536 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.133441686630249
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:09:31,537 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Steps = 71600/278576
2020-06-12 18:09:31,538 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:09:31,539 - crisis_transformers.trainer - INFO - dev_loss = 0.125259 || dev_eval_scores = {'perplexity': 1.133441686630249}
2020-06-12 18:09:31,556 - crisis_transformers.trainer - INFO - train_loss = 0.16150160133838654
2020-06-12 18:09:31,556 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:16:44,613 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:16:47,595 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=2356
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1324461698532104
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:16:47,596 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Steps = 72000/278576
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - dev_loss = 0.124380 || dev_eval_scores = {'perplexity': 1.1324461698532104}
2020-06-12 18:16:47,597 - crisis_transformers.trainer - INFO - train_loss = 0.160808727145195
2020-06-12 18:16:47,598 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:23:59,557 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:24:02,688 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=2756
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1319714784622192
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:24:02,703 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Steps = 72400/278576
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:24:02,704 - crisis_transformers.trainer - INFO - dev_loss = 0.123961 || dev_eval_scores = {'perplexity': 1.1319714784622192}
2020-06-12 18:24:02,723 - crisis_transformers.trainer - INFO - train_loss = 0.16037367284297943
2020-06-12 18:24:02,723 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:31:14,824 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:31:17,933 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=3156
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:31:17,949 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.131042718887329
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Steps = 72800/278576
2020-06-12 18:31:17,950 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:31:17,951 - crisis_transformers.trainer - INFO - dev_loss = 0.123140 || dev_eval_scores = {'perplexity': 1.131042718887329}
2020-06-12 18:31:17,969 - crisis_transformers.trainer - INFO - train_loss = 0.16030311584472656
2020-06-12 18:31:17,970 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:38:30,048 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:38:33,201 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=3556
2020-06-12 18:38:33,201 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 18:38:33,202 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - Steps = 73200/278576
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - dev_loss = 0.122653 || dev_eval_scores = {'perplexity': 1.130492091178894}
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - train_loss = 0.16008983552455902
2020-06-12 18:38:33,203 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:45:45,223 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Steps = 73600/278576
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - dev_loss = 0.126517 || dev_eval_scores = {'perplexity': 1.1348693370819092}
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - train_loss = 0.15986071527004242
2020-06-12 18:45:45,224 - crisis_transformers.trainer - INFO - ********************************************
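
The "Early stop count = k/20" lines around here show the patience counter at work: it resets to 0 whenever the negated perplexity improves (a check-point is saved at the same time) and increments otherwise; training would stop once it reached 20. A minimal sketch of that bookkeeping under those assumptions; the class and its names are illustrative, not taken from the trainer source:

    class EarlyStopper:
        def __init__(self, patience: int = 20):
            self.patience = patience
            self.best = float("-inf")   # the first evaluation always becomes the best
            self.count = 0

        def update(self, score: float) -> bool:
            """Feed one evaluation score; return True when training should stop."""
            if score > self.best:
                self.best = score       # improvement: remember it, reset the counter
                self.count = 0
            else:
                self.count += 1         # no improvement: burn one unit of patience
            return self.count >= self.patience

    stopper = EarlyStopper()
    for ppl in (1.130492, 1.134869, 1.131366):  # the reports at steps 73200, 73600, 74000
        stopper.update(-ppl)
    print(stopper.count)                        # -> 2, cf. "Early stop count = 2/20" below
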
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.130492091178894
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Steps = 74000/278576
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - dev_loss = 0.123426 || dev_eval_scores = {'perplexity': 1.1313656568527222}
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - train_loss = 0.15936923027038574
2020-06-12 18:52:56,441 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:00:08,504 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:00:11,867 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=4756
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1283477544784546
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:00:11,881 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Steps = 74400/278576
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:00:11,882 - crisis_transformers.trainer - INFO - dev_loss = 0.120754 || dev_eval_scores = {'perplexity': 1.1283477544784546}
2020-06-12 19:00:11,898 - crisis_transformers.trainer - INFO - train_loss = 0.15900550782680511
2020-06-12 19:00:11,898 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1283477544784546
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Steps = 74800/278576
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:07:24,081 - crisis_transformers.trainer - INFO - dev_loss = 0.122012 || dev_eval_scores = {'perplexity': 1.1297677755355835}
2020-06-12 19:07:24,082 - crisis_transformers.trainer - INFO - train_loss = 0.15857523679733276
2020-06-12 19:07:24,082 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:14:35,509 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:14:38,680 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=5556
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:14:38,690 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1280550956726074
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:14:38,691 - crisis_transformers.trainer - INFO - Steps = 75200/278576
2020-06-12 19:14:38,692 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:14:38,692 - crisis_transformers.trainer - INFO - dev_loss = 0.120495 || dev_eval_scores = {'perplexity': 1.1280550956726074}
2020-06-12 19:14:38,709 - crisis_transformers.trainer - INFO - train_loss = 0.15801657736301422
2020-06-12 19:14:38,709 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:21:51,621 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:21:51,621 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1280550956726074
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Steps = 75600/278576
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:21:51,622 - crisis_transformers.trainer - INFO - dev_loss = 0.120551 || dev_eval_scores = {'perplexity': 1.1281187534332275}
2020-06-12 19:21:51,623 - crisis_transformers.trainer - INFO - train_loss = 0.15759296715259552
2020-06-12 19:21:51,623 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:29:02,897 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:29:06,248 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=6356
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.126711368560791
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:29:06,257 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:29:06,258 - crisis_transformers.trainer - INFO - Steps = 76000/278576
2020-06-12 19:29:06,259 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:29:06,259 - crisis_transformers.trainer - INFO - dev_loss = 0.119303 || dev_eval_scores = {'perplexity': 1.126711368560791}
2020-06-12 19:29:06,271 - crisis_transformers.trainer - INFO - train_loss = 0.15713545680046082
2020-06-12 19:29:06,271 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:36:18,058 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:36:21,263 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=6756
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:36:21,270 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1253490447998047
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:36:21,271 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Steps = 76400/278576
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:36:21,272 - crisis_transformers.trainer - INFO - dev_loss = 0.118093 || dev_eval_scores = {'perplexity': 1.1253490447998047}
2020-06-12 19:36:21,285 - crisis_transformers.trainer - INFO - train_loss = 0.15685245394706726
2020-06-12 19:36:21,285 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:43:33,123 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:43:36,558 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=7156
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1248881816864014
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:43:36,573 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Steps = 76800/278576
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:43:36,574 - crisis_transformers.trainer - INFO - dev_loss = 0.117684 || dev_eval_scores = {'perplexity': 1.1248881816864014}
2020-06-12 19:43:36,592 - crisis_transformers.trainer - INFO - train_loss = 0.15634950995445251
2020-06-12 19:43:36,592 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1248881816864014
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Steps = 77200/278576
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:50:48,591 - crisis_transformers.trainer - INFO - dev_loss = 0.117770 || dev_eval_scores = {'perplexity': 1.1249850988388062}
2020-06-12 19:50:48,592 - crisis_transformers.trainer - INFO - train_loss = 0.1558857262134552
2020-06-12 19:50:48,592 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 19:58:00,383 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:58:03,711 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=7956
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1242866516113281
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 19:58:03,725 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Steps = 77600/278576
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 19:58:03,726 - crisis_transformers.trainer - INFO - dev_loss = 0.117149 || dev_eval_scores = {'perplexity': 1.1242866516113281}
2020-06-12 19:58:03,742 - crisis_transformers.trainer - INFO - train_loss = 0.15545228123664856
2020-06-12 19:58:03,742 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:05:15,952 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:05:19,295 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=8356
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1240224838256836
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:05:19,307 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Steps = 78000/278576
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:05:19,308 - crisis_transformers.trainer - INFO - dev_loss = 0.116914 || dev_eval_scores = {'perplexity': 1.1240224838256836}
2020-06-12 20:05:19,322 - crisis_transformers.trainer - INFO - train_loss = 0.15514682233333588
2020-06-12 20:05:19,322 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:12:30,593 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:12:33,718 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=8756
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1232761144638062
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:12:33,729 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Steps = 78400/278576
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:12:33,730 - crisis_transformers.trainer - INFO - dev_loss = 0.116250 || dev_eval_scores = {'perplexity': 1.1232761144638062}
2020-06-12 20:12:33,744 - crisis_transformers.trainer - INFO - train_loss = 0.15461046993732452
2020-06-12 20:12:33,744 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:19:45,605 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1232761144638062
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Steps = 78800/278576
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - dev_loss = 0.117339 || dev_eval_scores = {'perplexity': 1.12450110912323}
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - train_loss = 0.15414145588874817
2020-06-12 20:19:45,606 - crisis_transformers.trainer - INFO - ********************************************
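
Reports this regular are easy to post-process; pulling the (global step, dev_loss, perplexity) triples out of the raw log is enough to plot the training curve. A sketch that assumes the record format stays exactly as above (the file name is a placeholder):

    import re

    STEP = re.compile(r"Steps = (\d+)/\d+")
    LOSS = re.compile(r"dev_loss = ([\d.]+) \|\| dev_eval_scores = \{'perplexity': ([\d.]+)\}")

    def eval_points(lines):
        step = None
        for line in lines:
            if m := STEP.search(line):
                step = int(m.group(1))               # "Steps = X/..." precedes dev_loss in each report
            elif (m := LOSS.search(line)) and step is not None:
                yield step, float(m.group(1)), float(m.group(2))

    # e.g. list(eval_points(open("training.log")))
    # -> [(69600, 0.127342, 1.135805368423462), (70000, 0.129323, 1.138058066368103), ...]
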
2020-06-12 20:26:57,592 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:27:00,603 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=9556
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1222569942474365
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:27:00,617 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Steps = 79200/278576
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:27:00,618 - crisis_transformers.trainer - INFO - dev_loss = 0.115342 || dev_eval_scores = {'perplexity': 1.1222569942474365}
2020-06-12 20:27:00,635 - crisis_transformers.trainer - INFO - train_loss = 0.15376870334148407
2020-06-12 20:27:00,635 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:34:13,086 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=9956
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:34:16,742 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Steps = 79600/278576
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:34:16,743 - crisis_transformers.trainer - INFO - dev_loss = 0.114636 || dev_eval_scores = {'perplexity': 1.1214654445648193}
2020-06-12 20:34:16,745 - crisis_transformers.trainer - INFO - train_loss = 0.1534395068883896
2020-06-12 20:34:16,745 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Steps = 80000/278576
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - dev_loss = 0.114919 || dev_eval_scores = {'perplexity': 1.1217821836471558}
2020-06-12 20:41:28,720 - crisis_transformers.trainer - INFO - train_loss = 0.15301528573036194
2020-06-12 20:41:28,721 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:48:40,263 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Early stop count = 2/20
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1214654445648193
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Steps = 80400/278576
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - dev_loss = 0.116439 || dev_eval_scores = {'perplexity': 1.123489499092102}
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - train_loss = 0.1526806503534317
2020-06-12 20:48:40,265 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 20:55:51,280 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:55:54,387 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=11156
2020-06-12 20:55:54,402 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1208057403564453
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 20:55:54,403 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Steps = 80800/278576
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 20:55:54,404 - crisis_transformers.trainer - INFO - dev_loss = 0.114048 || dev_eval_scores = {'perplexity': 1.1208057403564453}
2020-06-12 20:55:54,422 - crisis_transformers.trainer - INFO - train_loss = 0.15231111645698547
2020-06-12 20:55:54,422 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1208057403564453
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s
2020-06-12 21:03:06,140 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - Steps = 81200/278576
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - dev_loss = 0.114643 || dev_eval_scores = {'perplexity': 1.1214734315872192}
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - train_loss = 0.15194369852542877
2020-06-12 21:03:06,141 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:10:17,767 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:10:20,890 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=11956
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1198288202285767
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:10:20,897 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Steps = 81600/278576
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:10:20,898 - crisis_transformers.trainer - INFO - dev_loss = 0.113176 || dev_eval_scores = {'perplexity': 1.1198288202285767}
2020-06-12 21:10:20,900 - crisis_transformers.trainer - INFO - train_loss = 0.15155275166034698
2020-06-12 21:10:20,900 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:17:33,024 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:17:36,142 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=12356
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Early stop count = 0/20
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:17:36,143 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Steps = 82000/278576
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:17:36,144 - crisis_transformers.trainer - INFO - dev_loss = 0.112563 || dev_eval_scores = {'perplexity': 1.1191425323486328}
2020-06-12 21:17:36,151 - crisis_transformers.trainer - INFO - train_loss = 0.15120427310466766
2020-06-12 21:17:36,151 - crisis_transformers.trainer - INFO - ********************************************
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - ***** Evaluation report *****
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Early stop on: perplexity
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Early stop count = 1/20
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400)
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Epoch = 5/16
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Steps = 82400/278576
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - dev_loss = 0.113121 || dev_eval_scores = {'perplexity': 1.1197669506072998}
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - train_loss = 0.1508565992116928
2020-06-12 21:24:48,557 - crisis_transformers.trainer - INFO - ********************************************
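
The cadence is equally steady: 400 optimization steps at an input batch size of 4 arrive every roughly 7m 11s to 7m 16s, and each interval also contains a full development pass (7738 iterations) plus the occasional check-point save, so the figures below are lower bounds on the pure training rate. Back-of-the-envelope only:

    interval_s = 7 * 60 + 12            # ~432 s between evaluation reports
    steps, batch = 400, 4
    print(steps / interval_s)           # ~0.93 optimization steps per second
    print(steps * batch / interval_s)   # ~3.7 training sequences per second
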
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 12s 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Steps = 82800/278576 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - dev_loss = 0.114633 || dev_eval_scores = {'perplexity': 1.1214618682861328} 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - train_loss = 0.15052178502082825 2020-06-12 21:32:00,894 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Early stop count = 3/20 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1191425323486328 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 11s 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Steps = 83200/278576 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - dev_loss = 0.112780 || dev_eval_scores = {'perplexity': 1.1193852424621582} 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - train_loss = 0.15016387403011322 2020-06-12 21:39:11,977 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 21:46:23,357 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:46:26,472 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=13956 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1182266473770142 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 21:46:26,487 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 21:46:26,487 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Steps = 83600/278576 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:46:26,489 - crisis_transformers.trainer - INFO - dev_loss = 0.111744 || dev_eval_scores = {'perplexity': 1.1182266473770142} 2020-06-12 21:46:26,507 - crisis_transformers.trainer - INFO - train_loss = 0.14978548884391785 2020-06-12 21:46:26,507 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 21:53:38,893 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:53:42,051 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=14356 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 21:53:42,053 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1176480054855347 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Steps = 84000/278576 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 21:53:42,054 - crisis_transformers.trainer - INFO - dev_loss = 0.111226 || dev_eval_scores = {'perplexity': 1.1176480054855347} 2020-06-12 21:53:42,056 - crisis_transformers.trainer - INFO - train_loss = 0.1494358777999878 2020-06-12 21:53:42,056 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:00:52,990 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:00:56,106 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=14756 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:00:56,119 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1175894737243652 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:00:56,120 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Steps = 84400/278576 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:00:56,121 - crisis_transformers.trainer - INFO - dev_loss = 0.111174 || dev_eval_scores = {'perplexity': 1.1175894737243652} 2020-06-12 22:00:56,136 - crisis_transformers.trainer - INFO - train_loss = 0.14908182621002197 2020-06-12 22:00:56,136 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:08:09,602 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1175894737243652 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 13s 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Steps = 84800/278576 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:08:09,603 - crisis_transformers.trainer - INFO - dev_loss = 0.111978 || dev_eval_scores = {'perplexity': 1.1184886693954468} 2020-06-12 22:08:09,604 - crisis_transformers.trainer - INFO - train_loss = 0.1487855762243271 2020-06-12 22:08:09,604 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:15:29,618 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:15:33,237 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=15556 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1161783933639526 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:15:33,238 - 
crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:15:33,238 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 23s 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Steps = 85200/278576 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:15:33,240 - crisis_transformers.trainer - INFO - dev_loss = 0.109911 || dev_eval_scores = {'perplexity': 1.1161783933639526} 2020-06-12 22:15:33,242 - crisis_transformers.trainer - INFO - train_loss = 0.14848528802394867 2020-06-12 22:15:33,242 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:22:49,305 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Early stop count = 1/20 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1161783933639526 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 16s 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Steps = 85600/278576 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - dev_loss = 0.110636 || dev_eval_scores = {'perplexity': 1.1169886589050293} 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - train_loss = 0.14812599122524261 2020-06-12 22:22:49,306 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:30:01,038 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:30:04,724 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=16356 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:30:04,730 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1152490377426147 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Num of development examples (actually no. 
of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 15s 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Steps = 86000/278576 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:30:04,731 - crisis_transformers.trainer - INFO - dev_loss = 0.109078 || dev_eval_scores = {'perplexity': 1.1152490377426147} 2020-06-12 22:30:04,747 - crisis_transformers.trainer - INFO - train_loss = 0.14784550666809082 2020-06-12 22:30:04,747 - crisis_transformers.trainer - INFO - ******************************************** 2020-06-12 22:37:16,018 - crisis_transformers.trainer - INFO - Save model to tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Save check-point at epoch=4 step=16756 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - ***** Evaluation report ***** 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Output path (short): tmp/gpt2_medium_for_source_code_code_generate 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Early stop on: perplexity 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Early stop count = 0/20 2020-06-12 22:37:19,119 - crisis_transformers.trainer - INFO - Eval steps = 400 or (iterations = 400) 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Best score (perplexity) = -1.1139332056045532 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Gradient Accumulation steps = 1 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Num of training examples (actually no. of iterations per epoch for Iterable Dataset) = 69642 2020-06-12 22:37:19,120 - crisis_transformers.trainer - INFO - Num of development examples (actually no. of iterations per epoch for Iterable Dataset) = 7738 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Time spent since last evaluation = 0h 7m 14s 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Epoch = 5/16 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Steps = 86400/278576 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - Instantaneous batch size per GPU = 2 and n_gpu = 2 so the input batch size = 4 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - dev_loss = 0.107897 || dev_eval_scores = {'perplexity': 1.1139332056045532} 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - train_loss = 0.14752434194087982 2020-06-12 22:37:19,121 - crisis_transformers.trainer - INFO - ********************************************
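A note on the numbers above: the logged perplexity is just the exponential of the logged dev loss, so each `dev_loss || dev_eval_scores` pair is internally consistent. A quick sanity check in plain Python, using values copied from the log (dev_loss is rounded to six decimals in the log, so the match is good to roughly that precision):

```python
import math

# dev_loss values copied from the evaluation reports above; exp(loss)
# should reproduce the logged {'perplexity': ...} values.
for dev_loss, logged_ppl in [
    (0.113176, 1.1198288202285767),   # step 81600
    (0.111744, 1.1182266473770142),   # step 83600
    (0.107897, 1.1139332056045532),   # step 86400
]:
    print(f"exp({dev_loss}) = {math.exp(dev_loss):.6f} (logged: {logged_ppl})")
```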
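The bookkeeping also reads consistently once the lines are grouped: "Best score (perplexity)" is stored negated so the trainer can uniformly maximize its early-stop metric, "Save model" / "Save check-point" lines appear only on an improvement, and "Early stop count" otherwise ticks toward 20 and resets to 0 whenever a new best arrives (the zero-based "epoch=4" in the checkpoint lines corresponds to the one-based "Epoch = 5/16" in the reports). A minimal sketch of that loop, assuming nothing about the actual crisis_transformers internals; the class and all names below are hypothetical:

```python
class EarlyStopper:
    """Hypothetical sketch of the save/early-stop pattern visible in the log."""

    def __init__(self, patience: int = 20):
        self.best_score = float("-inf")  # "Best score (perplexity) = -inf" before the first eval
        self.count = 0                   # "Early stop count = k/20"
        self.patience = patience

    def step(self, perplexity: float, save_checkpoint) -> bool:
        score = -perplexity                # negate: lower perplexity -> higher score
        if score > self.best_score:        # new best, hence the negative "Best score" lines
            self.best_score = score
            self.count = 0                 # count resets after every improvement
            save_checkpoint()              # "Save model ..." / "Save check-point ..." only here
        else:
            self.count += 1                # no save; 1/20, 2/20, 3/20 as at steps 82400-83200
        return self.count < self.patience  # False would end training early
```

Feeding the dev perplexities above through `step()` reproduces the logged pattern: saves at steps 82000, 83600, 84000, 84400, 85200, 86000 and 86400, and counts of 1-3 at the evaluations in between.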
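The cadence is steady enough for a back-of-envelope finish time: each 400-step block takes roughly 7m 15s, and the last report above is at step 86400 of 278576. A rough estimate (hypothetical arithmetic, not a logged figure):

```python
# Rough ETA from the logged cadence; all inputs are read off the log above.
steps_done, steps_total = 86400, 278576
seconds_per_400_steps = 7 * 60 + 15          # "Time spent since last evaluation" hovers around 7m 15s
remaining_blocks = (steps_total - steps_done) / 400
hours_left = remaining_blocks * seconds_per_400_steps / 3600
print(f"~{hours_left:.0f} hours of training remain")  # ~58 hours at this pace
```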
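Finally, with one log entry per line the dev curve is easy to scrape for plotting. A hypothetical helper (`read_dev_curve` is not part of the trainer; the regexes simply match the report format above, where each "Steps = ..." line precedes its "dev_eval_scores" line):

```python
import re

STEP_RE = re.compile(r"Steps = (\d+)/\d+")                             # e.g. "Steps = 86400/278576"
PPL_RE = re.compile(r"dev_eval_scores = \{'perplexity': ([0-9.]+)\}")  # the report's perplexity line

def read_dev_curve(path):
    """Return [(global_step, dev_perplexity), ...] scraped from a trainer log file."""
    steps, ppls = [], []
    with open(path) as f:
        for line in f:
            if (m := STEP_RE.search(line)):
                steps.append(int(m.group(1)))
            elif (m := PPL_RE.search(line)):
                ppls.append(float(m.group(1)))
    return list(zip(steps, ppls))  # e.g. [(81600, 1.1198...), (82000, 1.1191...), ...]
```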