diff --git "a/wandb/run-20220228_093705-yn2gmwrw/files/output.log" "b/wandb/run-20220228_093705-yn2gmwrw/files/output.log" --- "a/wandb/run-20220228_093705-yn2gmwrw/files/output.log" +++ "b/wandb/run-20220228_093705-yn2gmwrw/files/output.log" @@ -4781,3 +4781,716 @@ 02/28/2022 11:44:57 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow [INFO|feature_extraction_utils.py:324] 2022-02-28 11:45:02,510 >> Configuration saved in ./checkpoint-1500/preprocessor_config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|feature_extraction_utils.py:324] 2022-02-28 11:45:02,510 >> Configuration saved in ./checkpoint-1500/preprocessor_config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[INFO|feature_extraction_utils.py:324] 2022-02-28 11:45:02,510 >> Configuration saved in ./checkpoint-1500/preprocessor_config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[INFO|feature_extraction_utils.py:324] 2022-02-28 11:45:02,510 >> Configuration saved in ./checkpoint-1500/preprocessor_config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|█████████████████████████████████████████████████████████████▍ | 1501/1784 [2:10:02<26:22:32, 335.52s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|█████████████████████████████████████████████████████████████▍ | 1501/1784 [2:10:02<26:22:32, 335.52s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2198, 'learning_rate': 2.2352024922118382e-06, 'epoch': 0.84} + 84%|█████████████████████████████████████████████████████████████▍ | 1502/1784 [2:10:05<18:29:21, 236.03s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|█████████████████████████████████████████████████████████████▍ | 1502/1784 [2:10:05<18:29:21, 236.03s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1961, 'learning_rate': 2.2274143302180688e-06, 'epoch': 0.84} + 84%|█████████████████████████████████████████████████████████████▌ | 1503/1784 [2:10:09<12:59:12, 166.38s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|█████████████████████████████████████████████████████████████▌ | 1503/1784 [2:10:09<12:59:12, 166.38s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1441, 'learning_rate': 2.2196261682242994e-06, 'epoch': 0.84} + 84%|██████████████████████████████████████████████████████████████▍ | 1504/1784 [2:10:13<9:08:51, 117.61s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|██████████████████████████████████████████████████████████████▍ | 1504/1784 [2:10:13<9:08:51, 117.61s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9946, 'learning_rate': 2.21183800623053e-06, 
'epoch': 0.84} + 84%|███████████████████████████████████████████████████████████████▎ | 1505/1784 [2:10:17<6:28:12, 83.49s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|███████████████████████████████████████████████████████████████▎ | 1505/1784 [2:10:17<6:28:12, 83.49s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.241, 'learning_rate': 2.20404984423676e-06, 'epoch': 0.84} + 84%|███████████████████████████████████████████████████████████████▎ | 1506/1784 [2:10:21<4:35:59, 59.57s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|███████████████████████████████████████████████████████████████▎ | 1506/1784 [2:10:21<4:35:59, 59.57s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1064, 'learning_rate': 2.1962616822429906e-06, 'epoch': 0.84} + 84%|███████████████████████████████████████████████████████████████▎ | 1507/1784 [2:10:24<3:17:41, 42.82s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 84%|███████████████████████████████████████████████████████████████▎ | 1507/1784 [2:10:24<3:17:41, 42.82s/it]config.jsonrations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:47:35,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:47:35,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1023, 'learning_rate': 2.1806853582554518e-06, 'epoch': 0.85} + 85%|███████████████████████████████████████████████████████████████▍ | 1509/1784 [2:10:32<1:44:58, 22.91s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|███████████████████████████████████████████████████████████████▍ | 1509/1784 [2:10:32<1:44:58, 22.91s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1471, 'learning_rate': 2.1728971962616823e-06, 'epoch': 0.85} + 85%|███████████████████████████████████████████████████████████████▍ | 1510/1784 [2:10:36<1:18:29, 17.19s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|███████████████████████████████████████████████████████████████▍ | 1510/1784 [2:10:36<1:18:29, 17.19s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1109, 'learning_rate': 2.165109034267913e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▏ | 1511/1784 [2:10:40<59:54, 13.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▏ | 
1511/1784 [2:10:40<59:54, 13.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.933, 'learning_rate': 2.1573208722741435e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▎ | 1512/1784 [2:10:43<46:37, 10.29s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▎ | 1512/1784 [2:10:43<46:37, 10.29s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2576, 'learning_rate': 2.149532710280374e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▎ | 1512/1784 [2:10:43<46:37, 10.29s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▎ | 1513/1784 [2:10:47<37:20, 8.27s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▎ | 1513/1784 [2:10:47<37:20, 8.27s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:47:57,665 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-02-28 11:47:57,665 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:47:57,665 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1515/1784 [2:10:54<26:19, 5.87s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1515/1784 [2:10:54<26:19, 5.87s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1515/1784 [2:10:54<26:19, 5.87s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1516/1784 [2:10:57<23:06, 5.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1516/1784 [2:10:57<23:06, 5.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1517/1784 
[2:11:01<20:54, 4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▍ | 1517/1784 [2:11:01<20:54, 4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▌ | 1518/1784 [2:11:05<19:15, 4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▌ | 1518/1784 [2:11:05<19:15, 4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2241, 'learning_rate': 2.102803738317757e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▌ | 1519/1784 [2:11:08<18:07, 4.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▌ | 1519/1784 [2:11:08<18:07, 4.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3095, 'learning_rate': 2.0950155763239876e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▌ | 1520/1784 [2:11:12<17:28, 3.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
85%|█████████████████████████████████████████████████████████████████▌ | 1520/1784 [2:11:12<17:28, 3.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:22,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:22,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0323, 'learning_rate': 2.0794392523364487e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▋ | 1522/1784 [2:11:19<16:04, 3.68s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▋ | 1522/1784 [2:11:19<16:04, 3.68s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0204, 'learning_rate': 2.0716510903426793e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▋ | 1523/1784 [2:11:22<15:37, 3.59s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▋ | 1523/1784 [2:11:22<15:37, 3.59s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:32,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:32,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0068, 'learning_rate': 2.0560747663551404e-06, 'epoch': 0.85} + 85%|█████████████████████████████████████████████████████████████████▊ | 1525/1784 [2:11:29<15:09, 3.51s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 85%|█████████████████████████████████████████████████████████████████▊ | 1525/1784 [2:11:29<15:09, 3.51s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1383, 'learning_rate': 2.048286604361371e-06, 'epoch': 0.85} + 86%|█████████████████████████████████████████████████████████████████▊ | 1526/1784 [2:11:32<14:54, 3.47s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|█████████████████████████████████████████████████████████████████▊ | 1526/1784 [2:11:32<14:54, 3.47s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:42,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:42,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0699, 'learning_rate': 2.0327102803738317e-06, 'epoch': 0.86} + 86%|█████████████████████████████████████████████████████████████████▉ | 1528/1784 [2:11:39<14:33, 3.41s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|█████████████████████████████████████████████████████████████████▉ | 1528/1784 [2:11:39<14:33, 3.41s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1097, 'learning_rate': 2.0249221183800623e-06, 'epoch': 0.86} + 86%|█████████████████████████████████████████████████████████████████▉ | 1529/1784 [2:11:42<14:30, 3.41s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|█████████████████████████████████████████████████████████████████▉ | 1529/1784 [2:11:42<14:30, 3.41s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:52,957 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:48:52,957 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9695, 'learning_rate': 2.0093457943925234e-06, 'epoch': 0.86} + 86%|██████████████████████████████████████████████████████████████████ | 1531/1784 [2:11:49<14:06, 3.35s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████ | 1531/1784 [2:11:49<14:06, 3.35s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9809, 'learning_rate': 2.001557632398754e-06, 'epoch': 0.86} + 86%|██████████████████████████████████████████████████████████████████ | 1531/1784 [2:11:49<14:06, 3.35s/it]g-point operations will not be computed-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████ | 1532/1784 [2:11:52<13:51, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:01,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▏ | 1533/1784 [2:11:55<13:36, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:01,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▏ | 1533/1784 [2:11:55<13:36, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:01,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9909, 
'learning_rate': 1.985981308411215e-06, 'epoch': 0.86} + 86%|██████████████████████████████████████████████████████████████████▏ | 1534/1784 [2:11:58<13:20, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▏ | 1534/1784 [2:11:58<13:20, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▎ | 1535/1784 [2:12:01<13:04, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▎ | 1535/1784 [2:12:01<13:04, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0765, 'learning_rate': 1.9704049844236762e-06, 'epoch': 0.86} + 86%|██████████████████████████████████████████████████████████████████▎ | 1536/1784 [2:12:04<12:49, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▎ | 1536/1784 [2:12:04<12:49, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▎ | 1537/1784 [2:12:07<12:33, 3.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:13,158 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▎ | 1537/1784 [2:12:07<12:33, 3.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.982, 'learning_rate': 1.9548286604361374e-06, 'epoch': 0.86} + 86%|██████████████████████████████████████████████████████████████████▎ | 1537/1784 [2:12:07<12:33, 3.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▍ | 1538/1784 [2:12:10<12:19, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▍ | 1539/1784 [2:12:13<12:03, 2.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▍ | 1539/1784 [2:12:13<12:03, 2.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:23,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:23,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed +{'loss': 4.1731, 'learning_rate': 1.9314641744548286e-06, 'epoch': 0.86} +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:23,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▌ | 1541/1784 [2:12:18<11:21, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:27,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▌ | 1542/1784 [2:12:21<11:05, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▌ | 1542/1784 [2:12:21<11:05, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▌ | 1543/1784 [2:12:23<10:42, 2.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 86%|██████████████████████████████████████████████████████████████████▌ | 1543/1784 [2:12:23<10:42, 2.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:33,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:33,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:34,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:34,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:36,865 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:36,865 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:38,503 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:38,503 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+{'loss': 4.3333, 'learning_rate': 1.8769470404984424e-06, 'epoch': 0.87} +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:41,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:41,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:43,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:43,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0159, 'learning_rate': 1.8535825545171341e-06, 'epoch': 0.87} +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:47,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:49:47,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2975, 'learning_rate': 1.8457943925233645e-06, 'epoch': 0.87} + 87%|██████████████████████████████████████████████████████████████████▉ | 1552/1784 
[2:12:44<11:12, 2.90s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|██████████████████████████████████████████████████████████████████▉ | 1552/1784 [2:12:44<11:12, 2.90s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1462, 'learning_rate': 1.838006230529595e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████ | 1553/1784 [2:12:48<12:13, 3.18s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████ | 1553/1784 [2:12:48<12:13, 3.18s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0861, 'learning_rate': 1.8302180685358256e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████ | 1554/1784 [2:12:52<12:49, 3.35s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████ | 1554/1784 [2:12:52<12:49, 3.35s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9723, 'learning_rate': 1.8224299065420562e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████ | 1555/1784 [2:12:55<13:09, 3.45s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 87%|███████████████████████████████████████████████████████████████████ | 1555/1784 [2:12:55<13:09, 3.45s/it]g-point operations will not be computed-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2419, 'learning_rate': 1.8146417445482867e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████▏ | 1556/1784 [2:12:59<13:23, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|██████████████████████████████████████████████████████��████████████▏ | 1556/1784 [2:12:59<13:23, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████▏ | 1557/1784 [2:13:03<13:25, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████▏ | 1557/1784 [2:13:03<13:25, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.223, 'learning_rate': 1.7990654205607477e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████▏ | 1558/1784 [2:13:06<13:27, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████▏ | 1558/1784 [2:13:06<13:27, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2847, 'learning_rate': 1.7912772585669782e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████▎ | 1559/1784 [2:13:10<13:23, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████▎ | 1559/1784 [2:13:10<13:23, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9438, 'learning_rate': 1.7834890965732088e-06, 'epoch': 0.87} + 87%|███████████████████████████████████████████████████████████████████▎ | 1559/1784 [2:13:10<13:23, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████▎ | 1560/1784 [2:13:13<13:20, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 87%|███████████████████████████████████████████████████████████████████▎ | 1560/1784 [2:13:13<13:20, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▍ | 1561/1784 [2:13:17<13:17, 3.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▍ | 1561/1784 [2:13:17<13:17, 
3.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████��███████████████████████████████████████████████▍ | 1561/1784 [2:13:17<13:17, 3.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▍ | 1562/1784 [2:13:20<13:10, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▍ | 1562/1784 [2:13:20<13:10, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▍ | 1562/1784 [2:13:20<13:10, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▍ | 1563/1784 [2:13:24<13:04, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:34,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:34,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0038, 'learning_rate': 1.7445482866043614e-06, 'epoch': 0.88} +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:34,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▌ | 1565/1784 [2:13:31<12:54, 3.54s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▌ | 1565/1784 [2:13:31<12:54, 3.54s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▌ | 1565/1784 [2:13:31<12:54, 3.54s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▌ | 1566/1784 [2:13:34<12:46, 3.52s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:45,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:45,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.85, 'learning_rate': 1.7211838006230531e-06, 'epoch': 0.88} +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:45,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▋ | 1568/1784 [2:13:41<12:34, 3.49s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▋ | 1568/1784 [2:13:41<12:34, 3.49s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▋ | 1568/1784 [2:13:41<12:34, 3.49s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▋ | 1569/1784 [2:13:45<12:26, 3.47s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▋ | 1569/1784 [2:13:45<12:26, 3.47s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▋ | 1569/1784 [2:13:45<12:26, 3.47s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▊ | 1570/1784 [2:13:48<12:26, 3.49s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▊ | 1570/1784 [2:13:48<12:26, 3.49s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:59,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:59,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:50:59,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▊ | 1572/1784 [2:13:55<12:10, 3.44s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████���███████████████████████████▊ | 1572/1784 [2:13:55<12:10, 3.44s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▉ | 1573/1784 [2:13:59<12:03, 3.43s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▉ | 1573/1784 [2:13:59<12:03, 3.43s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:09,220 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:09,220 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9676, 'learning_rate': 1.6666666666666667e-06, 'epoch': 0.88} + 88%|███████████████████████████████████████████████████████████████████▉ | 1575/1784 [2:14:05<11:46, 3.38s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|███████████████████████████████████████████████████████████████████▉ | 1575/1784 [2:14:05<11:46, 3.38s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:15,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:15,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:15,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|████████████████████████████████████████████████████████████████████ | 1577/1784 [2:14:12<11:26, 3.32s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|████████████████████████████████████████████████████████████████████ | 1577/1784 [2:14:12<11:26, 3.32s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9374, 'learning_rate': 1.6433021806853584e-06, 'epoch': 0.88} + 88%|████████████████████████████████████████████████████████████████████ | 1578/1784 [2:14:15<11:19, 3.30s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 88%|████████████████████████████████████████████████████████████████████ | 1578/1784 [2:14:15<11:19, 3.30s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:25,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:25,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.136, 'learning_rate': 1.6277258566978193e-06, 'epoch': 0.89} + 89%|████████████████████████████████████████████████████████████████████▏ | 1580/1784 [2:14:21<10:59, 3.23s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▏ | 1580/1784 [2:14:21<10:59, 3.23s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:31,827 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:31,827 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2275, 'learning_rate': 1.6121495327102804e-06, 'epoch': 0.89} + 89%|████████████████████████████████████████████████████████████████████▎ | 1582/1784 [2:14:28<10:43, 3.18s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▎ | 1582/1784 [2:14:28<10:43, 3.18s/it]g-point operations will 
not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:38,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:38,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1318, 'learning_rate': 1.5965732087227416e-06, 'epoch': 0.89} + 89%|████████████████████████████████████████████████████████████████████▎ | 1584/1784 [2:14:34<10:22, 3.11s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▎ | 1584/1784 [2:14:34<10:22, 3.11s/it]g-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:44,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:44,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0312, 'learning_rate': 1.5809968847352025e-06, 'epoch': 0.89} + 
89%|████████████████████████████████████████████████████████████████████▍ | 1586/1784 [2:14:40<10:01, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▍ | 1586/1784 [2:14:40<10:01, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▍ | 1587/1784 [2:14:43<09:48, 2.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▍ | 1587/1784 [2:14:43<09:48, 2.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:52,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:52,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:51:52,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
89%|████████████████████████████████████████████████████████████████████▌ | 1589/1784 [2:14:48<09:18, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▌ | 1589/1784 [2:14:48<09:18, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▋ | 1590/1784 [2:14:51<09:04, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▋ | 1590/1784 [2:14:51<09:04, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9825, 'learning_rate': 1.542056074766355e-06, 'epoch': 0.89} +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:00,632 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:03,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:03,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +{'loss': 4.0182, 'learning_rate': 1.5264797507788162e-06, 'epoch': 0.89} +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:03,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▊ | 1593/1784 [2:14:58<08:09, 2.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:06,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▊ | 1593/1784 [2:14:58<08:09, 2.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:06,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▊ | 1594/1784 [2:15:00<07:45, 2.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:08,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▊ | 1594/1784 [2:15:00<07:45, 2.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:08,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▊ | 1595/1784 [2:15:02<07:14, 2.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:10,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▊ | 1595/1784 [2:15:02<07:14, 2.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:10,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 89%|████████████████████████████████████████████████████████████████████▉ | 1596/1784 [2:15:04<06:42, 2.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:12,210 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 89%|████████████████████████████████████████████████████████████████████▉ | 1596/1784 [2:15:04<06:42, 2.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:12,210 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2711, 'learning_rate': 1.4875389408099689e-06, 'epoch': 0.9} + 90%|████████████████████████████████████████████████████████████████████▉ | 1598/1784 [2:15:07<05:38, 1.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:13,746 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|████████████████████████████████████████████████████████████████████▉ | 1598/1784 [2:15:07<05:38, 1.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:13,746 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1599/1784 [2:15:08<05:09, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:16,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1599/1784 [2:15:08<05:09, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:16,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1600/1784 [2:15:10<05:10, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:16,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
90%|█████████████████████████████████████████████████████████████████████ | 1600/1784 [2:15:10<05:10, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:19,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1600/1784 [2:15:10<05:10, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:19,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1601/1784 [2:15:14<07:21, 2.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:19,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1601/1784 [2:15:14<07:21, 2.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:19,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████ | 1601/1784 [2:15:14<07:21, 2.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:19,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1602/1784 [2:15:18<08:31, 2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1602/1784 [2:15:18<08:31, 2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1603/1784 
[2:15:22<09:15, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1603/1784 [2:15:22<09:15, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1603/1784 [2:15:22<09:15, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|████████████████████████████████████████████████████████���████████████▏ | 1604/1784 [2:15:25<09:46, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1604/1784 [2:15:25<09:46, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▏ | 1604/1784 [2:15:25<09:46, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▎ | 1605/1784 [2:15:29<10:02, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▎ | 1605/1784 [2:15:29<10:02, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 
11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▎ | 1605/1784 [2:15:29<10:02, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▎ | 1606/1784 [2:15:33<10:13, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:43,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:43,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0761, 'learning_rate': 1.4096573208722741e-06, 'epoch': 0.9} +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:43,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1608/1784 [2:15:40<10:16, 3.50s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1608/1784 [2:15:40<10:16, 3.50s/it]g-point operations will not be 
computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1608/1784 [2:15:40<10:16, 3.50s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1609/1784 [2:15:43<10:13, 3.51s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1609/1784 [2:15:43<10:13, 3.51s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1609/1784 [2:15:43<10:13, 3.51s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1610/1784 [2:15:47<10:11, 3.51s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▍ | 1610/1784 [2:15:47<10:11, 3.51s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:57,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:57,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:52:57,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▌ | 1612/1784 [2:15:54<10:02, 3.50s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▌ | 1612/1784 [2:15:54<10:02, 3.50s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▌ | 1612/1784 [2:15:54<10:02, 3.50s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 90%|█████████████████████████████████████████████████████████████████████▌ | 1613/1784 [2:15:57<10:00, 3.51s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:08,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-02-28 11:53:08,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9159, 'learning_rate': 1.3551401869158879e-06, 'epoch': 0.9} +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:08,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▋ | 1615/1784 [2:16:04<09:45, 3.46s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▋ | 1615/1784 [2:16:04<09:45, 3.46s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▋ | 1615/1784 [2:16:04<09:45, 3.46s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▋ | 1616/1784 [2:16:08<09:39, 3.45s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▋ | 1616/1784 [2:16:08<09:39, 3.45s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 91%|█████████████████████████████████████████████████████████████████████▋ | 1616/1784 [2:16:08<09:39, 3.45s/it]g-point operations will not be computed-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▊ | 1617/1784 [2:16:11<09:33, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▊ | 1617/1784 [2:16:11<09:33, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▊ | 1618/1784 [2:16:14<09:26, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▊ | 1618/1784 [2:16:14<09:26, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▊ | 1618/1784 [2:16:14<09:26, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▉ | 1619/1784 [2:16:18<09:20, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:28,320 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:28,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1666, 'learning_rate': 1.308411214953271e-06, 'epoch': 0.91} +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:28,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▉ | 1621/1784 [2:16:24<09:03, 3.34s/it]g-point operations will not be computed-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▉ | 1621/1784 [2:16:24<09:03, 3.34s/it]g-point operations will not be computed-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|█████████████████████████████████████████████████████████████████████▉ | 1621/1784 [2:16:24<09:03, 3.34s/it]g-point operations will not be computed-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████ | 1622/1784 [2:16:28<09:00, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
91%|██████████████████████████████████████████████████████████████████████ | 1623/1784 [2:16:31<08:57, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████ | 1623/1784 [2:16:31<08:57, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9593, 'learning_rate': 1.2850467289719625e-06, 'epoch': 0.91} + 91%|██████████████████████████████████████████████████████████████████████ | 1623/1784 [2:16:31<08:57, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████ | 1624/1784 [2:16:34<08:51, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:44,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:44,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1008, 'learning_rate': 1.2694704049844237e-06, 'epoch': 0.91} +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:44,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:36,628 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▏ | 1626/1784 [2:16:41<08:43, 3.32s/it]g-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:51,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:53:51,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1622, 'learning_rate': 1.2538940809968846e-06, 'epoch': 0.91} + 91%|██████████████████████████████████████████████████████████████████████▎ | 1628/1784 [2:16:47<08:28, 3.26s/it]g-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▎ | 1628/1784 [2:16:47<08:28, 3.26s/it]g-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9802, 'learning_rate': 1.2461059190031154e-06, 'epoch': 0.91} + 91%|██████████████████████████████████████████████████████████████████████▎ | 1628/1784 [2:16:47<08:28, 3.26s/it]g-point operations will not be computed-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▎ | 1629/1784 
[2:16:50<08:17, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:59,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▎ | 1630/1784 [2:16:53<08:09, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:59,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▎ | 1630/1784 [2:16:53<08:09, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:59,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9442, 'learning_rate': 1.2305295950155765e-06, 'epoch': 0.91} + 91%|██████████████████████████████████████████████████████████████████████▎ | 1630/1784 [2:16:53<08:09, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:59,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▍ | 1631/1784 [2:16:57<08:06, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:05,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▍ | 1632/1784 [2:17:00<07:59, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:05,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 91%|██████████████████████████████████████████████████████████████████████▍ | 1632/1784 [2:17:00<07:59, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:05,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.159, 'learning_rate': 1.2149532710280374e-06, 'epoch': 0.91} + 
91%|██████████████████████████████████████████████████████████████████████▍ | 1632/1784 [2:17:00<07:59, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:05,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▍ | 1633/1784 [2:17:03<07:52, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▌ | 1634/1784 [2:17:06<07:42, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▌ | 1634/1784 [2:17:06<07:42, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:15,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:15,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1648, 'learning_rate': 1.1915887850467291e-06, 'epoch': 0.92} +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:15,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed + 92%|██████████████████████████████████████████████████████████████████████▌ | 1636/1784 [2:17:11<07:18, 2.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▋ | 1637/1784 [2:17:14<07:08, 2.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▋ | 1637/1784 [2:17:14<07:08, 2.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:24,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:24,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1987, 'learning_rate': 1.1682242990654206e-06, 'epoch': 0.92} +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:24,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▋ | 1639/1784 [2:17:20<06:43, 2.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:28,267 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▊ | 1640/1784 [2:17:22<06:27, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 92%|██████████████████████████████████████████████████████████████████████▊ | 1640/1784 [2:17:22<06:27, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:31,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:31,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:33,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:33,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:36,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:36,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:37,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:37,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:39,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:39,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:41,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:41,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5605, 
'learning_rate': 1.105919003115265e-06, 'epoch': 0.92} +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:43,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:43,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4507, 'learning_rate': 1.0825545171339565e-06, 'epoch': 0.92} +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:46,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:46,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:46,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:50,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:54,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:54:54,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1179, 'learning_rate': 1.059190031152648e-06, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▎ | 1653/1784 [2:17:51<06:36, 3.03s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▎ | 1653/1784 [2:17:51<06:36, 3.03s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0829, 'learning_rate': 1.0514018691588785e-06, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▍ | 1654/1784 [2:17:55<06:57, 3.21s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▍ | 
1654/1784 [2:17:55<06:57, 3.21s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0633, 'learning_rate': 1.043613707165109e-06, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▍ | 1655/1784 [2:17:58<07:11, 3.34s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▍ | 1655/1784 [2:17:58<07:11, 3.34s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9353, 'learning_rate': 1.0358255451713396e-06, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▍ | 1655/1784 [2:17:58<07:11, 3.34s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▍ | 1656/1784 [2:18:02<07:15, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▌ | 1657/1784 [2:18:05<07:20, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▌ | 1657/1784 [2:18:05<07:20, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+{'loss': 4.2134, 'learning_rate': 1.0202492211838008e-06, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▌ | 1658/1784 [2:18:09<07:21, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▌ | 1658/1784 [2:18:09<07:21, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1343, 'learning_rate': 1.0124610591900311e-06, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▌ | 1659/1784 [2:18:13<07:20, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▌ | 1659/1784 [2:18:13<07:20, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:55:23,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:55:23,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1671, 'learning_rate': 9.968847352024923e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▋ | 1661/1784 [2:18:20<07:14, 
3.53s/it]g-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▋ | 1661/1784 [2:18:20<07:14, 3.53s/it]g-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2345, 'learning_rate': 9.890965732087228e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▋ | 1662/1784 [2:18:23<07:10, 3.53s/it]g-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▋ | 1662/1784 [2:18:23<07:10, 3.53s/it]g-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2383, 'learning_rate': 9.813084112149534e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▋ | 1662/1784 [2:18:23<07:10, 3.53s/it]g-point operations will not be computed-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▊ | 1663/1784 [2:18:27<07:05, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▊ | 1664/1784 [2:18:30<07:01, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
93%|███████████████████████████████████████████████████████████████████████▊ | 1664/1784 [2:18:30<07:01, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1784, 'learning_rate': 9.657320872274143e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▊ | 1665/1784 [2:18:34<06:55, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▊ | 1665/1784 [2:18:34<06:55, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9555, 'learning_rate': 9.579439252336449e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▊ | 1665/1784 [2:18:34<06:55, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▉ | 1666/1784 [2:18:37<06:50, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:55:47,801 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:55:47,801 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:55:35,843 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9806, 'learning_rate': 9.423676012461059e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▉ | 1668/1784 [2:18:44<06:39, 3.44s/it]g-point operations will not be computed-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 93%|███████████████████████████████████████████████████████████████████████▉ | 1668/1784 [2:18:44<06:39, 3.44s/it]g-point operations will not be computed-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0233, 'learning_rate': 9.345794392523365e-07, 'epoch': 0.93} + 93%|███████████████████████████████████████████████████████████████████████▉ | 1668/1784 [2:18:44<06:39, 3.44s/it]g-point operations will not be computed-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████ | 1669/1784 [2:18:47<06:33, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████ | 1670/1784 [2:18:51<06:27, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████ | 1670/1784 [2:18:51<06:27, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0175, 'learning_rate': 9.190031152647975e-07, 'epoch': 0.94} + 
94%|████████████████████████████████████████████████████████████████████████ | 1671/1784 [2:18:54<06:22, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████ | 1671/1784 [2:18:54<06:22, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:04,611 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:04,611 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1891, 'learning_rate': 9.034267912772586e-07, 'epoch': 0.94} + 94%|████████████████████████████████████████████████████████████████████████▏ | 1673/1784 [2:19:01<06:10, 3.34s/it]g-point operations will not be computed-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▏ | 1673/1784 [2:19:01<06:10, 3.34s/it]g-point operations will not be computed-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0017, 'learning_rate': 8.956386292834891e-07, 'epoch': 0.94} + 94%|████████████████████████████████████████████████████████████████████████▏ | 1673/1784 [2:19:01<06:10, 3.34s/it]g-point operations will not be computed-28 11:55:56,318 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▎ | 1674/1784 [2:19:04<06:04, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▎ | 1675/1784 [2:19:07<05:58, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▎ | 1675/1784 [2:19:07<05:58, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3383, 'learning_rate': 8.800623052959501e-07, 'epoch': 0.94} + 94%|████████████████████████████████████████████████████████████████████████▎ | 1675/1784 [2:19:07<05:58, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▎ | 1676/1784 [2:19:10<05:51, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:20,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:20,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 
11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9002, 'learning_rate': 8.644859813084113e-07, 'epoch': 0.94} + 94%|████████████████████████████████████████████████████████████████████████▍ | 1678/1784 [2:19:17<05:41, 3.22s/it]g-point operations will not be computed-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▍ | 1678/1784 [2:19:17<05:41, 3.22s/it]g-point operations will not be computed-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1325, 'learning_rate': 8.566978193146417e-07, 'epoch': 0.94} + 94%|████████████████████████████████████████████████████████████████████████▍ | 1678/1784 [2:19:17<05:41, 3.22s/it]g-point operations will not be computed-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▍ | 1679/1784 [2:19:20<05:35, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:28,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▌ | 1680/1784 [2:19:23<05:29, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:28,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▌ | 1680/1784 [2:19:23<05:29, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:28,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2532, 'learning_rate': 8.411214953271029e-07, 'epoch': 0.94} + 
94%|████████████████████████████████████████████████████████████████████████▌ | 1680/1784 [2:19:23<05:29, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:28,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▌ | 1681/1784 [2:19:26<05:25, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:34,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▌ | 1682/1784 [2:19:29<05:20, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:34,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▌ | 1682/1784 [2:19:29<05:20, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:34,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.8529, 'learning_rate': 8.255451713395639e-07, 'epoch': 0.94} + 94%|████████████████████████████████████████████████████████████████████████▌ | 1682/1784 [2:19:29<05:20, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:34,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▋ | 1683/1784 [2:19:32<05:13, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 94%|████████████████████████████████████████████████████████████████████████▋ | 1684/1784 [2:19:35<05:06, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed + 94%|████████████████████████████████████████████████████████████████████████▋ | 1684/1784 [2:19:35<05:06, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:45,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:45,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2004, 'learning_rate': 8.021806853582555e-07, 'epoch': 0.94} + 95%|████████████████████████████████████████████████████████████████████████▊ | 1686/1784 [2:19:41<04:51, 2.98s/it]g-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 95%|████████████████████████████████████████████████████████████████████████▊ | 1686/1784 [2:19:41<04:51, 2.98s/it]g-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:51,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:51,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:53,780 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:56:53,780 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2725, 'learning_rate': 7.788161993769471e-07, 'epoch': 0.95} + 95%|████████████████████████████████████████████████████████████████████████▉ | 1689/1784 [2:19:49<04:28, 2.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 95%|████████████████████████████████████████████████████████████████████████▉ | 1689/1784 [2:19:49<04:28, 2.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 95%|████████████████████████████████████████████████████████████████████████▉ | 1690/1784 [2:19:52<04:16, 2.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 95%|████████████████████████████████████████████████████████████████████████▉ | 1690/1784 [2:19:52<04:16, 2.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:01,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:01,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:03,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:03,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:05,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:05,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:07,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:07,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:09,446 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:09,446 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:11,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:11,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:13,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:13,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2427, 'learning_rate': 7.087227414330218e-07, 'epoch': 0.95} +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:15,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:15,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:16,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:16,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5237, 'learning_rate': 6.853582554517134e-07, 'epoch': 0.95} +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:20,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:20,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2145, 'learning_rate': 6.775700934579439e-07, 'epoch': 0.95} + 95%|█████████████████████████████████████████████████████████████████████████▍ | 1702/1784 [2:20:17<03:45, 2.75s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
95%|█████████████████████████████████████████████████████████████████████████▍ | 1702/1784 [2:20:17<03:45, 2.75s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9877, 'learning_rate': 6.697819314641744e-07, 'epoch': 0.95} + 95%|█████████████████████████████████████████████████████████████████████████▍ | 1702/1784 [2:20:17<03:45, 2.75s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 95%|█████████████████████████████████████████████████████████████████████████▌ | 1703/1784 [2:20:21<04:06, 3.05s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:32,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:32,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9723, 'learning_rate': 6.542056074766355e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▌ | 1705/1784 [2:20:28<04:26, 3.38s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▌ | 1705/1784 [2:20:28<04:26, 3.38s/it]g-point operations will not be computed-28 11:56:57,828 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.005, 'learning_rate': 6.46417445482866e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▋ | 1706/1784 [2:20:32<04:30, 3.47s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▋ | 1706/1784 [2:20:32<04:30, 3.47s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9488, 'learning_rate': 6.386292834890966e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▋ | 1707/1784 [2:20:36<04:31, 3.53s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▋ | 1707/1784 [2:20:36<04:31, 3.53s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9929, 'learning_rate': 6.308411214953271e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▋ | 1707/1784 [2:20:36<04:31, 3.53s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▋ | 1708/1784 [2:20:39<04:30, 3.55s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-02-28 11:57:50,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:57:50,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.284, 'learning_rate': 6.152647975077883e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▊ | 1710/1784 [2:20:47<04:24, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▊ | 1710/1784 [2:20:47<04:24, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2583, 'learning_rate': 6.074766355140187e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▊ | 1711/1784 [2:20:50<04:20, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▊ | 1711/1784 [2:20:50<04:20, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9819, 'learning_rate': 5.996884735202493e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▊ | 1711/1784 
[2:20:50<04:20, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▉ | 1712/1784 [2:20:54<04:15, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▉ | 1713/1784 [2:20:57<04:11, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▉ | 1713/1784 [2:20:57<04:11, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3626, 'learning_rate': 5.841121495327103e-07, 'epoch': 0.96} + 96%|█████████████████████████████████████████████████████████████████████████▉ | 1714/1784 [2:21:01<04:07, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|█████████████████████████████████████████████████████████████████████████▉ | 1714/1784 [2:21:01<04:07, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0682, 'learning_rate': 5.763239875389409e-07, 'epoch': 0.96} + 96%|██████████████████████████████████████████████████████████████████████████ | 1715/1784 [2:21:04<04:02, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 96%|██████████████████████████████████████████████████████████████████████████ | 1715/1784 [2:21:04<04:02, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:14,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:14,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1014, 'learning_rate': 5.607476635514019e-07, 'epoch': 0.96} + 96%|██████████████████████████████████████████████████████████████████████████ | 1717/1784 [2:21:11<03:52, 3.47s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|██████████████████████████████████████████████████████████████████████████ | 1717/1784 [2:21:11<03:52, 3.47s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4454, 'learning_rate': 5.529595015576325e-07, 'epoch': 0.96} + 96%|██████████████████████████████████████████████████████████████████████████▏ | 1718/1784 [2:21:14<03:47, 3.45s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|██████████████████████████████████████████████████████████████████████████▏ | 1718/1784 [2:21:14<03:47, 3.45s/it]g-point operations will not be computed-28 
11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:25,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:25,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.8065, 'learning_rate': 5.373831775700935e-07, 'epoch': 0.96} + 96%|██████████████████████████████████████████████████████████████████████████▏ | 1720/1784 [2:21:21<03:39, 3.43s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|██████████████████████████████████████████████████████████████████████████▏ | 1720/1784 [2:21:21<03:39, 3.43s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9721, 'learning_rate': 5.29595015576324e-07, 'epoch': 0.96} + 96%|██████████████████████████████████████████████████████████████████████████▎ | 1721/1784 [2:21:25<03:34, 3.41s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 96%|██████████████████████████████████████████████████████████████████████████▎ | 1721/1784 [2:21:25<03:34, 3.41s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:35,314 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:35,314 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0803, 'learning_rate': 5.140186915887851e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▎ | 1723/1784 [2:21:31<03:25, 3.36s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▎ | 1723/1784 [2:21:31<03:25, 3.36s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2579, 'learning_rate': 5.062305295950156e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▎ | 1723/1784 [2:21:31<03:25, 3.36s/it]g-point operations will not be computed-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▍ | 1724/1784 [2:21:35<03:20, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:43,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▍ | 1725/1784 [2:21:38<03:15, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:43,592 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▍ | 1725/1784 [2:21:38<03:15, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:43,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1518, 'learning_rate': 4.906542056074767e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▍ | 1725/1784 [2:21:38<03:15, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:43,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▍ | 1726/1784 [2:21:41<03:09, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▌ | 1727/1784 [2:21:44<03:06, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▌ | 1727/1784 [2:21:44<03:06, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1883, 'learning_rate': 4.7507788161993773e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▌ | 1728/1784 [2:21:47<03:02, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▌ | 1728/1784 [2:21:47<03:02, 
3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:57,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:58:57,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3411, 'learning_rate': 4.5950155763239876e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▋ | 1730/1784 [2:21:54<02:52, 3.20s/it]g-point operations will not be computed-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▋ | 1730/1784 [2:21:54<02:52, 3.20s/it]g-point operations will not be computed-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1567, 'learning_rate': 4.517133956386293e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▋ | 1730/1784 [2:21:54<02:52, 3.20s/it]g-point operations will not be computed-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▋ | 1731/1784 [2:21:57<02:48, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:05,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
97%|██████████████████████████████████████████████████████████████████████████▊ | 1732/1784 [2:22:00<02:44, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:05,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▊ | 1732/1784 [2:22:00<02:44, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:05,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0856, 'learning_rate': 4.3613707165109035e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▊ | 1732/1784 [2:22:00<02:44, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:05,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▊ | 1733/1784 [2:22:03<02:39, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:11,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▊ | 1734/1784 [2:22:06<02:35, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:11,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▊ | 1734/1784 [2:22:06<02:35, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:11,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2605, 'learning_rate': 4.2056074766355143e-07, 'epoch': 0.97} + 97%|██████████████████████████████████████████████████████████████████████████▊ | 1734/1784 [2:22:06<02:35, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:11,985 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▉ | 1735/1784 [2:22:09<02:30, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:17,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▉ | 1736/1784 [2:22:12<02:24, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:17,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|██████████████████████████████████████████████████████████████████████████▉ | 1736/1784 [2:22:12<02:24, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:17,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:22,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:17,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:22,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:17,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.8847, 'learning_rate': 3.97196261682243e-07, 'epoch': 0.97} + 97%|███████████████████████████████████████████████████████████████████████████ | 1738/1784 [2:22:18<02:13, 2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|███████████████████████████████████████████████████████████████████████████ | 1738/1784 [2:22:18<02:13, 
2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|███████████████████████████████████████████████████████████████████████████ | 1739/1784 [2:22:20<02:07, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 97%|███████████████████████████████████████████████████████████████████████████ | 1739/1784 [2:22:20<02:07, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:30,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:30,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:32,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:32,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1438, 'learning_rate': 3.660436137071651e-07, 'epoch': 0.98} +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:32,576 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▏ | 1742/1784 [2:22:28<01:47, 2.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:36,066 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▏ | 1743/1784 [2:22:30<01:40, 2.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:38,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▏ | 1743/1784 [2:22:30<01:40, 2.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:38,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▎ | 1744/1784 [2:22:32<01:34, 2.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:40,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▎ | 1744/1784 [2:22:32<01:34, 2.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:40,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▎ | 1746/1784 [2:22:36<01:20, 2.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:42,184 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▎ | 1746/1784
[2:22:36<01:20, 2.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:42,184 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1639, 'learning_rate': 3.348909657320872e-07, 'epoch': 0.98} + 98%|███████████████████████████████████████████████████████████████████████████▍ | 1747/1784 [2:22:37<01:13, 1.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:43,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▍ | 1747/1784 [2:22:37<01:13, 1.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:43,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▍ | 1748/1784 [2:22:39<01:06, 1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:46,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▍ | 1748/1784 [2:22:39<01:06, 1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:46,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▍ | 1749/1784 [2:22:40<00:59, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:48,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▍ | 1749/1784 [2:22:40<00:59, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:48,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.31, 'learning_rate': 3.0373831775700936e-07, 
'epoch': 0.98} + 98%|███████████████████████████████████████████████████████████████████████████▌ | 1750/1784 [2:22:42<00:57, 1.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:48,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▌ | 1750/1784 [2:22:42<00:57, 1.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▌ | 1750/1784 [2:22:42<00:57, 1.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▌ | 1751/1784 [2:22:46<01:18, 2.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:57,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 11:59:57,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0883, 'learning_rate': 2.8037383177570096e-07, 'epoch': 0.98} + 98%|███████████████████████████████████████████████████████████████████████████▋ | 1753/1784 [2:22:53<01:35, 3.08s/it]g-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▋ | 1753/1784 [2:22:53<01:35, 3.08s/it]g-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3001, 'learning_rate': 2.7258566978193147e-07, 'epoch': 0.98} + 98%|███████████████████████████████████████████████████████████████████████████▋ | 1754/1784 [2:22:57<01:37, 3.24s/it]g-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▋ | 1754/1784 [2:22:57<01:37, 3.24s/it]g-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0114, 'learning_rate': 2.64797507788162e-07, 'epoch': 0.98} + 98%|███████████████████████████████████████████████████████████████████████████▋ | 1755/1784 [2:23:01<01:37, 3.37s/it]g-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▋ | 1755/1784 [2:23:01<01:37, 3.37s/it]g-point operations will not be computed-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2639, 'learning_rate': 2.5700934579439255e-07, 'epoch': 0.98} + 98%|███████████████████████████████████████████████████████████████████████████▊ | 1756/1784 [2:23:04<01:36, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
98%|███████████████████████████████████████████████████████████████████████████▊ | 1756/1784 [2:23:04<01:36, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▊ | 1757/1784 [2:23:08<01:34, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 98%|███████████████████████████████████████████████████████████████████████████▊ | 1757/1784 [2:23:08<01:34, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0116, 'learning_rate': 2.414330218068536e-07, 'epoch': 0.98} + 99%|███████████████████████████████████████████████████████████████████████████▉ | 1758/1784 [2:23:11<01:30, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|███████████████████████████████████████████████████████████████████████████▉ | 1758/1784 [2:23:11<01:30, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0707, 'learning_rate': 2.3364485981308412e-07, 'epoch': 0.99} + 99%|███████████████████████████████████████████████████████████████████████████▉ | 1758/1784 [2:23:11<01:30, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|███████████████████████████████████████████████████████████████████████████▉ | 1759/1784 [2:23:15<01:27, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 
12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|███████████████████████████████████████████████████████████████████████████▉ | 1759/1784 [2:23:15<01:27, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:25,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:25,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:25,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████ | 1761/1784 [2:23:22<01:19, 3.46s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████ | 1761/1784 [2:23:22<01:19, 3.46s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████ | 1762/1784 [2:23:25<01:15, 3.44s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████ | 1762/1784 [2:23:25<01:15, 3.44s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.9068, 'learning_rate': 2.0249221183800623e-07, 'epoch': 0.99} +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:35,992 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:35,992 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:35,992 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▏| 1764/1784 [2:23:32<01:08, 3.41s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:42,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:42,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 3.8871, 'learning_rate': 1.7912772585669783e-07, 'epoch': 0.99} +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:42,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:46,058 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:46,058 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:00:46,058 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1767/1784 [2:23:42<00:57, 3.38s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1767/1784 [2:23:42<00:57, 3.38s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1767/1784 [2:23:42<00:57, 
3.38s/it]g-point operations will not be computed-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1768/1784 [2:23:45<00:53, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1768/1784 [2:23:45<00:53, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1769/1784 [2:23:49<00:49, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1769/1784 [2:23:49<00:49, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▎| 1769/1784 [2:23:49<00:49, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▍| 1770/1784 [2:23:52<00:46, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▍| 1770/1784 [2:23:52<00:46, 
3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:02,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:02,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:02,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▍| 1772/1784 [2:23:58<00:38, 3.23s/it]g-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▍| 1772/1784 [2:23:58<00:38, 3.23s/it]g-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:08,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:08,567 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:08,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▌| 1774/1784 [2:24:04<00:31, 3.10s/it]g-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 99%|████████████████████████████████████████████████████████████████████████████▌| 1774/1784 [2:24:04<00:31, 3.10s/it]g-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:14,408 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:17,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:17,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1723, 'learning_rate': 9.345794392523364e-08, 'epoch': 1.0} +[WARNING|modeling_utils.py:388] 2022-02-28 12:01:17,206 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed
+100%|████████████████████████████████████████████████████████████████████████████▋| 1777/1784 [2:24:13<00:20, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:21,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+100%|████████████████████████████████████████████████████████████████████████████▋| 1778/1784 [2:24:15<00:16, 2.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:23,516 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+100%|████████████████████████████████████████████████████████████████████████████▊| 1779/1784 [2:24:17<00:12, 2.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:25,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+100%|████████████████████████████████████████████████████████████████████████████▊| 1780/1784 [2:24:19<00:09,
2.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:27,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+100%|████████████████████████████████████████████████████████████████████████████▉| 1782/1784 [2:24:22<00:03, 1.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:28,953 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0406, 'learning_rate': 4.672897196261682e-08, 'epoch': 1.0}
+100%|████████████████████████████████████████████████████████████████████████████▉| 1783/1784 [2:24:24<00:01, 1.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:31,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 3.5732, 'learning_rate': 3.1152647975077883e-08, 'epoch': 1.0}
+1784/1784 [2:24:25<00:00, 1.55s/it][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >>
+[INFO|trainer.py:2114] 2022-02-28 12:01:32,074 >> Saving model checkpoint to ./
+[INFO|trainer.py:2114] 2022-02-28 12:01:48,841 >> Saving model checkpoint to ./
+[INFO|modeling_utils.py:1081] 2022-02-28 12:02:05,435 >> Model weights saved in ./pytorch_model.bin