diff --git "a/wandb/run-20220302_214437-2u4nhnsf/files/output.log" "b/wandb/run-20220302_214437-2u4nhnsf/files/output.log" new file mode 100644--- /dev/null +++ "b/wandb/run-20220302_214437-2u4nhnsf/files/output.log" @@ -0,0 +1,1523 @@ + + + 0%| | 0/1019 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:44:46,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 0%| | 1/1019 [00:07<2:03:59, 7.31s/it] + + 0%| | 1/1019 [00:07<2:03:59, 7.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:44:49,184 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:44:52,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 0%|▏ | 2/1019 [00:13<1:51:20, 6.57s/it] + + 0%|▏ | 2/1019 [00:13<1:51:20, 6.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:44:55,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:44:58,140 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 0%|▏ | 3/1019 [00:19<1:48:26, 6.40s/it] + + 0%|▏ | 3/1019 [00:19<1:48:26, 6.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:01,343 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8536, 'learning_rate': 1.2e-06, 'epoch': 0.0} +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:04,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 0%|▎ | 4/1019 [00:25<1:45:16, 6.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:07,247 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +{'loss': 4.654, 'learning_rate': 1.8e-06, 'epoch': 0.0} +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:10,104 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 0%|▍ | 5/1019 [00:31<1:42:46, 6.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:13,114 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9403, 'learning_rate': 2.4e-06, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:16,011 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▍ | 6/1019 [00:37<1:41:41, 6.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:18,999 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6693, 'learning_rate': 2.9999999999999997e-06, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:21,856 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▌ | 7/1019 [00:43<1:40:35, 5.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:24,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:27,633 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 1%|▋ | 8/1019 [00:48<1:39:28, 5.90s/it] + + 1%|▋ | 8/1019 [00:48<1:39:28, 5.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:30,492 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:33,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 1%|▋ | 9/1019 [00:54<1:38:13, 5.83s/it] + + 1%|▋ | 9/1019 [00:54<1:38:13, 5.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:36,252 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:39,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 1%|▊ | 10/1019 [01:00<1:37:47, 5.82s/it] + + 1%|▊ | 10/1019 [01:00<1:37:47, 5.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:42,023 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7137, 'learning_rate': 5.399999999999999e-06, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:44,799 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▊ | 11/1019 [01:06<1:37:10, 5.78s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:47,715 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6946, 'learning_rate': 5.999999999999999e-06, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:50,513 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▉ | 12/1019 [01:11<1:36:42, 5.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:53,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:45:56,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6247, 'learning_rate': 6.599999999999999e-06, 'epoch': 0.01} + + 1%|█ | 13/1019 [01:17<1:36:19, 5.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:45:59,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:01,853 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 1%|█ | 
14/1019 [01:23<1:35:40, 5.71s/it] + + 1%|█ | 14/1019 [01:23<1:35:40, 5.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:04,741 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.456, 'learning_rate': 7.799999999999998e-06, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:07,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|█▏ | 15/1019 [01:28<1:35:00, 5.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:10,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6667, 'learning_rate': 8.4e-06, 'epoch': 0.02} +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:12,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 2%|█▏ | 16/1019 [01:34<1:34:12, 5.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:15,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:18,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▎ | 17/1019 [01:39<1:33:04, 5.57s/it] + + 2%|█▎ | 17/1019 [01:39<1:33:04, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:21,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6961, 'learning_rate': 9.6e-06, 'epoch': 0.02} +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:23,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 2%|█▍ | 18/1019 [01:45<1:31:58, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:26,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:29,157 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▍ | 19/1019 [01:50<1:31:08, 5.47s/it] + + 2%|█▍ | 19/1019 [01:50<1:31:08, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:31,919 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:34,546 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▌ | 20/1019 [01:55<1:30:39, 5.45s/it] + + 2%|█▌ | 20/1019 [01:55<1:30:39, 5.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:37,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:39,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▋ | 21/1019 [02:01<1:29:38, 5.39s/it] + + 2%|█▋ | 21/1019 [02:01<1:29:38, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:42,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:45,140 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▋ | 22/1019 [02:06<1:29:16, 5.37s/it] + + 2%|█▋ | 22/1019 [02:06<1:29:16, 5.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:47,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:50,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4451, 'learning_rate': 1.26e-05, 'epoch': 0.02} + + 2%|█▊ | 23/1019 [02:11<1:28:34, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:53,021 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:46:55,546 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▊ | 24/1019 [02:16<1:27:35, 5.28s/it] + + 2%|█▊ | 24/1019 [02:16<1:27:35, 5.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:46:58,147 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3828, 'learning_rate': 1.3799999999999998e-05, 'epoch': 0.02} +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:00,678 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 2%|█▉ | 25/1019 [02:21<1:26:45, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:03,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:05,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██ | 26/1019 [02:27<1:26:14, 5.21s/it] + + 3%|██ | 26/1019 [02:27<1:26:14, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:08,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:10,907 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██ | 27/1019 [02:32<1:25:29, 5.17s/it] + + 3%|██ | 27/1019 [02:32<1:25:29, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:13,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:15,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▏ | 28/1019 [02:37<1:24:30, 5.12s/it] + + 3%|██▏ | 28/1019 [02:37<1:24:30, 5.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:18,508 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:20,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▏ | 29/1019 [02:42<1:24:08, 5.10s/it] + + 3%|██▏ | 29/1019 [02:42<1:24:08, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:23,544 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:26,030 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▎ | 30/1019 [02:47<1:23:56, 5.09s/it] + + 3%|██▎ | 30/1019 [02:47<1:23:56, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:28,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:31,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▍ | 31/1019 [02:52<1:23:41, 5.08s/it] + + 3%|██▍ | 31/1019 [02:52<1:23:41, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:33,727 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:36,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▍ | 32/1019 [02:57<1:23:52, 5.10s/it] + + 3%|██▍ | 32/1019 [02:57<1:23:52, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:38,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:41,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▌ | 33/1019 [03:02<1:23:30, 5.08s/it] + + 3%|██▌ | 33/1019 [03:02<1:23:30, 
5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:43,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:46,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▋ | 34/1019 [03:07<1:22:37, 5.03s/it] + + 3%|██▋ | 34/1019 [03:07<1:22:37, 5.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:48,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:51,005 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▋ | 35/1019 [03:12<1:21:30, 4.97s/it] + + 3%|██▋ | 35/1019 [03:12<1:21:30, 4.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:53,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:47:55,684 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|██▊ | 36/1019 [03:16<1:19:58, 4.88s/it] + + 4%|██▊ | 36/1019 [03:16<1:19:58, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:47:58,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:00,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|██▊ | 37/1019 [03:21<1:18:24, 4.79s/it] + + 4%|██▊ | 37/1019 [03:21<1:18:24, 4.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:02,530 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:04,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|██▉ | 38/1019 
[03:25<1:16:26, 4.68s/it] + + 4%|██▉ | 38/1019 [03:25<1:16:26, 4.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:06,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:08,958 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███ | 39/1019 [03:30<1:14:28, 4.56s/it] + + 4%|███ | 39/1019 [03:30<1:14:28, 4.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:11,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:13,136 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███ | 40/1019 [03:34<1:12:31, 4.45s/it] + + 4%|███ | 40/1019 [03:34<1:12:31, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:15,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:17,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▏ | 41/1019 [03:38<1:09:50, 4.28s/it] + + 4%|███▏ | 41/1019 [03:38<1:09:50, 4.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:18,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3835, 'learning_rate': 2.3999999999999997e-05, 'epoch': 0.04} +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:20,709 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▎ | 42/1019 [03:41<1:06:43, 4.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:22,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:24,087 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed + 4%|███▎ | 43/1019 [03:45<1:03:08, 3.88s/it] + + 4%|███▎ | 43/1019 [03:45<1:03:08, 3.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:25,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:27,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▍ | 44/1019 [03:48<59:13, 3.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:28,660 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4275, 'learning_rate': 2.52e-05, 'epoch': 0.04} +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:29,876 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▌ | 45/1019 [03:51<54:32, 3.36s/it] + 4%|███▌ | 45/1019 [03:51<54:32, 3.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:31,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:32,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 5%|███▋ | 46/1019 [03:53<49:46, 3.07s/it] + 5%|███▋ | 46/1019 [03:53<49:46, 3.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:33,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:48:34,443 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 5%|███▋ | 47/1019 [03:55<45:22, 2.80s/it] + 5%|███▋ | 47/1019 [03:55<45:22, 2.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:35,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-03-02 21:48:36,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|███▊ | 48/1019 [03:57<41:00, 2.53s/it]
+ 5%|███▊ | 48/1019 [03:57<41:00, 2.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:37,242 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 21:48:38,000 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|███▉ | 49/1019 [03:59<36:23, 2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:38,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.0704, 'learning_rate': 2.7599999999999997e-05, 'epoch': 0.05}
+[WARNING|modeling_utils.py:388] 2022-03-02 21:48:39,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|███▉ | 50/1019 [04:01<35:09, 2.18s/it]
+ 5%|███▉ | 50/1019 [04:01<35:09, 2.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:43,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|████ | 51/1019 [04:07<56:07, 3.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:49,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|████ | 52/1019 [04:13<1:09:13, 4.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:48:55,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|████ | 53/1019 [04:19<1:17:42, 4.83s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:02,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|████▏ | 54/1019 [04:26<1:25:49, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:08,273 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|████▎ | 55/1019 [04:32<1:28:47, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:14,372 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 5%|████▎ | 56/1019 [04:38<1:31:34, 5.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:20,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▍ | 57/1019 [04:44<1:33:03, 5.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:26,316 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▍ | 58/1019 [04:50<1:32:55, 5.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:32,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▌ | 59/1019 [04:56<1:32:56, 5.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:37,910 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▋ | 60/1019 [05:02<1:32:39, 5.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:43,767 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▋ | 61/1019 [05:07<1:32:23, 5.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:49,420 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▊ | 62/1019 [05:13<1:31:44, 5.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:49:55,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▉ | 63/1019 [05:19<1:30:55, 5.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:00,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|████▉ | 64/1019 [05:24<1:29:55, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:06,176 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|█████ | 65/1019 [05:30<1:29:10, 5.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:11,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 6%|█████ | 66/1019 [05:35<1:28:17, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:17,081 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▏ | 67/1019 [05:41<1:28:09, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:22,613 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▎ | 68/1019 [05:46<1:27:40, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:28,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▎ | 69/1019 [05:52<1:27:15, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:33,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▍ | 70/1019 [05:57<1:28:23, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:39,221 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▌ | 71/1019 [06:03<1:27:04, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:44,677 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▌ | 72/1019 [06:08<1:26:47, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:50,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▋ | 73/1019 [06:13<1:26:14, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:50:55,436 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▋ | 74/1019 [06:19<1:25:06, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:00,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▊ | 75/1019 [06:24<1:23:53, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:05,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 7%|█████▉ | 76/1019 [06:29<1:22:52, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:10,919 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 8%|█████▉ | 77/1019 [06:34<1:21:48, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:16,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
8%|█████▉ | 77/1019 [06:34<1:21:48, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:16,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████ | 78/1019 [06:39<1:21:15, 5.18s/it]g-point operations will not be computed-02 21:51:16,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████ | 78/1019 [06:39<1:21:15, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:21,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████ | 78/1019 [06:39<1:21:15, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:21,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████ | 79/1019 [06:44<1:20:53, 5.16s/it]g-point operations will not be computed-02 21:51:21,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████ | 79/1019 [06:44<1:20:53, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:26,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▏ | 80/1019 [06:49<1:19:59, 5.11s/it]g-point operations will not be computed-02 21:51:26,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▏ | 80/1019 [06:49<1:19:59, 5.11s/it]g-point operations will not be computed-02 21:51:26,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▏ | 80/1019 [06:49<1:19:59, 5.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:31,118 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 81/1019 [06:54<1:19:04, 5.06s/it]g-point operations will not be computed-02 21:51:31,118 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 81/1019 [06:54<1:19:04, 5.06s/it]g-point operations will not be computed-02 21:51:31,118 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 81/1019 [06:54<1:19:04, 5.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:36,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 81/1019 [06:54<1:19:04, 5.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:36,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 82/1019 [06:59<1:18:18, 5.01s/it]g-point operations will not be computed-02 21:51:36,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 82/1019 [06:59<1:18:18, 5.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:40,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▍ | 83/1019 [07:04<1:17:36, 4.97s/it]g-point operations will not be computed-02 21:51:40,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▍ | 83/1019 [07:04<1:17:36, 4.97s/it]g-point operations will not be computed-02 21:51:40,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▍ | 83/1019 [07:04<1:17:36, 4.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:45,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 84/1019 [07:09<1:16:08, 4.89s/it]g-point operations will not be computed-02 21:51:45,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 84/1019 [07:09<1:16:08, 
4.89s/it]g-point operations will not be computed-02 21:51:45,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 84/1019 [07:09<1:16:08, 4.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:50,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 84/1019 [07:09<1:16:08, 4.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:50,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 85/1019 [07:13<1:14:32, 4.79s/it]g-point operations will not be computed-02 21:51:50,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 85/1019 [07:13<1:14:32, 4.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:54,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▋ | 86/1019 [07:18<1:13:24, 4.72s/it]g-point operations will not be computed-02 21:51:54,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▋ | 86/1019 [07:18<1:13:24, 4.72s/it]g-point operations will not be computed-02 21:51:54,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▋ | 86/1019 [07:18<1:13:24, 4.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:51:59,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▋ | 87/1019 [07:22<1:11:53, 4.63s/it]g-point operations will not be computed-02 21:51:59,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▋ | 87/1019 [07:22<1:11:53, 4.63s/it]g-point operations will not be computed-02 21:51:59,396 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 9%|██████▋ | 87/1019 [07:22<1:11:53, 4.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:03,743 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▊ | 88/1019 [07:27<1:10:40, 4.56s/it]g-point operations will not be computed-02 21:52:03,743 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▊ | 88/1019 [07:27<1:10:40, 4.56s/it]g-point operations will not be computed-02 21:52:03,743 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▊ | 88/1019 [07:27<1:10:40, 4.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:08,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 89/1019 [07:31<1:08:37, 4.43s/it]g-point operations will not be computed-02 21:52:08,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 89/1019 [07:31<1:08:37, 4.43s/it]g-point operations will not be computed-02 21:52:08,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 89/1019 [07:31<1:08:37, 4.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:12,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 90/1019 [07:35<1:06:13, 4.28s/it]g-point operations will not be computed-02 21:52:12,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 90/1019 [07:35<1:06:13, 4.28s/it]g-point operations will not be computed-02 21:52:12,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 90/1019 [07:35<1:06:13, 4.28s/it][WARNING|modeling_utils.py:388] 
2022-03-02 21:52:15,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████ | 91/1019 [07:38<1:03:29, 4.11s/it]g-point operations will not be computed-02 21:52:15,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████ | 91/1019 [07:38<1:03:29, 4.11s/it]g-point operations will not be computed-02 21:52:15,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████ | 91/1019 [07:38<1:03:29, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:19,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▏ | 92/1019 [07:42<1:00:37, 3.92s/it]g-point operations will not be computed-02 21:52:19,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▏ | 92/1019 [07:42<1:00:37, 3.92s/it]g-point operations will not be computed-02 21:52:19,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:52:24,480 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 21:52:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 21:52:24,480 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 21:52:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▍ | 93/1019 [07:45<57:43, 3.74s/it] + 9%|███████▍ | 93/1019 [07:45<57:43, 3.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:26,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
9%|███████▍ | 94/1019 [07:48<54:30, 3.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:26,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▍ | 94/1019 [07:48<54:30, 3.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:26,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 95/1019 [07:51<51:04, 3.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:29,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 95/1019 [07:51<51:04, 3.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:29,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 95/1019 [07:51<51:04, 3.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:31,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 95/1019 [07:51<51:04, 3.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:31,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▋ | 96/1019 [07:54<47:26, 3.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:34,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▋ | 96/1019 [07:54<47:26, 3.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:34,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▋ | 97/1019 [07:56<43:21, 2.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:36,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▋ | 97/1019 [07:56<43:21, 2.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:36,158 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 99/1019 [08:00<35:31, 2.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:38,021 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 99/1019 [08:00<35:31, 2.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:38,021 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 100/1019 [08:02<33:54, 2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:39,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 100/1019 [08:02<33:54, 2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:39,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 100/1019 [08:02<33:54, 2.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:44,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 101/1019 [08:08<52:49, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:44,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 101/1019 [08:08<52:49, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:44,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 101/1019 [08:08<52:49, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:50,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 102/1019 [08:14<1:04:18, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:50,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 102/1019 [08:14<1:04:18, 
4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:50,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 102/1019 [08:14<1:04:18, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:56,168 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 103/1019 [08:20<1:12:36, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:56,168 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 103/1019 [08:20<1:12:36, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:52:56,168 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 103/1019 [08:20<1:12:36, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:02,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 104/1019 [08:26<1:17:24, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:02,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 104/1019 [08:26<1:17:24, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:02,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▉ | 104/1019 [08:26<1:17:24, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:07,956 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████ | 105/1019 [08:32<1:21:10, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:07,956 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████ | 105/1019 [08:32<1:21:10, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:07,956 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 10%|████████ | 105/1019 [08:32<1:21:10, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:13,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████ | 106/1019 [08:38<1:23:42, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:13,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████ | 106/1019 [08:38<1:23:42, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:13,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████ | 106/1019 [08:38<1:23:42, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:19,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▏ | 107/1019 [08:43<1:25:05, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:19,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▏ | 107/1019 [08:43<1:25:05, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:19,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▏ | 107/1019 [08:43<1:25:05, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:25,569 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▏ | 107/1019 [08:43<1:25:05, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:25,569 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▎ | 108/1019 [08:49<1:25:51, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:25,569 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▎ | 108/1019 
[08:49<1:25:51, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:31,296 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▎ | 109/1019 [08:55<1:26:10, 5.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:31,296 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▎ | 109/1019 [08:55<1:26:10, 5.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:31,296 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▎ | 109/1019 [08:55<1:26:10, 5.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:37,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 110/1019 [09:01<1:26:19, 5.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:37,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 110/1019 [09:01<1:26:19, 5.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:37,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 110/1019 [09:01<1:26:19, 5.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:42,691 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 110/1019 [09:01<1:26:19, 5.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:42,691 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 111/1019 [09:06<1:25:46, 5.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:42,691 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 111/1019 [09:06<1:25:46, 5.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:48,316 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▍ | 111/1019 [09:06<1:25:46, 5.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:48,316 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 112/1019 [09:12<1:25:26, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:48,316 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 112/1019 [09:12<1:25:26, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:53,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 112/1019 [09:12<1:25:26, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:53,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▋ | 113/1019 [09:17<1:25:00, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:53,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▋ | 113/1019 [09:17<1:25:00, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:59,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▋ | 113/1019 [09:17<1:25:00, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:59,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▋ | 114/1019 [09:23<1:24:25, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:53:59,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▋ | 114/1019 [09:23<1:24:25, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:04,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + [WARNING|modeling_utils.py:388] 2022-03-02 21:54:04,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + [WARNING|modeling_utils.py:388] 2022-03-02 21:54:04,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 115/1019 [09:28<1:24:02, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:04,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 115/1019 [09:28<1:24:02, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:10,514 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 115/1019 [09:28<1:24:02, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:10,514 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▉ | 116/1019 [09:34<1:23:47, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:10,514 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▉ | 116/1019 [09:34<1:23:47, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:16,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▉ | 116/1019 [09:34<1:23:47, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:16,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▉ | 117/1019 [09:40<1:23:36, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:16,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▉ | 117/1019 [09:40<1:23:36, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:21,515 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 11%|████████▉ | 117/1019 [09:40<1:23:36, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:21,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▉ | 117/1019 [09:40<1:23:36, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:21,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████ | 118/1019 [09:45<1:22:32, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:21,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████ | 118/1019 [09:45<1:22:32, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:26,810 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████ | 118/1019 [09:45<1:22:32, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:26,810 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████ | 119/1019 [09:50<1:21:38, 5.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:26,810 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████ | 119/1019 [09:50<1:21:38, 5.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:32,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████ | 119/1019 [09:50<1:21:38, 5.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:32,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▏ | 120/1019 [09:56<1:21:25, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:32,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▏ | 120/1019 [09:56<1:21:25, 
5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:37,615 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 121/1019 [10:01<1:20:48, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:37,615 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 121/1019 [10:01<1:20:48, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:37,615 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 121/1019 [10:01<1:20:48, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:42,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 121/1019 [10:01<1:20:48, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:42,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 122/1019 [10:06<1:20:47, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:42,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 122/1019 [10:06<1:20:47, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:48,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 122/1019 [10:06<1:20:47, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:48,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▎ | 122/1019 [10:06<1:20:47, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:48,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 123/1019 [10:12<1:20:07, 5.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:48,285 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 123/1019 [10:12<1:20:07, 5.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:53,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 123/1019 [10:12<1:20:07, 5.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:53,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 124/1019 [10:17<1:19:23, 5.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:53,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 124/1019 [10:17<1:19:23, 5.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:58,749 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▌ | 125/1019 [10:22<1:18:24, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:58,749 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▌ | 125/1019 [10:22<1:18:24, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:54:58,749 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▌ | 125/1019 [10:22<1:18:24, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:03,837 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▋ | 126/1019 [10:27<1:17:40, 5.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:03,837 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▋ | 126/1019 [10:27<1:17:40, 5.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:03,837 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 12%|█████████▋ | 126/1019 [10:27<1:17:40, 5.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:09,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▋ | 127/1019 [10:32<1:17:22, 5.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:09,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▋ | 127/1019 [10:32<1:17:22, 5.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:09,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▋ | 127/1019 [10:32<1:17:22, 5.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:14,067 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▋ | 127/1019 [10:32<1:17:22, 5.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:14,067 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▊ | 128/1019 [10:37<1:16:24, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:14,067 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▊ | 128/1019 [10:37<1:16:24, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:19,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▊ | 129/1019 [10:42<1:15:42, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:19,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▊ | 129/1019 [10:42<1:15:42, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:19,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▊ | 129/1019 [10:42<1:15:42, 
5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:24,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▉ | 130/1019 [10:47<1:14:51, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:24,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▉ | 130/1019 [10:47<1:14:51, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:24,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▉ | 130/1019 [10:47<1:14:51, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:28,933 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|█████████▉ | 130/1019 [10:47<1:14:51, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:28,933 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 131/1019 [10:52<1:13:59, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:28,933 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 131/1019 [10:52<1:13:59, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:33,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 131/1019 [10:52<1:13:59, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:33,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 132/1019 [10:57<1:12:55, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:33,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 132/1019 [10:57<1:12:55, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:38,539 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▏ | 133/1019 [11:02<1:12:04, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:38,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▏ | 133/1019 [11:02<1:12:04, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:38,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▏ | 133/1019 [11:02<1:12:04, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:43,313 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▏ | 133/1019 [11:02<1:12:04, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:43,313 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▎ | 134/1019 [11:06<1:11:07, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:43,313 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▎ | 134/1019 [11:06<1:11:07, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:47,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▎ | 134/1019 [11:06<1:11:07, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:47,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▎ | 135/1019 [11:11<1:10:00, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:47,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▎ | 135/1019 [11:11<1:10:00, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:52,459 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed + 13%|██████████▎ | 135/1019 [11:11<1:10:00, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:52,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 136/1019 [11:15<1:09:06, 4.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:52,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 136/1019 [11:15<1:09:06, 4.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:57,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 137/1019 [11:20<1:07:52, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:57,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 137/1019 [11:20<1:07:52, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:55:57,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 137/1019 [11:20<1:07:52, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:01,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▌ | 138/1019 [11:24<1:06:15, 4.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:01,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▌ | 138/1019 [11:24<1:06:15, 4.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:01,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▌ | 138/1019 [11:24<1:06:15, 4.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:05,525 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 139/1019 [11:28<1:04:26, 
4.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:05,525 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 139/1019 [11:28<1:04:26, 4.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:05,525 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 139/1019 [11:28<1:04:26, 4.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:09,623 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 140/1019 [11:32<1:02:36, 4.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:09,623 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 140/1019 [11:32<1:02:36, 4.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:09,623 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 140/1019 [11:32<1:02:36, 4.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:13,496 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▋ | 140/1019 [11:32<1:02:36, 4.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:13,496 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▊ | 141/1019 [11:36<1:00:12, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:13,496 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▊ | 141/1019 [11:36<1:00:12, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:17,075 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|██████████▊ | 141/1019 [11:36<1:00:12, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:17,075 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▏ | 142/1019 [11:39<57:04, 3.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:20,339 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▏ | 143/1019 [11:43<53:45, 3.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:20,339 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▏ | 143/1019 [11:43<53:45, 3.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:20,339 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▏ | 143/1019 [11:43<53:45, 3.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:23,359 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▏ | 143/1019 [11:43<53:45, 3.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:23,359 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▎ | 144/1019 [11:46<50:22, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:26,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▎ | 144/1019 [11:46<50:22, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:26,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▍ | 145/1019 [11:48<46:25, 3.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:26,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▍ | 145/1019 [11:48<46:25, 3.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:26,134 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed + 14%|███████████▍ | 146/1019 [11:50<42:39, 2.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:28,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▍ | 146/1019 [11:50<42:39, 2.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:28,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▌ | 147/1019 [11:53<39:00, 2.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:30,812 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▌ | 147/1019 [11:53<39:00, 2.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:30,812 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▌ | 148/1019 [11:54<35:19, 2.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:34,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▌ | 148/1019 [11:54<35:19, 2.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:34,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 149/1019 [11:56<32:14, 2.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:36,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 149/1019 [11:56<32:14, 2.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:36,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 150/1019 [11:58<31:20, 2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:36,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 150/1019 [11:58<31:20, 
2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:40,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 150/1019 [11:58<31:20, 2.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:40,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 151/1019 [12:05<49:45, 3.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:40,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 151/1019 [12:05<49:45, 3.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:46,858 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 151/1019 [12:05<49:45, 3.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:46,858 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 152/1019 [12:11<1:00:46, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:46,858 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 152/1019 [12:11<1:00:46, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 152/1019 [12:11<1:00:46, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 153/1019 [12:17<1:08:44, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 153/1019 [12:17<1:08:44, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 153/1019 [12:17<1:08:44, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 154/1019 [12:22<1:13:32, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 154/1019 [12:22<1:13:32, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 154/1019 [12:22<1:13:32, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 155/1019 [12:28<1:16:48, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 155/1019 [12:28<1:16:48, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 155/1019 [12:28<1:16:48, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▊ | 155/1019 [12:28<1:16:48, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3932, 'learning_rate': 9.18e-05, 'epoch': 0.15} + 15%|███████████▊ | 155/1019 [12:28<1:16:48, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████ | 157/1019 [12:40<1:19:38, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████ | 157/1019 [12:40<1:19:38, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3333, 'learning_rate': 9.24e-05, 'epoch': 0.15} + 15%|████████████ | 157/1019 [12:40<1:19:38, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████ | 158/1019 [12:46<1:20:50, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████ | 158/1019 [12:46<1:20:50, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████ | 158/1019 [12:46<1:20:50, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▏ | 159/1019 [12:51<1:20:58, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▏ | 159/1019 [12:51<1:20:58, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▏ | 159/1019 [12:51<1:20:58, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▏ | 159/1019 [12:51<1:20:58, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3985, 'learning_rate': 9.419999999999999e-05, 'epoch': 0.16} + 16%|████████████▏ | 159/1019 [12:51<1:20:58, 5.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▎ | 161/1019 [13:03<1:20:27, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▎ | 161/1019 [13:03<1:20:27, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3598, 'learning_rate': 9.479999999999999e-05, 'epoch': 0.16} + 16%|████████████▎ | 161/1019 [13:03<1:20:27, 5.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▍ | 162/1019 [13:08<1:19:49, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▍ | 162/1019 [13:08<1:19:49, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▍ | 162/1019 [13:08<1:19:49, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▍ | 
163/1019 [13:14<1:19:31, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▍ | 163/1019 [13:14<1:19:31, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + [WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + [WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3323, 'learning_rate': 9.659999999999999e-05, 'epoch': 0.16} + [WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▋ | 165/1019 [13:25<1:18:56, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▋ | 165/1019 [13:25<1:18:56, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3129, 'learning_rate': 9.719999999999999e-05, 'epoch': 0.16} + 16%|████████████▋ | 165/1019 [13:25<1:18:56, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▋ | 166/1019 [13:30<1:18:37, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▋ | 166/1019 [13:30<1:18:37, 5.53s/it][WARNING|modeling_utils.py:388] 
2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▋ | 166/1019 [13:30<1:18:37, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▊ | 167/1019 [13:36<1:18:15, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▊ | 167/1019 [13:36<1:18:15, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▊ | 167/1019 [13:36<1:18:15, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▊ | 168/1019 [13:41<1:17:45, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▊ | 168/1019 [13:41<1:17:45, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▊ | 168/1019 [13:41<1:17:45, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|████████████▉ | 169/1019 [13:46<1:16:51, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|████████████▉ | 169/1019 [13:46<1:16:51, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed + 17%|█████████████ | 170/1019 [13:52<1:16:03, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████ | 170/1019 [13:52<1:16:03, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2216, 'learning_rate': 0.0001002, 'epoch': 0.17} + 17%|█████████████ | 170/1019 [13:52<1:16:03, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████ | 171/1019 [13:57<1:15:08, 5.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████ | 171/1019 [13:57<1:15:08, 5.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▏ | 172/1019 [14:02<1:14:27, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▏ | 172/1019 [14:02<1:14:27, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5884, 'learning_rate': 0.0001014, 'epoch': 0.17} + 17%|█████████████▏ | 172/1019 [14:02<1:14:27, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▏ | 173/1019 [14:07<1:13:49, 
5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▏ | 173/1019 [14:07<1:13:49, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▎ | 174/1019 [14:12<1:12:50, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▎ | 174/1019 [14:12<1:12:50, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.397, 'learning_rate': 0.0001026, 'epoch': 0.17} + 17%|█████████████▎ | 174/1019 [14:12<1:12:50, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▎ | 174/1019 [14:12<1:12:50, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3738, 'learning_rate': 0.00010319999999999999, 'epoch': 0.17} + 17%|█████████████▎ | 174/1019 [14:12<1:12:50, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▍ | 176/1019 [14:22<1:11:29, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▍ | 176/1019 [14:22<1:11:29, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +{'loss': 4.3302, 'learning_rate': 0.00010379999999999999, 'epoch': 0.17} + 17%|█████████████▌ | 177/1019 [14:27<1:11:23, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▌ | 177/1019 [14:27<1:11:23, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3525, 'learning_rate': 0.00010439999999999999, 'epoch': 0.17} + 17%|█████████████▋ | 178/1019 [14:32<1:10:42, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▋ | 178/1019 [14:32<1:10:42, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3466, 'learning_rate': 0.00010499999999999999, 'epoch': 0.17} + 18%|█████████████▋ | 179/1019 [14:37<1:09:59, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|█████████████▋ | 179/1019 [14:37<1:09:59, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2784, 'learning_rate': 0.00010559999999999998, 'epoch': 0.18} + 18%|█████████████▋ | 179/1019 [14:37<1:09:59, 5.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|█████████████▊ | 180/1019 [14:42<1:09:13, 4.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed + 18%|█████████████▊ | 180/1019 [14:42<1:09:13, 4.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|█████████████▊ | 181/1019 [14:47<1:08:24, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|█████████████▊ | 181/1019 [14:47<1:08:24, 4.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5627, 'learning_rate': 0.00010679999999999998, 'epoch': 0.18} + 18%|█████████████▉ | 182/1019 [14:52<1:07:56, 4.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|█████████████▉ | 182/1019 [14:52<1:07:56, 4.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4303, 'learning_rate': 0.00010739999999999998, 'epoch': 0.18} + 18%|█████████████▉ | 182/1019 [14:52<1:07:56, 4.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:56:52,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████ | 183/1019 [14:56<1:07:01, 4.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████ | 184/1019 [15:01<1:06:15, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████ | 184/1019 
[15:01<1:06:15, 4.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3409, 'learning_rate': 0.00010859999999999998, 'epoch': 0.18} + 18%|██████████████▏ | 185/1019 [15:05<1:05:12, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████▏ | 185/1019 [15:05<1:05:12, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4081, 'learning_rate': 0.00010919999999999998, 'epoch': 0.18} + 18%|██████████████▏ | 186/1019 [15:10<1:04:08, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████▏ | 186/1019 [15:10<1:04:08, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4373, 'learning_rate': 0.00010979999999999999, 'epoch': 0.18} + 18%|██████████████▎ | 187/1019 [15:14<1:02:40, 4.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████▎ | 187/1019 [15:14<1:02:40, 4.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4595, 'learning_rate': 0.00011039999999999999, 'epoch': 0.18} + 18%|██████████████▍ | 188/1019 [15:18<1:01:38, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 18%|██████████████▍ | 188/1019 [15:18<1:01:38, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2412, 'learning_rate': 0.00011099999999999999, 'epoch': 0.18} + 18%|██████████████▍ | 188/1019 [15:18<1:01:38, 4.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▍ | 189/1019 [15:22<1:00:04, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▍ | 189/1019 [15:22<1:00:04, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▍ | 189/1019 [15:22<1:00:04, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▉ | 190/1019 [15:26<58:29, 4.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▉ | 190/1019 [15:26<58:29, 4.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▉ | 190/1019 [15:26<58:29, 4.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▉ | 191/1019 [15:30<56:25, 4.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 19%|██████████████▉ | 191/1019 [15:30<56:25, 4.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|██████████████▉ | 191/1019 [15:30<56:25, 4.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████ | 192/1019 [15:34<54:14, 3.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████ | 192/1019 [15:34<54:14, 3.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 21:59:37,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:00:16,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:00:16,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:00:16,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▏ | 194/1019 [15:40<48:39, 3.54s/it] + 19%|███████████████▏ | 
194/1019 [15:40<48:39, 3.54s/it] +[WARNING|modeling_utils.py:388] 2022-03-02 22:00:22,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:00:22,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:00:22,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 196/1019 [15:46<42:18, 3.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:26,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 196/1019 [15:46<42:18, 3.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:26,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 197/1019 [15:48<39:07, 2.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:28,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 197/1019 [15:48<39:07, 2.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:28,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▌ | 198/1019 [15:50<35:50, 2.62s/it][WARNING|modeling_utils.py:388] 
2022-03-02 22:00:30,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▌ | 198/1019 [15:50<35:50, 2.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:30,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▌ | 199/1019 [15:52<32:25, 2.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▌ | 199/1019 [15:52<32:25, 2.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 200/1019 [15:54<30:45, 2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 200/1019 [15:54<30:45, 2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 200/1019 [15:54<30:45, 2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 200/1019 [15:54<30:45, 2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6225, 'learning_rate': 0.0001188, 'epoch': 0.2} + 20%|███████████████▋ | 200/1019 [15:54<30:45, 2.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▊ | 202/1019 [16:06<57:04, 
4.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▊ | 202/1019 [16:06<57:04, 4.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.449, 'learning_rate': 0.0001194, 'epoch': 0.2} + 20%|███████████████▌ | 203/1019 [16:12<1:03:53, 4.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▌ | 203/1019 [16:12<1:03:53, 4.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5392, 'learning_rate': 0.00011999999999999999, 'epoch': 0.2} + 20%|███████████████▌ | 204/1019 [16:18<1:08:24, 5.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▌ | 204/1019 [16:18<1:08:24, 5.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.386, 'learning_rate': 0.00012059999999999999, 'epoch': 0.2} + 20%|███████████████▌ | 204/1019 [16:18<1:08:24, 5.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 205/1019 [16:23<1:10:58, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 205/1019 [16:23<1:10:58, 5.23s/it][WARNING|modeling_utils.py:388] 
2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 205/1019 [16:23<1:10:58, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▋ | 205/1019 [16:23<1:10:58, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5596, 'learning_rate': 0.00012179999999999999, 'epoch': 0.2} + 20%|███████████████▋ | 205/1019 [16:23<1:10:58, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▊ | 207/1019 [16:34<1:13:30, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▊ | 207/1019 [16:34<1:13:30, 5.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3989, 'learning_rate': 0.0001224, 'epoch': 0.2} + 20%|███████████████▉ | 208/1019 [16:40<1:14:14, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|███████████████▉ | 208/1019 [16:40<1:14:14, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3484, 'learning_rate': 0.00012299999999999998, 'epoch': 0.2} + 20%|███████████████▉ | 208/1019 [16:40<1:14:14, 5.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 21%|███████████████▉ | 209/1019 [16:46<1:14:23, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|███████████████▉ | 209/1019 [16:46<1:14:23, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|███████████████▉ | 209/1019 [16:46<1:14:23, 5.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████ | 210/1019 [16:51<1:14:53, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████ | 210/1019 [16:51<1:14:53, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▏ | 211/1019 [16:57<1:14:57, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▏ | 211/1019 [16:57<1:14:57, 5.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4333, 'learning_rate': 0.00012479999999999997, 'epoch': 0.21} + 21%|████████████████▏ | 212/1019 [17:02<1:14:44, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▏ | 212/1019 [17:02<1:14:44, 
5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2828, 'learning_rate': 0.00012539999999999999, 'epoch': 0.21} + [WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + [WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2647, 'learning_rate': 0.00012599999999999997, 'epoch': 0.21} + [WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▍ | 214/1019 [17:13<1:13:49, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▍ | 214/1019 [17:13<1:13:49, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8459, 'learning_rate': 0.0001266, 'epoch': 0.21} + 21%|████████████████▍ | 215/1019 [17:19<1:13:18, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▍ | 215/1019 [17:19<1:13:18, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4322, 'learning_rate': 0.00012719999999999997, 'epoch': 0.21} + 21%|████████████████▌ | 216/1019 [17:24<1:12:29, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 21%|████████████████▌ | 216/1019 [17:24<1:12:29, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 5.006, 'learning_rate': 0.0001278, 'epoch': 0.21} + 21%|████████████████▌ | 217/1019 [17:29<1:12:07, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▌ | 217/1019 [17:29<1:12:07, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4751, 'learning_rate': 0.00012839999999999998, 'epoch': 0.21} + 21%|████████████████▌ | 217/1019 [17:29<1:12:07, 5.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 218/1019 [17:35<1:12:15, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 218/1019 [17:35<1:12:15, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▊ | 219/1019 [17:40<1:11:47, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▊ | 219/1019 [17:40<1:11:47, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5713, 'learning_rate': 0.00012959999999999998, 
'epoch': 0.21} + 22%|████████████████▊ | 220/1019 [17:45<1:11:08, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|████████████████▊ | 220/1019 [17:45<1:11:08, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3418, 'learning_rate': 0.0001302, 'epoch': 0.22} + 22%|████████████████▉ | 221/1019 [17:50<1:10:03, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|████████████████▉ | 221/1019 [17:50<1:10:03, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2869, 'learning_rate': 0.00013079999999999998, 'epoch': 0.22} + 22%|████████████████▉ | 222/1019 [17:56<1:09:32, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|████████████████▉ | 222/1019 [17:56<1:09:32, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5247, 'learning_rate': 0.0001314, 'epoch': 0.22} + 22%|█████████████████ | 223/1019 [18:01<1:09:09, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████ | 223/1019 [18:01<1:09:09, 5.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4046, 
'learning_rate': 0.00013199999999999998, 'epoch': 0.22} + 22%|█████████████████▏ | 224/1019 [18:06<1:08:19, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▏ | 224/1019 [18:06<1:08:19, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6572, 'learning_rate': 0.0001326, 'epoch': 0.22} + 22%|█████████████████▏ | 224/1019 [18:06<1:08:19, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▏ | 225/1019 [18:11<1:07:59, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▏ | 225/1019 [18:11<1:07:59, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▏ | 225/1019 [18:11<1:07:59, 5.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 226/1019 [18:16<1:07:17, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 226/1019 [18:16<1:07:17, 5.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▍ | 227/1019 [18:21<1:07:01, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 
22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▍ | 227/1019 [18:21<1:07:01, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.467, 'learning_rate': 0.0001344, 'epoch': 0.22} + 22%|█████████████████▍ | 227/1019 [18:21<1:07:01, 5.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▍ | 228/1019 [18:26<1:06:11, 5.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▍ | 228/1019 [18:26<1:06:11, 5.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▌ | 229/1019 [18:31<1:05:19, 4.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▌ | 229/1019 [18:31<1:05:19, 4.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.382, 'learning_rate': 0.0001356, 'epoch': 0.22} + 22%|█████████████████▌ | 229/1019 [18:31<1:05:19, 4.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▌ | 230/1019 [18:35<1:04:37, 4.91s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 23%|█████████████████▌ | 230/1019 [18:35<1:04:37, 4.91s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▌ | 230/1019 [18:35<1:04:37, 4.91s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▋ | 231/1019 [18:40<1:03:30, 4.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▋ | 231/1019 [18:40<1:03:30, 4.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▊ | 232/1019 [18:45<1:03:12, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▊ | 232/1019 [18:45<1:03:12, 4.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3485, 'learning_rate': 0.0001374, 'epoch': 0.23} + 23%|█████████████████▊ | 233/1019 [18:49<1:02:09, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▊ | 233/1019 [18:49<1:02:09, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4238, 'learning_rate': 0.000138, 'epoch': 0.23} + 23%|█████████████████▉ | 234/1019 [18:54<1:01:33, 4.71s/it][WARNING|modeling_utils.py:388] 
2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▉ | 234/1019 [18:54<1:01:33, 4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3862, 'learning_rate': 0.0001386, 'epoch': 0.23} + 23%|█████████████████▉ | 235/1019 [18:59<1:00:54, 4.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|█████████████████▉ | 235/1019 [18:59<1:00:54, 4.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4833, 'learning_rate': 0.0001392, 'epoch': 0.23} + 23%|██████████████████▌ | 236/1019 [19:03<59:32, 4.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 236/1019 [19:03<59:32, 4.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.303, 'learning_rate': 0.00013979999999999998, 'epoch': 0.23} + 23%|██████████████████▌ | 237/1019 [19:07<58:34, 4.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 237/1019 [19:07<58:34, 4.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3132, 'learning_rate': 0.0001404, 'epoch': 0.23} + 23%|██████████████████▋ | 238/1019 [19:12<57:32, 
4.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▋ | 238/1019 [19:12<57:32, 4.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5828, 'learning_rate': 0.00014099999999999998, 'epoch': 0.23} + 23%|██████████████████▊ | 239/1019 [19:16<56:19, 4.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▊ | 239/1019 [19:16<56:19, 4.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2905, 'learning_rate': 0.00014159999999999997, 'epoch': 0.23} + 24%|██████████████████▊ | 240/1019 [19:20<54:30, 4.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▊ | 240/1019 [19:20<54:30, 4.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8632, 'learning_rate': 0.0001422, 'epoch': 0.24} + 24%|██████████████████▉ | 241/1019 [19:23<52:43, 4.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 241/1019 [19:23<52:43, 4.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3419, 'learning_rate': 0.00014279999999999997, 'epoch': 0.24} + 
24%|██████████████████▉ | 241/1019 [19:23<52:43, 4.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 242/1019 [19:27<49:58, 3.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:00:31,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:04:09,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:04:09,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:04:12,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:04:12,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6121, 'learning_rate': 0.0001446, 'epoch': 0.24} + 24%|███████████████████▏ | 245/1019 [19:35<41:01, 3.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:15,926 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 245/1019 [19:35<41:01, 3.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:15,926 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 24%|███████████████████▎ | 246/1019 [19:38<37:51, 2.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:18,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▎ | 246/1019 [19:38<37:51, 2.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:18,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▍ | 247/1019 [19:40<34:20, 2.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:20,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▍ | 247/1019 [19:40<34:20, 2.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:20,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▍ | 248/1019 [19:42<30:55, 2.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:21,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▍ | 248/1019 [19:42<30:55, 2.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:21,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 5.6138, 'learning_rate': 0.000147, 'epoch': 0.24} + 24%|███████████████████▌ | 249/1019 [19:43<27:50, 2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 249/1019 [19:43<27:50, 2.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 250/1019 [19:45<26:44, 2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 
22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 250/1019 [19:45<26:44, 2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 250/1019 [19:45<26:44, 2.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 251/1019 [19:51<42:41, 3.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 251/1019 [19:51<42:41, 3.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 251/1019 [19:51<42:41, 3.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 252/1019 [19:57<52:32, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 252/1019 [19:57<52:32, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 252/1019 [19:57<52:32, 4.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 253/1019 [20:03<59:10, 4.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 
22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 253/1019 [20:03<59:10, 4.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▍ | 254/1019 [20:09<1:03:34, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▍ | 254/1019 [20:09<1:03:34, 4.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2367, 'learning_rate': 0.00015059999999999997, 'epoch': 0.25} + 25%|███████████████████▌ | 255/1019 [20:15<1:06:27, 5.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▌ | 255/1019 [20:15<1:06:27, 5.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3365, 'learning_rate': 0.0001512, 'epoch': 0.25} + 25%|███████████████████▌ | 256/1019 [20:21<1:08:35, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▌ | 256/1019 [20:21<1:08:35, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3775, 'learning_rate': 0.00015179999999999998, 'epoch': 0.25} + 25%|███████████████████▌ | 256/1019 [20:21<1:08:35, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02
22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 257/1019 [20:26<1:09:30, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 257/1019 [20:26<1:09:30, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 257/1019 [20:26<1:09:30, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 258/1019 [20:32<1:10:04, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▋ | 258/1019 [20:32<1:10:04, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 259/1019 [20:38<1:10:39, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 259/1019 [20:38<1:10:39, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3316, 'learning_rate': 0.0001536, 'epoch': 0.25} + 26%|███████████████████▉ | 260/1019 [20:43<1:10:47, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|███████████████████▉ | 260/1019 
[20:43<1:10:47, 5.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6933, 'learning_rate': 0.00015419999999999998, 'epoch': 0.26} + 26%|███████████████████▉ | 261/1019 [20:49<1:10:31, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|███████████████████▉ | 261/1019 [20:49<1:10:31, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3414, 'learning_rate': 0.0001548, 'epoch': 0.26} + 26%|████████████████████ | 262/1019 [20:54<1:09:56, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████ | 262/1019 [20:54<1:09:56, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4034, 'learning_rate': 0.00015539999999999998, 'epoch': 0.26} + 26%|████████████████████▏ | 263/1019 [21:00<1:09:55, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▏ | 263/1019 [21:00<1:09:55, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.383, 'learning_rate': 0.000156, 'epoch': 0.26} + 26%|████████████████████▏ | 264/1019 [21:05<1:09:34, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed + 26%|████████████████████▏ | 264/1019 [21:05<1:09:34, 5.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5517, 'learning_rate': 0.00015659999999999998, 'epoch': 0.26} + 26%|████████████████████▎ | 265/1019 [21:11<1:08:54, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▎ | 265/1019 [21:11<1:08:54, 5.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2632, 'learning_rate': 0.0001572, 'epoch': 0.26} + 26%|████████████████████▎ | 266/1019 [21:16<1:08:36, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▎ | 266/1019 [21:16<1:08:36, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6418, 'learning_rate': 0.0001578, 'epoch': 0.26} + 26%|████████████████████▎ | 266/1019 [21:16<1:08:36, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 267/1019 [21:21<1:08:25, 5.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 267/1019 [21:21<1:08:25, 5.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 26%|████████████████████▌ | 268/1019 [21:27<1:07:45, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▌ | 268/1019 [21:27<1:07:45, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6749, 'learning_rate': 0.000159, 'epoch': 0.26} + 26%|████████████████████▌ | 269/1019 [21:32<1:07:24, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▌ | 269/1019 [21:32<1:07:24, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.546, 'learning_rate': 0.0001596, 'epoch': 0.26} + 26%|████████████████████▌ | 269/1019 [21:32<1:07:24, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▋ | 270/1019 [21:37<1:07:11, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▋ | 270/1019 [21:37<1:07:11, 5.38s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▋ | 271/1019 [21:43<1:06:36, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▋ | 271/1019 [21:43<1:06:36, 
5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3069, 'learning_rate': 0.0001608, 'epoch': 0.27} + 27%|████████████████████▋ | 271/1019 [21:43<1:06:36, 5.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▊ | 272/1019 [21:48<1:05:59, 5.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▊ | 272/1019 [21:48<1:05:59, 5.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▉ | 273/1019 [21:53<1:05:20, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▉ | 273/1019 [21:53<1:05:20, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3899, 'learning_rate': 0.000162, 'epoch': 0.27} + 27%|████████████████████▉ | 273/1019 [21:53<1:05:20, 5.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▉ | 274/1019 [21:58<1:04:55, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|████████████████████▉ | 274/1019 [21:58<1:04:55, 5.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████ | 275/1019 [22:03<1:04:12, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████ | 275/1019 [22:03<1:04:12, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4663, 'learning_rate': 0.0001632, 'epoch': 0.27} + 27%|█████████████████████ | 275/1019 [22:03<1:04:12, 5.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▏ | 276/1019 [22:08<1:03:56, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▏ | 276/1019 [22:08<1:03:56, 5.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▏ | 277/1019 [22:13<1:03:07, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▏ | 277/1019 [22:13<1:03:07, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9303, 'learning_rate': 0.0001644, 'epoch': 0.27} + 27%|█████████████████████▏ | 277/1019 [22:13<1:03:07, 5.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 27%|█████████████████████▎ | 278/1019 [22:18<1:02:20, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▎ | 278/1019 [22:18<1:02:20, 5.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▎ | 279/1019 [22:23<1:01:46, 5.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▎ | 279/1019 [22:23<1:01:46, 5.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.528, 'learning_rate': 0.0001656, 'epoch': 0.27} + 27%|█████████████████████▍ | 280/1019 [22:28<1:01:17, 4.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 280/1019 [22:28<1:01:17, 4.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5544, 'learning_rate': 0.0001662, 'epoch': 0.27} + 27%|█████████████████████▍ | 280/1019 [22:28<1:01:17, 4.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|█████████████████████▌ | 281/1019 [22:33<1:00:24, 4.91s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|█████████████████████▌ | 281/1019 [22:33<1:00:24, 
4.91s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▏ | 282/1019 [22:38<59:55, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▏ | 282/1019 [22:38<59:55, 4.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4356, 'learning_rate': 0.0001674, 'epoch': 0.28} + 28%|██████████████████████▏ | 283/1019 [22:42<59:32, 4.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▏ | 283/1019 [22:42<59:32, 4.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3989, 'learning_rate': 0.000168, 'epoch': 0.28} + 28%|██████████████████████▏ | 283/1019 [22:42<59:32, 4.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 284/1019 [22:47<58:34, 4.78s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 284/1019 [22:47<58:34, 4.78s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 285/1019 [22:52<57:35, 4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 285/1019 [22:52<57:35, 4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9425, 'learning_rate': 0.00016919999999999997, 'epoch': 0.28} + 28%|██████████████████████▍ | 286/1019 [22:56<56:37, 4.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▍ | 286/1019 [22:56<56:37, 4.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.384, 'learning_rate': 0.00016979999999999998, 'epoch': 0.28} + 28%|██████████████████████▌ | 287/1019 [23:00<55:37, 4.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▌ | 287/1019 [23:00<55:37, 4.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4586, 'learning_rate': 0.00017039999999999997, 'epoch': 0.28} + 28%|██████████████████████▌ | 288/1019 [23:05<54:19, 4.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▌ | 288/1019 [23:05<54:19, 4.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6842, 'learning_rate': 0.00017099999999999998, 'epoch': 0.28} + 28%|██████████████████████▋ | 289/1019 [23:09<52:47, 
4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 289/1019 [23:09<52:47, 4.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4578, 'learning_rate': 0.00017159999999999997, 'epoch': 0.28} + 28%|██████████████████████▊ | 290/1019 [23:13<51:07, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▊ | 290/1019 [23:13<51:07, 4.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8488, 'learning_rate': 0.00017219999999999998, 'epoch': 0.28} + 29%|██████████████████████▊ | 291/1019 [23:16<49:10, 4.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▊ | 291/1019 [23:16<49:10, 4.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.846, 'learning_rate': 0.00017279999999999997, 'epoch': 0.29} + 29%|██████████████████████▉ | 292/1019 [23:20<46:59, 3.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 292/1019 [23:20<46:59, 3.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 
2022-03-02 22:08:02,296 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:02,296 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6816, 'learning_rate': 0.00017399999999999997, 'epoch': 0.29} + 29%|███████████████████████ | 294/1019 [23:26<41:56, 3.47s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████ | 294/1019 [23:26<41:56, 3.47s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:08,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:08,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:10,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:10,463 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:12,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:12,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:14,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:14,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:16,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:16,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:18,110 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:08:18,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6737, 'learning_rate': 0.00017819999999999997, 'epoch': 0.29} + 30%|███████████████████████▋ | 301/1019 [23:45<39:50, 3.33s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▋ | 301/1019 [23:45<39:50, 3.33s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9764, 'learning_rate': 0.00017879999999999998, 'epoch': 0.3} + 30%|███████████████████████▋ | 302/1019 [23:51<49:18, 4.13s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▋ | 302/1019 [23:51<49:18, 4.13s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6205, 'learning_rate': 0.00017939999999999997, 'epoch': 0.3} + 30%|███████████████████████▊ | 303/1019 [23:57<55:23, 4.64s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▊ | 303/1019 [23:57<55:23, 4.64s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +{'loss': 4.7629, 'learning_rate': 0.00017999999999999998, 'epoch': 0.3} + 30%|███████████████████████▊ | 303/1019 [23:57<55:23, 4.64s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▊ | 304/1019 [24:03<59:20, 4.98s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▊ | 304/1019 [24:03<59:20, 4.98s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▎ | 305/1019 [24:08<1:02:10, 5.23s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▎ | 305/1019 [24:08<1:02:10, 5.23s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.353, 'learning_rate': 0.00018119999999999999, 'epoch': 0.3} + 30%|███████████████████████▎ | 305/1019 [24:08<1:02:10, 5.23s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▍ | 306/1019 [24:14<1:03:48, 5.37s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▍ | 306/1019 [24:14<1:03:48, 5.37s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▍ | 
306/1019 [24:14<1:03:48, 5.37s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▍ | 307/1019 [24:20<1:05:09, 5.49s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▍ | 307/1019 [24:20<1:05:09, 5.49s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▍ | 307/1019 [24:20<1:05:09, 5.49s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 308/1019 [24:26<1:05:50, 5.56s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 308/1019 [24:26<1:05:50, 5.56s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 308/1019 [24:26<1:05:50, 5.56s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 308/1019 [24:26<1:05:50, 5.56s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7656, 'learning_rate': 0.0001836, 'epoch': 0.3} + 30%|███████████████████████▌ | 308/1019 [24:26<1:05:50, 5.56s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▋ | 310/1019 [24:37<1:06:17, 5.61s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▋ | 310/1019 [24:37<1:06:17, 5.61s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7829, 'learning_rate': 0.00018419999999999998, 'epoch': 0.3} + 30%|███████████████████████▋ | 310/1019 [24:37<1:06:17, 5.61s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|███████████████████████▊ | 311/1019 [24:42<1:06:04, 5.60s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|███████████████████████▊ | 311/1019 [24:42<1:06:04, 5.60s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|███████████████████████▉ | 312/1019 [24:48<1:05:36, 5.57s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|███████████████████████▉ | 312/1019 [24:48<1:05:36, 5.57s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6093, 'learning_rate': 0.00018539999999999998, 'epoch': 0.31} + 31%|███████████████████████▉ | 312/1019 [24:48<1:05:36, 5.57s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 31%|███████████████████████▉ | 313/1019 [24:53<1:05:06, 5.53s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|███████████████████████▉ | 313/1019 [24:53<1:05:06, 5.53s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.485, 'learning_rate': 0.000186, 'epoch': 0.31} + 31%|███████████████████████▉ | 313/1019 [24:53<1:05:06, 5.53s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████ | 314/1019 [24:59<1:04:42, 5.51s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████ | 314/1019 [24:59<1:04:42, 5.51s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████ | 314/1019 [24:59<1:04:42, 5.51s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████ | 314/1019 [24:59<1:04:42, 5.51s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4405, 'learning_rate': 0.0001872, 'epoch': 0.31} + 31%|████████████████████████ | 314/1019 [24:59<1:04:42, 5.51s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
31%|████████████████████████▏ | 316/1019 [25:10<1:04:13, 5.48s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████▏ | 316/1019 [25:10<1:04:13, 5.48s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8867, 'learning_rate': 0.00018779999999999998, 'epoch': 0.31} + 31%|████████████████████████▏ | 316/1019 [25:10<1:04:13, 5.48s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████▎ | 317/1019 [25:15<1:03:47, 5.45s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████▎ | 317/1019 [25:15<1:03:47, 5.45s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████▎ | 318/1019 [25:20<1:03:17, 5.42s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████▎ | 318/1019 [25:20<1:03:17, 5.42s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5744, 'learning_rate': 0.00018899999999999999, 'epoch': 0.31} + 31%|████████████████████████▍ | 319/1019 [25:26<1:02:49, 5.38s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
31%|████████████████████████▍ | 319/1019 [25:26<1:02:49, 5.38s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6423, 'learning_rate': 0.00018959999999999997, 'epoch': 0.31} + 31%|████████████████████████▍ | 320/1019 [25:31<1:02:25, 5.36s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|████████████████████████▍ | 320/1019 [25:31<1:02:25, 5.36s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9354, 'learning_rate': 0.0001902, 'epoch': 0.31} + 32%|████████████████████████▌ | 321/1019 [25:36<1:02:01, 5.33s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|████████████████████████▌ | 321/1019 [25:36<1:02:01, 5.33s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3763, 'learning_rate': 0.00019079999999999998, 'epoch': 0.31} + 32%|████████████████████████▌ | 321/1019 [25:36<1:02:01, 5.33s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|████████████████████████▋ | 322/1019 [25:42<1:01:49, 5.32s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|████████████████████████▋ | 322/1019 [25:42<1:01:49, 5.32s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 32%|████████████████████████▋ | 323/1019 [25:47<1:01:02, 5.26s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|████████████████████████▋ | 323/1019 [25:47<1:01:02, 5.26s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3223, 'learning_rate': 0.00019199999999999998, 'epoch': 0.32} + 32%|████████████████████████▊ | 324/1019 [25:52<1:00:10, 5.19s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|████████████████████████▊ | 324/1019 [25:52<1:00:10, 5.19s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3524, 'learning_rate': 0.0001926, 'epoch': 0.32} + 32%|█████████████████████████▌ | 325/1019 [25:57<59:12, 5.12s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▌ | 325/1019 [25:57<59:12, 5.12s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5332, 'learning_rate': 0.00019319999999999998, 'epoch': 0.32} + 32%|█████████████████████████▌ | 326/1019 [26:02<58:54, 5.10s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▌ | 326/1019 [26:02<58:54, 5.10s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed +{'loss': 4.5867, 'learning_rate': 0.0001938, 'epoch': 0.32} + 32%|█████████████████████████▋ | 327/1019 [26:07<58:25, 5.07s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▋ | 327/1019 [26:07<58:25, 5.07s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3014, 'learning_rate': 0.00019439999999999998, 'epoch': 0.32} + 32%|█████████████████████████▊ | 328/1019 [26:12<57:49, 5.02s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▊ | 328/1019 [26:12<57:49, 5.02s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5994, 'learning_rate': 0.000195, 'epoch': 0.32} + 32%|█████████████████████████▊ | 328/1019 [26:12<57:49, 5.02s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▊ | 328/1019 [26:12<57:49, 5.02s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▊ | 328/1019 [26:12<57:49, 5.02s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7089, 'learning_rate': 0.00019559999999999998, 'epoch': 0.32} + 32%|█████████████████████████▊ | 328/1019 [26:12<57:49, 5.02s/it]g-point 
operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▉ | 330/1019 [26:21<56:51, 4.95s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▉ | 330/1019 [26:21<56:51, 4.95s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▉ | 331/1019 [26:26<56:29, 4.93s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▉ | 331/1019 [26:26<56:29, 4.93s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3585, 'learning_rate': 0.00019679999999999999, 'epoch': 0.32} + 32%|█████████████████████████▉ | 331/1019 [26:26<56:29, 4.93s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▉ | 331/1019 [26:26<56:29, 4.93s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5627, 'learning_rate': 0.0001974, 'epoch': 0.33} + 32%|█████████████████████████▉ | 331/1019 [26:26<56:29, 4.93s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|█████████████████████████▉ | 331/1019 [26:26<56:29, 4.93s/it]g-point operations will not be computed-02
22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▏ | 333/1019 [26:36<54:57, 4.81s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▏ | 333/1019 [26:36<54:57, 4.81s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▏ | 334/1019 [26:40<53:55, 4.72s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▏ | 334/1019 [26:40<53:55, 4.72s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5892, 'learning_rate': 0.0001986, 'epoch': 0.33} + 33%|██████████████████████████▎ | 335/1019 [26:45<53:11, 4.67s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▎ | 335/1019 [26:45<53:11, 4.67s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5918, 'learning_rate': 0.0001992, 'epoch': 0.33} + 33%|██████████████████████████▎ | 335/1019 [26:45<53:11, 4.67s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▍ | 337/1019 [26:53<50:58, 4.49s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:11:36,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:11:36,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4946, 'learning_rate': 0.000201, 'epoch': 0.33} + 33%|██████████████████████████▌ | 339/1019 [27:02<48:30, 4.28s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▌ | 339/1019 [27:02<48:30, 4.28s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.676, 'learning_rate': 0.0002016, 'epoch': 0.33} + 33%|██████████████████████████▌ | 339/1019 [27:02<48:30, 4.28s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▋ | 340/1019 [27:05<47:02, 4.16s/it]g-point 
operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▋ | 340/1019 [27:05<47:02, 4.16s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▋ | 340/1019 [27:05<47:02, 4.16s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 341/1019 [27:09<45:09, 4.00s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 341/1019 [27:09<45:09, 4.00s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 341/1019 [27:09<45:09, 4.00s/it]g-point operations will not be computed-02 22:04:23,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|██████████████████████████▊ | 342/1019 [27:13<43:20, 3.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:53,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|██████████████████████████▊ | 342/1019 [27:13<43:20, 3.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:53,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|██████████████████████████▉ | 343/1019 [27:16<41:05, 3.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:53,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
34%|██████████████████████████▉ | 343/1019 [27:16<41:05, 3.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:53,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|██████████████████████████▉ | 343/1019 [27:16<41:05, 3.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:53,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████ | 344/1019 [27:19<38:36, 3.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████ | 344/1019 [27:19<38:36, 3.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████ | 345/1019 [27:21<36:01, 3.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████ | 345/1019 [27:21<36:01, 3.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:03,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:03,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:05,227 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:05,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:07,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:07,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:08,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:08,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:10,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:10,585 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:12:10,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▌ | 351/1019 [27:37<37:04, 3.33s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▌ | 351/1019 [27:37<37:04, 3.33s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▌ | 351/1019 [27:37<37:04, 3.33s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 352/1019 [27:43<45:50, 4.12s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 352/1019 [27:43<45:50, 4.12s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 352/1019 [27:43<45:50, 4.12s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 353/1019 [27:49<51:35, 4.65s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 353/1019 [27:49<51:35, 4.65s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 353/1019 [27:49<51:35, 4.65s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▋ | 353/1019 [27:49<51:35, 4.65s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5144, 'learning_rate': 0.00021059999999999997, 'epoch': 0.35} + 35%|███████████████████████████▋ | 353/1019 [27:49<51:35, 4.65s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▊ | 355/1019 [28:01<57:53, 5.23s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▊ | 355/1019 [28:01<57:53, 5.23s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4082, 'learning_rate': 0.00021119999999999996, 'epoch': 0.35} + 35%|███████████████████████████▊ | 355/1019 [28:01<57:53, 5.23s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|███████████████████████████▉ | 356/1019 [28:07<59:17, 5.37s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed
+ 35%|███████████████████████████▉ | 356/1019 [28:07<59:17, 5.37s/it]
+ 35%|███████████████████████████▎ | 357/1019 [28:12<1:00:54, 5.52s/it]
+ 35%|███████████████████████████▍ | 358/1019 [28:18<1:01:29, 5.58s/it]
+ 35%|███████████████████████████▍ | 359/1019 [28:24<1:01:44, 5.61s/it]
+ 35%|███████████████████████████▌ | 360/1019 [28:30<1:01:48, 5.63s/it]
+ 35%|███████████████████████████▋ | 361/1019 [28:35<1:01:34, 5.62s/it]
+ 36%|███████████████████████████▋ | 362/1019 [28:41<1:01:16, 5.60s/it]
+ 36%|███████████████████████████▊ | 363/1019 [28:46<1:00:40, 5.55s/it]
+ 36%|███████████████████████████▊ | 364/1019 [28:52<1:00:10, 5.51s/it]
+{'loss': 4.7992, 'learning_rate': 0.00021659999999999998, 'epoch': 0.36}
+ 36%|███████████████████████████▊ | 364/1019 [28:52<1:00:10, 5.51s/it]
+ 36%|████████████████████████████▋ | 365/1019 [28:57<59:51, 5.49s/it]
+{'loss': 4.7664, 'learning_rate': 0.00021779999999999998, 'epoch': 0.36}
+ 36%|████████████████████████████▋ | 365/1019 [28:57<59:51, 5.49s/it]
+ 36%|████████████████████████████▊ | 367/1019 [29:08<59:27, 5.47s/it]
+{'loss': 4.2414, 'learning_rate': 0.00021839999999999997, 'epoch': 0.36}
+ 36%|████████████████████████████▊ | 367/1019 [29:08<59:27, 5.47s/it]
+ 36%|████████████████████████████▉ | 368/1019 [29:13<58:51, 5.42s/it]
+ 36%|████████████████████████████▉ | 369/1019 [29:19<58:39, 5.42s/it]
+{'loss': 4.6271, 'learning_rate': 0.00021959999999999997, 'epoch': 0.36}
+ 36%|████████████████████████████▉ | 369/1019 [29:19<58:39, 5.42s/it]
+ 36%|█████████████████████████████ | 370/1019 [29:24<58:21, 5.39s/it]
+{'loss': 4.8406, 'learning_rate': 0.00022079999999999997, 'epoch': 0.36}
+ 36%|█████████████████████████████ | 370/1019 [29:24<58:21, 5.39s/it]
+ 37%|█████████████████████████████▏ | 372/1019 [29:34<57:05, 5.29s/it]
+{'loss': 4.4773, 'learning_rate': 0.0002214, 'epoch': 0.36}
+ 37%|█████████████████████████████▏ | 372/1019 [29:34<57:05, 5.29s/it]
+ 37%|█████████████████████████████▎ | 373/1019 [29:40<56:33, 5.25s/it]
+ 37%|█████████████████████████████▎ | 374/1019 [29:45<56:13, 5.23s/it]
+{'loss': 4.61, 'learning_rate': 0.0002226, 'epoch': 0.37}
+ 37%|█████████████████████████████▎ | 374/1019 [29:45<56:13, 5.23s/it]
+ 37%|█████████████████████████████▍ | 375/1019 [29:50<55:46, 5.20s/it]
+ 37%|█████████████████████████████▌ | 376/1019 [29:55<55:10, 5.15s/it]
+{'loss': 4.6115, 'learning_rate': 0.0002238, 'epoch': 0.37}
+{'loss': 4.382, 'learning_rate': 0.00022439999999999998, 'epoch': 0.37}
+ 37%|█████████████████████████████▋ | 378/1019 [30:05<54:10, 5.07s/it]
+{'loss': 4.3134, 'learning_rate': 0.000225, 'epoch': 0.37}
+ 37%|█████████████████████████████▊ | 379/1019 [30:10<53:35, 5.02s/it]
+{'loss': 4.7434, 'learning_rate': 0.00022559999999999998, 'epoch': 0.37}
+ 37%|█████████████████████████████▊ | 379/1019 [30:10<53:35, 5.02s/it]
+ 37%|█████████████████████████████▊ | 380/1019 [30:15<53:14, 5.00s/it]
+ 37%|█████████████████████████████▉ | 381/1019 [30:20<52:38, 4.95s/it]
+{'loss': 4.4288, 'learning_rate': 0.00022679999999999998, 'epoch': 0.37}
+ 37%|█████████████████████████████▉ | 382/1019 [30:24<51:57, 4.89s/it]
+{'loss': 4.4034, 'learning_rate': 0.00022739999999999997, 'epoch': 0.37}
+ 38%|██████████████████████████████ | 383/1019 [30:29<51:11, 4.83s/it]
+{'loss': 4.715, 'learning_rate': 0.00022799999999999999, 'epoch': 0.38}
+ 38%|██████████████████████████████▏ | 384/1019 [30:34<50:20, 4.76s/it]
+{'loss': 4.7009, 'learning_rate': 0.00022859999999999997, 'epoch': 0.38}
+ 38%|██████████████████████████████▏ | 385/1019 [30:38<49:31, 4.69s/it]
+{'loss': 4.5989, 'learning_rate': 0.0002292, 'epoch': 0.38}
+ 38%|██████████████████████████████▎ | 386/1019 [30:43<48:33, 4.60s/it]
+{'loss': 4.5708, 'learning_rate': 0.00022979999999999997, 'epoch': 0.38}
+ 38%|██████████████████████████████▍ | 387/1019 [30:47<47:31, 4.51s/it]
+{'loss': 4.3128, 'learning_rate': 0.0002304, 'epoch': 0.38}
+ 38%|██████████████████████████████▍ | 388/1019 [30:51<46:36, 4.43s/it]
+{'loss': 4.4654, 'learning_rate': 0.00023099999999999998, 'epoch': 0.38}
+ 38%|██████████████████████████████▌ | 389/1019 [30:55<45:29, 4.33s/it]
+{'loss': 4.388, 'learning_rate': 0.0002316, 'epoch': 0.38}
+ 38%|██████████████████████████████▌ | 389/1019 [30:55<45:29, 4.33s/it]
+ 38%|██████████████████████████████▌ | 390/1019 [30:59<44:02, 4.20s/it]
+ 38%|██████████████████████████████▋ | 391/1019 [31:03<42:35, 4.07s/it]
+ 38%|██████████████████████████████▊ | 392/1019 [31:06<40:45, 3.90s/it]
+[WARNING|modeling_utils.py:388] 2022-03-02 22:15:48,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▉ | 394/1019 [31:13<36:18, 3.49s/it]
+[WARNING|modeling_utils.py:388] 2022-03-02 22:15:54,476 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 22:15:56,804 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 22:15:58,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 22:16:00,735 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 22:16:02,360 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.9028, 'learning_rate': 0.0002376, 'epoch': 0.39}
+[WARNING|modeling_utils.py:388] 2022-03-02 22:16:04,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▍ | 401/1019 [31:31<34:34, 3.36s/it]
+ 39%|███████████████████████████████▌ | 402/1019 [31:37<42:37, 4.14s/it]
+ 40%|███████████████████████████████▋ | 403/1019 [31:43<47:31, 4.63s/it]
+ 40%|███████████████████████████████▋ | 404/1019 [31:49<50:38, 4.94s/it]
+ 40%|███████████████████████████████▊ | 405/1019 [31:54<53:01, 5.18s/it]
+ 40%|███████████████████████████████▊ | 406/1019 [32:00<54:39, 5.35s/it]
+ 40%|███████████████████████████████▉ | 407/1019 [32:06<55:42, 5.46s/it]
+{'loss': 5.1687, 'learning_rate': 0.000243, 'epoch': 0.4}
+ 40%|███████████████████████████████▉ | 407/1019 [32:06<55:42, 5.46s/it]
+ 40%|████████████████████████████████ | 409/1019 [32:17<56:32, 5.56s/it]
+ 40%|████████████████████████████████▏ | 410/1019 [32:23<56:36, 5.58s/it]
+{'loss': 4.6197, 'learning_rate': 0.0002448, 'epoch': 0.4}
+ 40%|████████████████████████████████▏ | 410/1019 [32:23<56:36, 5.58s/it]
+ 40%|████████████████████████████████▎ | 412/1019 [32:34<55:51, 5.52s/it]
+{'loss': 4.6007, 'learning_rate': 0.00024539999999999995, 'epoch': 0.4}
+ 40%|████████████████████████████████▎ | 412/1019 [32:34<55:51, 5.52s/it]
+ 41%|████████████████████████████████▍ | 413/1019 [32:39<55:45, 5.52s/it]
+ 41%|████████████████████████████████▌ | 414/1019 [32:45<55:33, 5.51s/it]
+{'loss': 4.8696, 'learning_rate': 0.0002466, 'epoch': 0.41}
+ 41%|████████████████████████████████▌ | 415/1019 [32:50<54:54, 5.46s/it]
+{'loss': 4.398, 'learning_rate': 0.0002472, 'epoch': 0.41}
+ 41%|████████████████████████████████▌ | 415/1019 [32:50<54:54, 5.46s/it]
+ 41%|████████████████████████████████▋ | 416/1019 [32:55<54:34, 5.43s/it]
+ 41%|████████████████████████████████▋ | 417/1019 [33:01<54:05, 5.39s/it]
+{'loss': 4.412, 'learning_rate': 0.00024839999999999997, 'epoch': 0.41}
+ 41%|████████████████████████████████▊ | 418/1019 [33:06<53:58, 5.39s/it]
+{'loss': 4.6433, 'learning_rate': 0.000249, 'epoch': 0.41}
+ 41%|████████████████████████████████▊ | 418/1019 [33:06<53:58, 5.39s/it]
+{'loss': 4.6482, 'learning_rate': 0.00024959999999999994, 'epoch': 0.41}
+ 41%|████████████████████████████████▊ | 418/1019 [33:06<53:58, 5.39s/it]
+ 41%|████████████████████████████████▉ | 420/1019 [33:17<53:03, 5.31s/it]
+{'loss': 4.6418, 'learning_rate': 0.00025019999999999996, 'epoch': 0.41}
+ 41%|████████████████████████████████▉ | 420/1019 [33:17<53:03, 5.31s/it]
+ 41%|█████████████████████████████████ | 421/1019 [33:22<52:50, 5.30s/it]
+ 41%|█████████████████████████████████▏ | 422/1019 [33:27<52:35, 5.29s/it]
+{'loss': 4.6701, 'learning_rate': 0.0002514, 'epoch': 0.41}
+ 41%|█████████████████████████████████▏ | 422/1019 [33:27<52:35, 5.29s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed + 42%|█████████████████████████████████▏ | 423/1019 [33:32<51:54, 5.23s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▏ | 423/1019 [33:32<51:54, 5.23s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▎ | 424/1019 [33:37<51:39, 5.21s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▎ | 424/1019 [33:37<51:39, 5.21s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 5.0263, 'learning_rate': 0.00025259999999999996, 'epoch': 0.42} + 42%|█████████████████████████████████▎ | 424/1019 [33:37<51:39, 5.21s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▎ | 425/1019 [33:42<51:11, 5.17s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▎ | 425/1019 [33:42<51:11, 5.17s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▍ | 426/1019 [33:48<50:38, 5.12s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
42%|█████████████████████████████████▍ | 426/1019 [33:48<50:38, 5.12s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8681, 'learning_rate': 0.0002538, 'epoch': 0.42} + g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2982, 'learning_rate': 0.00025439999999999995, 'epoch': 0.42} + g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▌ | 428/1019 [33:57<49:40, 5.04s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▌ | 428/1019 [33:57<49:40, 5.04s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5738, 'learning_rate': 0.00025499999999999996, 'epoch': 0.42} + 42%|█████████████████████████████████▋ | 429/1019 [34:02<49:07, 5.00s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▋ | 429/1019 [34:02<49:07, 5.00s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.549, 'learning_rate': 0.0002556, 'epoch': 0.42} + 42%|█████████████████████████████████▋ | 429/1019 [34:02<49:07, 
5.00s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 430/1019 [34:07<48:48, 4.97s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 430/1019 [34:07<48:48, 4.97s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 431/1019 [34:12<48:13, 4.92s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 431/1019 [34:12<48:13, 4.92s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.426, 'learning_rate': 0.00025679999999999995, 'epoch': 0.42} + 42%|█████████████████████████████████▉ | 432/1019 [34:17<47:30, 4.86s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▉ | 432/1019 [34:17<47:30, 4.86s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5729, 'learning_rate': 0.00025739999999999997, 'epoch': 0.42} + 42%|█████████████████████████████████▉ | 432/1019 [34:17<47:30, 4.86s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
42%|█████████████████████████████████▉ | 433/1019 [34:21<46:55, 4.80s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▉ | 433/1019 [34:21<46:55, 4.80s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████ | 434/1019 [34:26<46:06, 4.73s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████ | 434/1019 [34:26<46:06, 4.73s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5049, 'learning_rate': 0.0002586, 'epoch': 0.43} + 43%|██████████████████████████████████▏ | 435/1019 [34:30<45:19, 4.66s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▏ | 435/1019 [34:30<45:19, 4.66s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4762, 'learning_rate': 0.00025919999999999996, 'epoch': 0.43} + 43%|██████████████████████████████████▏ | 436/1019 [34:35<44:22, 4.57s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▏ | 436/1019 [34:35<44:22, 4.57s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed +{'loss': 4.8848, 'learning_rate': 0.00025979999999999997, 'epoch': 0.43} + 43%|██████████████████████████████████▎ | 437/1019 [34:39<43:38, 4.50s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▎ | 437/1019 [34:39<43:38, 4.50s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4942, 'learning_rate': 0.0002604, 'epoch': 0.43} + 43%|██████████████████████████████████▎ | 437/1019 [34:39<43:38, 4.50s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 438/1019 [34:43<42:43, 4.41s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 438/1019 [34:43<42:43, 4.41s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 438/1019 [34:43<42:43, 4.41s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 439/1019 [34:48<41:49, 4.33s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 439/1019 [34:48<41:49, 4.33s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 439/1019 [34:48<41:49, 4.33s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▌ | 440/1019 [34:51<40:31, 4.20s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▌ | 440/1019 [34:51<40:31, 4.20s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▌ | 440/1019 [34:51<40:31, 4.20s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▌ | 441/1019 [34:55<39:06, 4.06s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▌ | 441/1019 [34:55<39:06, 4.06s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▌ | 441/1019 [34:55<39:06, 4.06s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▋ | 442/1019 [34:59<37:47, 3.93s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:41,403 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:41,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4947, 'learning_rate': 0.00026399999999999997, 'epoch': 0.43} +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:41,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|██████████████████████████████████▊ | 444/1019 [35:05<34:07, 3.56s/it]g-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:47,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:47,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:49,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:49,877 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 5.3024, 'learning_rate': 0.00026579999999999996, 'epoch': 0.44} +[WARNING|modeling_utils.py:388] 2022-03-02 22:19:49,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:11:59,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████ | 447/1019 [35:13<27:06, 2.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:19:53,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████ | 447/1019 [35:13<27:06, 2.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:19:53,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▎ | 449/1019 [35:16<21:48, 2.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:19:54,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▎ | 449/1019 [35:16<21:48, 2.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:19:54,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▎ | 450/1019 [35:18<20:48, 2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:19:56,506 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▎ | 450/1019 [35:18<20:48, 2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:19:56,506 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
44%|███████████████████████████████████▎ | 450/1019 [35:18<20:48, 2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▎ | 450/1019 [35:18<20:48, 2.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 451/1019 [35:25<32:41, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 451/1019 [35:25<32:41, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 451/1019 [35:25<32:41, 3.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 452/1019 [35:31<39:53, 4.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 452/1019 [35:31<39:53, 4.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▌ | 453/1019 [35:37<44:26, 4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▌ | 453/1019 [35:37<44:26, 
4.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5276, 'learning_rate': 0.00027, 'epoch': 0.44} + 45%|███████████████████████████████████▋ | 454/1019 [35:43<47:43, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▋ | 454/1019 [35:43<47:43, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8693, 'learning_rate': 0.00027059999999999996, 'epoch': 0.45} + 45%|███████████████████████████████████▋ | 454/1019 [35:43<47:43, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▋ | 455/1019 [35:48<49:33, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▋ | 455/1019 [35:48<49:33, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▋ | 455/1019 [35:48<49:33, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▊ | 456/1019 [35:54<50:50, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
45%|███████████████████████████████████▊ | 456/1019 [35:54<50:50, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▊ | 456/1019 [35:54<50:50, 5.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▉ | 457/1019 [36:00<51:57, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▉ | 457/1019 [36:00<51:57, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▉ | 457/1019 [36:00<51:57, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|███████████████████████████████████▉ | 457/1019 [36:00<51:57, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5308, 'learning_rate': 0.00027299999999999997, 'epoch': 0.45} + 45%|███████████████████████████████████▉ | 457/1019 [36:00<51:57, 5.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████ | 459/1019 [36:11<52:28, 5.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
45%|████████████████████████████████████ | 459/1019 [36:11<52:28, 5.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6579, 'learning_rate': 0.0002736, 'epoch': 0.45} + 45%|████████████████████████████████████ | 459/1019 [36:11<52:28, 5.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████ | 460/1019 [36:17<52:14, 5.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████ | 460/1019 [36:17<52:14, 5.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████ | 460/1019 [36:17<52:14, 5.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▏ | 461/1019 [36:22<52:00, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▏ | 461/1019 [36:22<52:00, 5.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 462/1019 [36:28<51:45, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
45%|████████████████████████████████████▎ | 462/1019 [36:28<51:45, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9032, 'learning_rate': 0.00027539999999999997, 'epoch': 0.45} + 45%|████████████████████████████████████▎ | 463/1019 [36:34<51:29, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 463/1019 [36:34<51:29, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9936, 'learning_rate': 0.000276, 'epoch': 0.45} + 45%|████████████████████████████████████▎ | 463/1019 [36:34<51:29, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▍ | 464/1019 [36:39<51:16, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▍ | 464/1019 [36:39<51:16, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▍ | 464/1019 [36:39<51:16, 5.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▌ | 465/1019 [36:44<50:45, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 46%|████████████████████████████████████▌ | 465/1019 [36:44<50:45, 5.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▌ | 466/1019 [36:50<50:24, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▌ | 466/1019 [36:50<50:24, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9673, 'learning_rate': 0.0002778, 'epoch': 0.46} + 46%|████████████████████████████████████▌ | 466/1019 [36:50<50:24, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▋ | 467/1019 [36:55<50:16, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▋ | 467/1019 [36:55<50:16, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▋ | 467/1019 [36:55<50:16, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▋ | 467/1019 [36:55<50:16, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed +{'loss': 4.5706, 'learning_rate': 0.000279, 'epoch': 0.46} + 46%|████████████████████████████████████▋ | 467/1019 [36:55<50:16, 5.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▊ | 469/1019 [37:06<49:37, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▊ | 469/1019 [37:06<49:37, 5.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4252, 'learning_rate': 0.00027959999999999997, 'epoch': 0.46} + 46%|████████████████████████████████████▉ | 470/1019 [37:11<49:19, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 470/1019 [37:11<49:19, 5.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.623, 'learning_rate': 0.0002802, 'epoch': 0.46} + 46%|████████████████████████████████████▉ | 471/1019 [37:17<48:58, 5.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 471/1019 [37:17<48:58, 5.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5258, 'learning_rate': 0.0002808, 'epoch': 0.46} + 46%|█████████████████████████████████████ | 472/1019 
[37:22<48:19, 5.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████ | 472/1019 [37:22<48:19, 5.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 5.0276, 'learning_rate': 0.00028139999999999996, 'epoch': 0.46} + 46%|█████████████████████████████████████▏ | 473/1019 [37:27<47:56, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▏ | 473/1019 [37:27<47:56, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6158, 'learning_rate': 0.00028199999999999997, 'epoch': 0.46} + 46%|█████████████████████████████████████▏ | 473/1019 [37:27<47:56, 5.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▏ | 474/1019 [37:32<48:16, 5.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▏ | 474/1019 [37:32<48:16, 5.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▎ | 475/1019 [37:38<47:58, 5.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 47%|█████████████████████████████████████▎ | 475/1019 [37:38<47:58, 5.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.499, 'learning_rate': 0.00028319999999999994, 'epoch': 0.47} + 47%|█████████████████████████████████████▎ | 476/1019 [37:43<47:24, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▎ | 476/1019 [37:43<47:24, 5.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9739, 'learning_rate': 0.00028379999999999996, 'epoch': 0.47} + 47%|█████████████████████████████████████▍ | 477/1019 [37:48<46:44, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▍ | 477/1019 [37:48<46:44, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4816, 'learning_rate': 0.0002844, 'epoch': 0.47} + 47%|█████████████████████████████████████▍ | 477/1019 [37:48<46:44, 5.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▌ | 478/1019 [37:53<46:02, 5.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▌ | 478/1019 [37:53<46:02, 
5.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▌ | 479/1019 [37:58<45:33, 5.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▌ | 479/1019 [37:58<45:33, 5.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6396, 'learning_rate': 0.00028559999999999995, 'epoch': 0.47} + 47%|█████████████████████████████████████▋ | 480/1019 [38:03<45:09, 5.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▋ | 480/1019 [38:03<45:09, 5.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5892, 'learning_rate': 0.00028619999999999996, 'epoch': 0.47} + 47%|█████████████████████████████████████▊ | 481/1019 [38:08<44:36, 4.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▊ | 481/1019 [38:08<44:36, 4.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7315, 'learning_rate': 0.0002868, 'epoch': 0.47} + 47%|█████████████████████████████████████▊ | 482/1019 [38:12<44:06, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed + 47%|█████████████████████████████████████▊ | 482/1019 [38:12<44:06, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6936, 'learning_rate': 0.00028739999999999994, 'epoch': 0.47} + 47%|█████████████████████████████████████▊ | 482/1019 [38:12<44:06, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▊ | 482/1019 [38:12<44:06, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4115, 'learning_rate': 0.00028799999999999995, 'epoch': 0.47} + 47%|█████████████████████████████████████▊ | 482/1019 [38:12<44:06, 4.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▉ | 484/1019 [38:22<43:35, 4.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 47%|█████████████████████████████████████▉ | 484/1019 [38:22<43:35, 4.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8432, 'learning_rate': 0.00028859999999999997, 'epoch': 0.47} + 48%|██████████████████████████████████████ | 485/1019 [38:27<42:36, 4.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████ | 
485/1019 [38:27<42:36, 4.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2421, 'learning_rate': 0.0002892, 'epoch': 0.48} + 48%|██████████████████████████████████████ | 485/1019 [38:27<42:36, 4.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▏ | 486/1019 [38:31<41:58, 4.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▏ | 486/1019 [38:31<41:58, 4.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▏ | 486/1019 [38:31<41:58, 4.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▏ | 487/1019 [38:36<40:59, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▏ | 487/1019 [38:36<40:59, 4.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▎ | 488/1019 [38:40<40:07, 4.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▎ | 488/1019 
[38:40<40:07, 4.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5901, 'learning_rate': 0.00029099999999999997, 'epoch': 0.48} + 48%|██████████████████████████████████████▍ | 489/1019 [38:44<39:10, 4.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▍ | 489/1019 [38:44<39:10, 4.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7235, 'learning_rate': 0.0002916, 'epoch': 0.48} + 48%|██████████████████████████████████████▍ | 490/1019 [38:48<38:12, 4.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▍ | 490/1019 [38:48<38:12, 4.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4504, 'learning_rate': 0.00029219999999999995, 'epoch': 0.48} + 48%|██████████████████████████████████████▌ | 491/1019 [38:52<36:47, 4.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 491/1019 [38:52<36:47, 4.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8153, 'learning_rate': 0.00029279999999999996, 'epoch': 0.48} + 48%|██████████████████████████████████████▌ | 491/1019 [38:52<36:47, 
4.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▋ | 492/1019 [38:56<35:12, 4.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▋ | 492/1019 [38:56<35:12, 4.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▋ | 492/1019 [38:56<35:12, 4.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:20:00,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▋ | 493/1019 [38:59<33:16, 3.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:39,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▋ | 493/1019 [38:59<33:16, 3.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:39,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▊ | 494/1019 [39:02<31:23, 3.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:39,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▊ | 494/1019 [39:02<31:23, 3.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:39,805 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▊ | 494/1019 [39:02<31:23, 3.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:39,805 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 49%|██████████████████████████████████████▊ | 495/1019 [39:05<29:26, 3.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:45,562 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|██████████████████████████████████████▊ | 495/1019 [39:05<29:26, 3.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:45,562 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|██████████████████████████████████████▉ | 496/1019 [39:07<27:22, 3.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:47,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|██████████████████████████████████████▉ | 496/1019 [39:07<27:22, 3.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:47,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████ | 497/1019 [39:10<25:09, 2.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:50,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████ | 497/1019 [39:10<25:09, 2.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:50,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████ | 498/1019 [39:12<22:46, 2.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:52,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████ | 498/1019 [39:12<22:46, 2.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:52,037 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed + 49%|███████████████████████████████████████▏ | 499/1019 [39:14<20:26, 2.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:53,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▏ | 499/1019 [39:14<20:26, 2.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:23:53,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[INFO|trainer.py:2366] 2022-03-02 22:23:54,903 >> Num examples = 2642 | 500/1019 [39:16<19:15, 2.23s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2366] 2022-03-02 22:23:54,903 >> Num examples = 2642 | 500/1019 [39:16<19:15, 2.23s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +[INFO|trainer.py:2366] 2022-03-02 22:23:54,903 >> Num examples = 2642 | 500/1019 [39:16<19:15, 2.23s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 2%|█▎ | 3/189 [00:06<06:55, 2.24s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. 
If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 2%|█▊ | 4/189 [00:09<07:47, 2.53s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 3%|██▏ | 5/189 [00:12<08:59, 2.93s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 3%|██▋ | 6/189 [00:17<10:11, 3.34s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 4%|███ | 7/189 [00:20<10:02, 3.31s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 4%|███▌ | 8/189 [00:23<09:53, 3.28s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. 
+ 5%|███▉ | 9/189 [00:28<11:13, 3.74s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 5%|████▎ | 10/189 [00:32<11:29, 3.85s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 6%|████▊ | 11/189 [00:35<10:56, 3.69s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. + 6%|█████▏ | 12/189 [00:39<10:54, 3.70s/it][INFO|trainer.py:560] 2022-03-02 22:23:54,900 >> The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message. +RuntimeError: CUDA out of memory. Tried to allocate 1.69 GiB (GPU 0; 15.78 GiB total capacity; 9.19 GiB already allocated; 1.65 GiB free; 12.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF +RuntimeError: CUDA out of memory.
Tried to allocate 1.69 GiB (GPU 0; 15.78 GiB total capacity; 9.19 GiB already allocated; 1.65 GiB free; 12.44 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF \ No newline at end of file