diff --git "a/wandb/run-20220301_160718-1tlgvk9e/files/output.log" "b/wandb/run-20220301_160718-1tlgvk9e/files/output.log" new file mode 100644--- /dev/null +++ "b/wandb/run-20220301_160718-1tlgvk9e/files/output.log" @@ -0,0 +1,1697 @@ + + + 0%| | 0/254 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:26,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:30,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:33,200 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:36,326 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:39,413 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:42,476 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7957, 'learning_rate': 6.000000000000001e-08, 'epoch': 0.0} +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:45,517 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 0%|▎ | 1/254 [00:25<1:48:54, 25.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:07:48,959 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:52,039 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:55,094 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:07:58,137 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:01,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:04,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:07,067 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8054, 'learning_rate': 1.2000000000000002e-07, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:10,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▋ | 2/254 [00:50<1:44:51, 24.96s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:08:13,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:16,060 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:19,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:21,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:24,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:27,880 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-03-01 16:08:30,838 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9484, 'learning_rate': 1.8e-07, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:33,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▉ | 3/254 [01:13<1:41:59, 24.38s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:08:36,814 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:39,678 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:42,605 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:45,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:48,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:51,345 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:54,270 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:08:57,195 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▎ | 4/254 [01:37<1:40:02, 24.01s/it] + + 2%|█▎ | 4/254 [01:37<1:40:02, 24.01s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:09:00,263 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 
2022-03-01 16:09:03,156 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:06,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:08,945 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:11,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:14,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:17,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:20,294 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▌ | 5/254 [02:00<1:38:16, 23.68s/it] + + 2%|█▌ | 5/254 [02:00<1:38:16, 23.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:09:23,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:26,081 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:28,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:31,784 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:34,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-03-01 16:09:37,555 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:40,506 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9159, 'learning_rate': 3.0000000000000004e-07, 'epoch': 0.02} +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:43,369 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 2%|█▉ | 6/254 [02:23<1:36:57, 23.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:09:46,271 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:49,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:52,042 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:54,850 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:09:57,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:00,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:03,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:06,245 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▏ | 7/254 [02:46<1:35:51, 23.29s/it] + + 3%|██▏ | 7/254 [02:46<1:35:51, 
23.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:10:09,247 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:12,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:15,054 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:17,901 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:20,707 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:23,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:26,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9834, 'learning_rate': 4.2e-07, 'epoch': 0.03} +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:29,195 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 3%|██▌ | 8/254 [03:09<1:35:01, 23.18s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:10:32,136 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:34,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:37,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:40,550 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:43,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:46,277 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:49,108 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.897, 'learning_rate': 4.800000000000001e-07, 'epoch': 0.04} +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:51,900 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 4%|██▊ | 9/254 [03:32<1:34:02, 23.03s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:10:54,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:10:57,664 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:00,420 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:03,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:06,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:08,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:11,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7564, 'learning_rate': 5.4e-07, 'epoch': 0.04} 
+[WARNING|modeling_utils.py:388] 2022-03-01 16:11:14,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 4%|███▏ | 10/254 [03:54<1:33:05, 22.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:11:17,385 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:20,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:22,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:25,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:28,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:31,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:33,923 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:36,681 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▍ | 11/254 [04:16<1:31:51, 22.68s/it] + + 4%|███▍ | 11/254 [04:16<1:31:51, 22.68s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:11:39,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:42,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:45,062 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:47,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:50,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:53,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:56,111 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:11:58,865 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 5%|███▊ | 12/254 [04:38<1:30:51, 22.53s/it] + + 5%|███▊ | 12/254 [04:38<1:30:51, 22.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:12:01,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:04,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:07,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:10,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:13,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:15,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:18,684 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:21,433 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 5%|████ | 13/254 [05:01<1:30:32, 22.54s/it] + 5%|████ | 13/254 [05:01<1:30:32, 22.54s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 5%|████ | 13/254 [05:01<1:30:32, 22.54s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:29,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:29,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:35,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:35,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:40,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▍ | 14/254 [05:23<1:29:20, 22.34s/it]g-point operations will not be computed-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▍ | 14/254 [05:23<1:29:20, 22.34s/it]g-point operations will not be computed-01 16:12:24,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▍ | 14/254 [05:23<1:29:20, 22.34s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▍ | 14/254 [05:23<1:29:20, 22.34s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:51,516 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:51,516 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:56,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:12:56,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:02,294 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:02,294 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▋ | 15/254 [05:45<1:28:10, 22.14s/it]g-point operations will not be computed-01 16:12:46,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▋ | 15/254 [05:45<1:28:10, 22.14s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▋ | 15/254 [05:45<1:28:10, 22.14s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:18,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:07,833 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:18,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:23,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:23,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|█████ | 16/254 [06:06<1:27:13, 21.99s/it]g-point operations will not be computed-01 16:13:07,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|█████ | 16/254 [06:06<1:27:13, 21.99s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|█████ | 16/254 [06:06<1:27:13, 21.99s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:34,784 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:34,784 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:40,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:40,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:45,622 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▎ | 17/254 [06:28<1:26:28, 21.89s/it]g-point operations will not be computed-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▎ | 17/254 [06:28<1:26:28, 21.89s/it]g-point operations will not be computed-01 16:13:29,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▎ | 17/254 [06:28<1:26:28, 21.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▎ | 17/254 [06:28<1:26:28, 21.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:56,332 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:13:56,332 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:01,561 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:01,561 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:06,900 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▋ | 18/254 [06:49<1:25:20, 21.70s/it]g-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▋ | 18/254 [06:49<1:25:20, 21.70s/it]g-point operations will not be computed-01 16:13:51,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▋ | 18/254 [06:49<1:25:20, 21.70s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▋ | 18/254 [06:49<1:25:20, 21.70s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:17,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:17,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:22,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:22,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:28,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:28,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:28,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+ 7%|█████▉ | 19/254 [07:10<1:24:25, 21.55s/it]g-point operations will not be computed-01 16:14:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▉ | 19/254 [07:10<1:24:25, 21.55s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▉ | 19/254 [07:10<1:24:25, 21.55s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:38,777 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:38,777 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:44,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:44,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:49,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:14:33,511 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 8%|██████▎ | 20/254 [07:32<1:23:36, 21.44s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:14:54,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:14:59,783 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:05,036 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:10,282 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 21/254 [07:52<1:22:38, 21.28s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:15:15,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:20,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:25,912 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:31,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 22/254 [08:13<1:21:41, 21.13s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:15:36,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:41,483 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:46,595 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:15:51,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▏ | 23/254 [08:34<1:20:49, 20.99s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:15:56,957 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:16:02,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:16:07,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:16:12,309 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:54<1:20:00, 20.87s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:17,481 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:16:22,569 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:16:27,595 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:16:32,642 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:15<1:19:34, 20.85s/it][WARNING|modeling_utils.py:388] 2022-03-01
16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:35<1:18:26, 20.64s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed + 11%|████████▌ | 27/254 [09:55<1:17:18, 20.43s/it][WARNING|modeling_utils.py:388]
2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:15<1:16:25, 20.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7121, 'learning_rate': 1.62e-06, 'epoch': 0.11} + 11%|████████▊ | 28/254 [10:15<1:16:25, 20.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:35<1:15:31, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5791, 'learning_rate': 1.68e-06, 'epoch': 0.11} + 11%|█████████▏ | 29/254 [10:35<1:15:31, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed +{'loss': 4.727, 'learning_rate': 1.74e-06, 'epoch': 0.12} + 11%|█████████▏ | 29/254 [10:35<1:15:31, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed + 12%|█████████▊ | 31/254 [11:14<1:13:36,
19.80s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:33<1:12:17, 19.54s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6644, 'learning_rate': 1.86e-06, 'epoch': 0.13} + 13%|██████████ | 32/254 [11:33<1:12:17,
19.54s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:52<1:11:04, 19.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6596, 'learning_rate': 1.9200000000000003e-06, 'epoch': 0.13} + 13%|██████████▍ | 33/254 [11:52<1:11:04,
19.30s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 34/254 [12:10<1:09:42, 19.01s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6192, 'learning_rate': 1.98e-06, 'epoch': 0.13} +[WARNING|modeling_utils.py:388] 2022-03-01 16:19:46,428 >> Could not estimate the number
of tokens of the input, floating-point operations will not be computed +{'loss': 4.6084, 'learning_rate': 2.0400000000000004e-06, 'epoch': 0.14} +[WARNING|modeling_utils.py:388] 2022-03-01 16:19:46,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▎ | 36/254 [12:46<1:07:08, 18.48s/it] +{'loss': 4.6157, 'learning_rate': 2.1000000000000002e-06, 'epoch': 0.14} +[WARNING|modeling_utils.py:388] 2022-03-01 16:20:19,644 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▋ | 37/254 [13:04<1:05:39, 18.15s/it] +{'loss': 4.7017, 'learning_rate': 2.16e-06, 'epoch': 0.15} + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6331, 'learning_rate': 2.22e-06, 'epoch': 0.15} + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|███████████▉ | 38/254 [13:21<1:04:42, 17.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7079, 'learning_rate': 2.28e-06, 'epoch': 0.15} + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 15%|████████████▎ | 39/254 [13:37<1:02:44, 17.51s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▌ | 40/254 [13:53<1:00:30, 16.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▌ | 40/254 [13:53<1:00:30, 16.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5542, 'learning_rate': 2.34e-06, 'epoch': 0.16} + 16%|████████████▌ | 40/254 [13:53<1:00:30, 16.97s/it]g-point operations 
will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▌ | 40/254 [13:53<1:00:30, 16.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|████████████▌ | 40/254 [13:53<1:00:30, 16.97s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:24,759 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:24,759 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:24,759 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|█████████████▏ | 41/254 [14:08<57:54, 16.31s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|█████████████▏ | 41/254 [14:08<57:54, 16.31s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 16%|█████████████▏ | 41/254 [14:08<57:54, 16.31s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed + 16%|█████████████▏ | 41/254 [14:08<57:54, 16.31s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:37,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:37,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:37,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:37,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▌ | 42/254 [14:22<55:02, 15.58s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▌ | 42/254 [14:22<55:02, 15.58s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:47,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:47,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:47,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:53,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:21:53,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▉ | 43/254 [14:35<51:45, 14.72s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▉ | 43/254 [14:35<51:45, 14.72s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▉ | 43/254 [14:35<51:45, 14.72s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:00,904 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:00,904 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:05,122 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:05,122 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|██████████████▏ | 44/254 [14:46<48:13, 13.78s/it]g-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:09,259 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:09,259 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:13,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:13,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:13,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:16,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:19,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:19,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:22,945 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:25,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:25,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:16:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████▊ | 46/254 [15:06<40:44, 11.75s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:22:27,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:29,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:27,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:31,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:27,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:33,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:27,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:33,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:27,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▏ | 47/254 [15:14<36:55, 10.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:22:35,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:37,461 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-01 16:22:35,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:39,245 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:35,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 48/254 [15:21<33:12, 9.67s/it]g-point operations will not be computed-01 16:22:35,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 48/254 [15:21<33:12, 9.67s/it]g-point operations will not be computed-01 16:22:35,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▍ | 48/254 [15:21<33:12, 9.67s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:22:42,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:45,911 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:42,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:47,386 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:42,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:47,386 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:42,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 19%|███████████████▊ | 49/254 [15:28<29:33, 
8.65s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:22:48,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:22:51,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:48,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▏ | 50/254 [15:33<26:17, 7.73s/it]g-point operations will not be computed-01 16:22:48,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▏ | 50/254 [15:33<26:17, 7.73s/it]g-point operations will not be computed-01 16:22:48,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▏ | 50/254 [15:33<26:17, 7.73s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▏ | 50/254 [15:33<26:17, 7.73s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:23:03,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:23:03,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:23:09,284 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:23:09,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:23:09,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4442, 'learning_rate': 3e-06, 'epoch': 0.2} + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▍ | 51/254 [15:58<43:24, 12.83s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3581, 'learning_rate': 3.06e-06, 'epoch': 0.2} + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [16:22<54:17, 16.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3176, 'learning_rate': 3.1199999999999998e-06, 'epoch': 0.21} + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point 
operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3466, 'learning_rate': 3.18e-06, 'epoch': 0.21} + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 
[16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 21%|████████████████▋ | 53/254 [16:46<1:01:40, 18.41s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▎ | 55/254 [17:32<1:09:07, 
20.84s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3074, 'learning_rate': 3.3e-06, 'epoch': 0.22} + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▋ | 56/254 [17:55<1:11:05, 21.54s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3574, 'learning_rate': 3.36e-06, 'epoch': 0.22} + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 22%|█████████████████▉ | 57/254 [18:18<1:11:50, 21.88s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3707, 'learning_rate': 3.4200000000000003e-06, 'epoch': 0.23} + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:41<1:12:25, 22.17s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 
16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 
16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3817, 'learning_rate': 3.54e-06, 'epoch': 0.24} + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [19:03<1:12:35, 22.33s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 
22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:48<1:12:07, 22.42s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 
22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3031, 'learning_rate': 3.66e-06, 'epoch': 0.24} + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [20:11<1:11:34, 22.37s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +{'loss': 4.3996, 'learning_rate': 3.72e-06, 'epoch': 0.25} + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:34<1:11:39, 22.51s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.39, 'learning_rate': 3.7800000000000002e-06, 'epoch': 0.25} + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point 
operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3368, 'learning_rate': 3.8400000000000005e-06, 'epoch': 0.26} + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:56<1:10:53, 22.39s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3278, 'learning_rate': 3.96e-06, 'epoch': 0.26} + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:39<1:09:19, 22.13s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3772, 'learning_rate': 4.0200000000000005e-06, 'epoch': 0.27} + 27%|█████████████████████▍ | 68/254 
[22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [22:23<1:07:46, 21.86s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3784, 'learning_rate': 4.080000000000001e-06, 'epoch': 0.27} + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:44<1:07:02, 21.74s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [23:05<1:06:15, 21.61s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▎ | 71/254 [23:26<1:05:21, 21.43s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 
16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:47<1:04:29, 21.26s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [24:08<1:03:38, 21.10s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [24:08<1:03:38, 21.10s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.297, 'learning_rate': 4.32e-06, 'epoch': 0.29} + 29%|██████████████████████▉ | 73/254 [24:08<1:03:38, 21.10s/it]g-point operations will not be computed-01 16:22:57,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
29%|██████████████████████▉ | 73/254 [24:08<1:03:38, 21.10s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▎ | 74/254 [24:29<1:02:53, 20.97s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3443, 'learning_rate': 4.3799999999999996e-06, 'epoch': 0.29}
+ 30%|███████████████████████▌ | 75/254 [24:50<1:02:25, 20.92s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3627, 'learning_rate': 4.44e-06, 'epoch': 0.29}
+ 30%|███████████████████████▉ | 76/254 [25:10<1:01:29, 20.73s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.374, 'learning_rate': 4.5e-06, 'epoch': 0.3}
+ 30%|████████████████████████▎ | 77/254 [25:30<1:00:36, 20.54s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3055, 'learning_rate': 4.56e-06, 'epoch': 0.3}
+ 31%|█████████████████████████▏ | 78/254 [25:50<59:35, 20.32s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|█████████████████████████▌ | 79/254 [26:09<58:42, 20.13s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|█████████████████████████▊ | 80/254 [26:29<57:42, 19.90s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|██████████████████████████▏ | 81/254 [26:48<56:52, 19.72s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2786, 'learning_rate': 4.800000000000001e-06, 'epoch': 0.32}
+ Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2899, 'learning_rate': 4.86e-06, 'epoch': 0.32}
+ 33%|██████████████████████████▊ | 83/254 [27:26<54:53, 19.26s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.291, 'learning_rate': 4.92e-06, 'epoch': 0.33}
+ 33%|███████████████████████████ | 84/254 [27:44<53:46, 18.98s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3062, 'learning_rate': 4.980000000000001e-06, 'epoch': 0.33}
+ 33%|███████████████████████████▍ | 85/254 [28:02<52:35, 18.67s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4088, 'learning_rate': 5.04e-06, 'epoch': 0.33}
+ 34%|███████████████████████████▊ | 86/254 [28:20<51:20, 18.34s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2689, 'learning_rate': 5.1e-06, 'epoch': 0.34}
+ 34%|████████████████████████████ | 87/254 [28:37<49:55, 17.94s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▍ | 88/254 [28:54<49:01, 17.72s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3251, 'learning_rate': 5.22e-06, 'epoch': 0.35}
+ 35%|████████████████████████████▋ | 89/254 [29:10<47:19, 17.21s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4222, 'learning_rate': 5.279999999999999e-06, 'epoch': 0.35}
+ 35%|█████████████████████████████ | 90/254 [29:25<45:32, 16.66s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:36:49,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|█████████████████████████████▍ | 91/254 [29:39<43:15, 15.93s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:37:01,534 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:09,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|█████████████████████████████▋ | 92/254 [29:53<40:54, 15.15s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4989, 'learning_rate': 5.46e-06, 'epoch': 0.36}
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:19,384 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:25,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4136, 'learning_rate': 5.52e-06, 'epoch': 0.36}
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:29,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:35,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|██████████████████████████████▎ | 94/254 [30:16<35:47, 13.42s/it] Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:39,483 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 16:37:42,030 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed-01 16:37:01,534 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:37:45,752 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:01,534 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 37%|██████████████████████████████▋ | 95/254 [30:27<33:01, 12.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 37%|██████████████████████████████▋ | 95/254 [30:27<33:01, 12.46s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3593, 'learning_rate': 5.64e-06, 'epoch': 0.37} +[WARNING|modeling_utils.py:388] 2022-03-01 16:37:51,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:37:53,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:37:53,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:37:56,114 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:37:58,282 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:00,268 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:02,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:02,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:04,048 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:05,967 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:07,721 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:09,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:09,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:12,704 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:14,193 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:15,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:15,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:18,446 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:19,724 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:22,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:22,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.6817, 'learning_rate': 5.940000000000001e-06, 'epoch': 0.39} +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:29,018 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:29,018 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:35,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:35,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:41,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:38:41,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3434, 'learning_rate': 6e-06, 'epoch': 0.4} + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 
16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▏ | 101/254 [31:27<32:16, 12.65s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:51<40:32, 16.00s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.294, 'learning_rate': 6.12e-06, 'epoch': 0.4} + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 
18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [32:14<45:54, 18.24s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [32:37<49:23, 19.75s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.335, 'learning_rate': 6.2399999999999995e-06, 'epoch': 0.41} + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+ 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▍ | 105/254 [33:00<51:30, 20.74s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2597, 'learning_rate': 6.3e-06, 'epoch': 0.42} + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 
[33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [33:24<53:00, 21.49s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.279, 'learning_rate': 6.42e-06, 'epoch': 0.42} + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [34:09<53:59, 22.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2762, 'learning_rate': 6.48e-06, 'epoch': 0.43} + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed + 43%|██████████████████████████████████▊ | 109/254 [34:32<53:53, 22.30s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2953, 'learning_rate': 6.54e-06, 'epoch': 0.43} + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
43%|███████████████████████████████████ | 110/254 [34:54<53:38, 22.35s/it]
+ 44%|███████████████████████████████████▍ | 111/254 [35:16<52:59, 22.23s/it]
+ 44%|███████████████████████████████████▋ | 112/254 [35:39<52:36, 22.23s/it]
+{'loss': 4.2245, 'learning_rate': 6.660000000000001e-06, 'epoch': 0.44}
+ 44%|████████████████████████████████████ | 113/254 [36:01<52:24, 22.30s/it]
+{'loss': 4.1598, 'learning_rate': 6.72e-06, 'epoch': 0.44}
+ 45%|████████████████████████████████████▎ | 114/254 [36:23<51:35, 22.11s/it]
+{'loss': 4.256, 'learning_rate': 6.78e-06, 'epoch': 0.45}
+{'loss': 4.2567, 'learning_rate': 6.840000000000001e-06, 'epoch': 0.45}
+ 46%|████████████████████████████████████▉ | 116/254 [37:06<50:09, 21.81s/it]
+ 46%|█████████████████████████████████████▎ | 117/254 [37:27<49:26, 21.65s/it]
+ 46%|█████████████████████████████████████▋ | 118/254 [37:48<48:39, 21.47s/it]
+{'loss': 4.2549, 'learning_rate': 7.0200000000000006e-06, 'epoch': 0.46}
+ 47%|█████████████████████████████████████▉ | 119/254 [38:09<48:01, 21.34s/it]
+ 47%|██████████████████████████████████████▎ | 120/254 [38:30<47:21, 21.21s/it]
+ 48%|██████████████████████████████████████▌ | 121/254 [38:51<46:36, 21.03s/it]
+{'loss': 4.2409, 'learning_rate': 7.2e-06, 'epoch': 0.47}
+ 48%|██████████████████████████████████████▉ | 122/254 [39:11<45:52, 20.85s/it]
+{'loss': 4.3067, 'learning_rate': 7.26e-06, 'epoch': 0.48}
+ 48%|███████████████████████████████████████▏ | 123/254 [39:31<45:11, 20.70s/it]
+ 49%|███████████████████████████████████████▌ | 124/254 [39:52<44:29, 20.53s/it]
+{'loss': 4.2309, 'learning_rate': 7.3800000000000005e-06, 'epoch': 0.49}
+ 49%|███████████████████████████████████████▊ | 125/254 [40:12<44:08, 20.53s/it]
+{'loss': 4.3182, 'learning_rate': 7.44e-06, 'epoch': 0.49}
+ 50%|████████████████████████████████████████▏ | 126/254 [40:32<43:23, 20.34s/it]
+{'loss': 4.2016, 'learning_rate': 7.5e-06, 'epoch': 0.49}
+ 50%|████████████████████████████████████████▌ | 127/254 [40:52<42:39, 20.15s/it]
+{'loss': 4.2832, 'learning_rate': 7.5600000000000005e-06, 'epoch': 0.5}
+ 50%|████████████████████████████████████████▊ | 128/254 [41:11<41:59, 19.99s/it]
+ 51%|█████████████████████████████████████████▏ | 129/254 [41:31<41:15, 19.80s/it]
+{'loss': 4.2705, 'learning_rate': 7.680000000000001e-06, 'epoch': 0.51}
+ 51%|█████████████████████████████████████████▍ | 
130/254 [41:50<40:33, 19.62s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 51%|█████████████████████████████████████████▍ | 130/254 [41:50<40:33, 19.62s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 51%|█████████████████████████████████████████▍ | 130/254 [41:50<40:33, 19.62s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 51%|█████████████████████████████████████████▍ | 130/254 [41:50<40:33, 19.62s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 51%|█████████████████████████████████████████▍ | 130/254 [41:50<40:33, 19.62s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will 
not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████▊ | 131/254 [42:09<39:48, 19.42s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████████████████████████████�� | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1946, 'learning_rate': 7.860000000000001e-06, 'epoch': 0.52} + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 
19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████ | 132/254 [42:28<39:05, 19.23s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 
16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|█████████████████���████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1468, 'learning_rate': 7.98e-06, 'epoch': 0.53} + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not 
be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 52%|██████████████████████████████████████████▍ | 133/254 [42:46<38:23, 19.04s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 53%|███████████████████████████████████████████ | 135/254 [43:22<36:35, 18.45s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 53%|███████████████████████████████████████████ | 135/254 [43:22<36:35, 18.45s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3501, 'learning_rate': 8.040000000000001e-06, 'epoch': 0.53} + 53%|███████████████████████████████████████████ | 135/254 [43:22<36:35, 18.45s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 53%|███████████████████████████████████████████ | 135/254 [43:22<36:35, 18.45s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 53%|███████████████████████████████████████████ | 135/254 [43:22<36:35, 18.45s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 53%|███████████████████████████████████████████ | 135/254 [43:22<36:35, 18.45s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:50:57,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:50:57,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 
136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [43:40<35:46, 18.19s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point 
operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2522, 'learning_rate': 8.160000000000001e-06, 'epoch': 0.54} + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▋ | 137/254 [43:57<34:46, 17.83s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 
138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1099, 'learning_rate': 8.220000000000001e-06, 'epoch': 0.54} + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|████████████████████████████████████████████ | 138/254 [44:14<34:00, 17.59s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
55%|████████████████████████████████████████████▎ | 139/254 [44:30<32:48, 17.12s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▋ | 140/254 [44:45<31:22, 16.51s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:08,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 56%|████████████████████████████████████████████▉ | 141/254 [44:59<29:49, 15.84s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:22,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:22,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:22,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:22,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:31,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:31,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 56%|█████████████████████████████████████████████▎ | 142/254 [45:12<28:10, 15.09s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 56%|█████████████████████████████████████████████▎ | 142/254 [45:12<28:10, 15.09s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 56%|█████████████████████████████████████████████▎ | 142/254 [45:12<28:10, 15.09s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:39,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:39,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:39,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-03-01 16:52:45,094 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:45,094 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4225, 'learning_rate': 8.52e-06, 'epoch': 0.56} +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:49,556 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:49,556 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:49,556 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:55,131 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:55,131 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 57%|█████████████████████████████████████████████▉ | 144/254 [45:36<24:33, 13.39s/it]g-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:52:59,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:01,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:01,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:05,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:05,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:37:48,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 57%|██████████████████████████████████████████████▏ | 145/254 [45:46<22:33, 12.41s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 57%|██████████████████████████████████████████████▏ | 145/254 
[45:46<22:33, 12.41s/it][WARNING|modeling_utils.py:388] 2022-03-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:11,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:13,491 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:15,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:15,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:17,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:19,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:21,661 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:21,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:23,534 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:25,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:27,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:28,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:28,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:32,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:33,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:36,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:36,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:37,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:40,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:41,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:41,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.719, 'learning_rate': 8.939999999999999e-06, 'epoch': 0.59} +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:48,381 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:48,381 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:54,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:53:54,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:54:00,532 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 16:54:00,532 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 
16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2158, 'learning_rate': 9e-06, 'epoch': 0.59} + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [46:46<21:34, 12.57s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 
[47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2449, 'learning_rate': 9.06e-06, 'epoch': 0.6} + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
60%|████████████████████████████████████████████████▍ | 152/254 [47:09<26:55, 15.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 60%|████████████████████████████████████████████████▊ | 153/254 [47:33<30:28, 18.10s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
61%|█████████████████████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|███████████████��█████████████████████████████████ | 154/254 [47:56<32:33, 19.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2042, 'learning_rate': 9.24e-06, 'epoch': 0.61} + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▍ | 155/254 [48:19<33:51, 20.52s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2117, 'learning_rate': 9.3e-06, 'epoch': 0.61} + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|████████████████████████████████���████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations 
will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 61%|█████████████████████████████████████████████████▋ | 156/254 [48:41<34:33, 21.16s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point 
operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████ | 157/254 [49:04<34:55, 21.60s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 
21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 62%|██████████████████████████████████████████████████▍ | 158/254 [49:26<35:00, 21.88s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 
159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████████████████████▋ | 159/254 [49:49<34:53, 22.03s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2246, 'learning_rate': 
9.54e-06, 'epoch': 0.63} + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████ | 160/254 [50:11<34:39, 22.12s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [50:33<34:11, 22.06s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|██████████████████████████████████���████████████████▎ | 161/254 [50:33<34:11, 22.06s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed
+{'loss': 4.1417, 'learning_rate': 9.600000000000001e-06, 'epoch': 0.63}
+ 63%|███████████████████████████████████████████████████▎ | 161/254 [50:33<34:11, 22.06s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 64%|███████████████████████████████████████████████████▋ | 162/254 [50:55<33:42, 21.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2308, 'learning_rate': 9.66e-06, 'epoch': 0.64}
+ 64%|███████████████████████████████████████████████████▉ | 163/254 [51:17<33:27, 22.06s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2547, 'learning_rate': 9.72e-06, 'epoch': 0.64}
+ 65%|████████████████████████████████████████████████████▎ | 164/254 [51:38<32:46, 21.85s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 65%|████████████████████████████████████████████████████▌ | 165/254 [52:00<32:16, 21.76s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.231, 'learning_rate': 9.84e-06, 'epoch': 0.65}
+ 65%|████████████████████████████████████████████████████▉ | 166/254 [52:21<31:39, 21.59s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 66%|█████████████████████████████████████████████████████▎ | 167/254 [52:42<31:07, 21.46s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1354, 'learning_rate': 9.960000000000001e-06, 'epoch': 0.66}
+ 66%|█████████████████████████████████████████████████████▌ | 168/254 [53:03<30:36, 21.36s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2342, 'learning_rate': 1.002e-05, 'epoch': 0.66}
+ 67%|█████████████████████████████████████████████████████▉ | 169/254 [53:24<30:04, 21.23s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2023, 'learning_rate': 1.008e-05, 'epoch': 0.66}
+ 67%|██████████████████████████████████████████████████████▏ | 170/254 [53:45<29:29, 21.07s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2011, 'learning_rate': 1.0140000000000001e-05, 'epoch': 0.67}
+ 67%|██████████████████████████████████████████████████████▌ | 171/254 [54:06<29:03, 21.00s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 68%|██████████████████████████████████████████████████████▊ | 172/254 [54:26<28:28, 20.84s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 68%|███████████████████████████████████████████████████████▏ | 173/254 [54:47<27:53, 20.66s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 69%|███████████████████████████████████████████████████████▍ | 174/254 [55:07<27:23, 20.54s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 69%|███████████████████████████████████████████████████████▊ | 175/254 [55:28<27:06, 20.59s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 69%|████████████████████████████████████████████████████████▏ | 176/254 [55:48<26:32, 20.41s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1602, 'learning_rate': 1.05e-05, 'epoch': 0.69}
+ 70%|████████████████████████████████████████████████████████▍ | 177/254 [56:07<25:57, 20.22s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2084, 'learning_rate': 1.0559999999999999e-05, 'epoch': 0.69}
+ 70%|████████████████████████████████████████████████████████▊ | 178/254 [56:27<25:25, 20.07s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.211, 'learning_rate': 1.062e-05, 'epoch': 0.7}
+ 70%|█████████████████████████████████████████████████████████ | 179/254 [56:46<24:50, 19.87s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point
operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2695, 'learning_rate': 1.074e-05, 'epoch': 0.71} + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▍ | 180/254 [57:06<24:18, 19.71s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 71%|█████████████████████████████████████████████████████████▋ | 181/254 [57:25<23:46, 19.55s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 
[57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████ | 182/254 [57:44<23:13, 19.35s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▎ | 183/254 [58:03<22:39, 19.14s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▎ | 183/254 [58:03<22:39, 19.14s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1708, 'learning_rate': 1.092e-05, 'epoch': 0.72} + 72%|██████████████████████████████████████████████████████████▎ | 183/254 [58:03<22:39, 19.14s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▎ | 183/254 [58:03<22:39, 19.14s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▎ | 183/254 [58:03<22:39, 19.14s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▎ | 183/254 [58:03<22:39, 19.14s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:05:38,958 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2427, 'learning_rate': 1.098e-05, 'epoch': 0.72} + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 72%|██████████████████████████████████████████████████████████▋ | 184/254 [58:21<22:03, 18.91s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2921, 'learning_rate': 1.104e-05, 'epoch': 0.73} + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations 
will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|██████████████████████████████████████████████████████████▉ | 185/254 [58:39<21:28, 18.67s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1975, 'learning_rate': 1.11e-05, 'epoch': 0.73} + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 73%|███████████████████████████████████████████████████████████▎ | 186/254 [58:57<20:49, 18.38s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1657, 'learning_rate': 1.116e-05, 'epoch': 0.73} + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 
[59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▋ | 187/254 [59:14<20:14, 18.13s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3617, 'learning_rate': 1.1220000000000001e-05, 'epoch': 0.74} + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|███████████████████████████████████████████████████████████▉ | 188/254 [59:32<19:45, 17.96s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1648, 'learning_rate': 1.128e-05, 'epoch': 0.74} + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +
74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 74%|████████████████████████████████████████████████████████████▎ | 189/254 [59:48<18:59, 17.53s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2542, 'learning_rate': 1.134e-05, 'epoch': 0.75} + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 
16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 75%|███████████████████████████████████████████████████████████ | 190/254 [1:00:04<18:07, 16.99s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:39,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:39,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:39,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:39,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:39,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:49,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:49,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-01 17:07:49,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 76%|███████████████████████████████████████████████████████████▋ | 192/254 [1:00:33<16:08, 15.62s/it]g-point operations will not be computed-01 16:53:07,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
76%|███████████████████████████████████████████████████████████▋ | 192/254 [1:00:33<16:08, 15.62s/it]
+[WARNING|modeling_utils.py:388] 2022-03-01 17:07:59,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:05,947 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4031, 'learning_rate': 1.152e-05, 'epoch': 0.76}
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:11,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:16,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 76%|████████████████████████████████████████████████████████████▎ | 194/254 [1:00:57<13:48, 13.81s/it]
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:20,365 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:24,283 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:28,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:30,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:34,122 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:36,401 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 77%|████████████████████████████████████████████████████████████▉ | 196/254 [1:01:17<11:24, 11.81s/it][WARNING|modeling_utils.py:388] 2022-03-01 17:08:38,710 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:40,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:42,931 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:44,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 78%|█████████████████████████████████████████████████████████████▎ | 197/254 [1:01:25<10:14, 10.77s/it][WARNING|modeling_utils.py:388] 2022-03-01 17:08:46,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:48,804 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:50,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:52,344 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 78%|█████████████████████████████████████████████████████████████▌ | 198/254 [1:01:33<09:05, 9.74s/it][WARNING|modeling_utils.py:388] 2022-03-01 17:08:54,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:57,214 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:08:58,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 78%|█████████████████████████████████████████████████████████████▉ | 199/254 [1:01:39<07:56, 8.66s/it][WARNING|modeling_utils.py:388] 2022-03-01 17:09:00,140 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-01 17:09:02,660 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 79%|██████████████████████████████████████████████████████████████▏ | 200/254 [1:01:45<06:57, 7.73s/it]
+Traceback (most recent call last):
+    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
\ No newline at end of file