diff --git "a/wandb/run-20220302_074637-35y19oi2/files/output.log" "b/wandb/run-20220302_074637-35y19oi2/files/output.log" new file mode 100644--- /dev/null +++ "b/wandb/run-20220302_074637-35y19oi2/files/output.log" @@ -0,0 +1,1694 @@ + + + 0%| | 0/254 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:46:45,180 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:46:48,163 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:46:51,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:46:54,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:46:57,225 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:00,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7871, 'learning_rate': 2.0000000000000002e-07, 'epoch': 0.0} +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:03,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 0%|▎ | 1/254 [00:25<1:45:26, 25.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:47:06,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:09,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:12,339 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:15,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:18,181 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:21,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:23,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:26,811 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.7899, 'learning_rate': 4.0000000000000003e-07, 'epoch': 0.01} + + 1%|▋ | 2/254 [00:48<1:41:18, 24.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:47:29,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:32,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:35,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:38,460 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:41,384 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:44,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-03-02 07:47:47,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9306, 'learning_rate': 6.000000000000001e-07, 'epoch': 0.01} +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:50,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 1%|▉ | 3/254 [01:11<1:39:09, 23.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:47:53,030 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:55,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:47:58,686 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:01,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:04,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:07,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:10,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8908, 'learning_rate': 8.000000000000001e-07, 'epoch': 0.02} +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:12,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 2%|█▎ | 4/254 [01:34<1:37:26, 23.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:48:15,907 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:18,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:21,516 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:24,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:27,099 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:29,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:32,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:35,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▌ | 5/254 [01:57<1:35:37, 23.04s/it] + + 2%|█▌ | 5/254 [01:57<1:35:37, 23.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:48:38,281 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:41,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:43,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:46,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:49,434 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:52,259 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:55,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:48:57,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 2%|█▉ | 6/254 [02:19<1:34:27, 22.85s/it] + + 2%|█▉ | 6/254 [02:19<1:34:27, 22.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:49:00,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:03,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:06,369 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:09,113 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:11,913 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:14,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:17,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:20,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 3%|██▏ | 7/254 [02:41<1:33:31, 22.72s/it] + + 3%|██▏ | 7/254 [02:41<1:33:31, 
22.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:49:23,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:26,020 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:28,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:31,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:34,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:37,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:39,930 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.9372, 'learning_rate': 1.4000000000000001e-06, 'epoch': 0.03} +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:42,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 3%|██▌ | 8/254 [03:04<1:32:45, 22.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:49:45,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:48,469 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:51,195 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:53,934 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:56,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:49:59,503 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:02,263 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.844, 'learning_rate': 1.6000000000000001e-06, 'epoch': 0.04} +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:05,005 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 4%|██▊ | 9/254 [03:26<1:31:58, 22.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:50:07,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:10,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:13,435 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:16,199 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:18,970 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:21,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:24,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 
07:50:27,145 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▏ | 10/254 [03:48<1:31:07, 22.41s/it] + + 4%|███▏ | 10/254 [03:48<1:31:07, 22.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:50:30,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:32,701 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:35,380 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:38,094 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:40,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:43,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:46,190 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:48,934 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 4%|███▍ | 11/254 [04:10<1:29:58, 22.22s/it] + + 4%|███▍ | 11/254 [04:10<1:29:58, 22.22s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:50:51,797 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:54,464 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:57,198 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:50:59,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:02,556 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:05,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:07,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.8076, 'learning_rate': 2.2e-06, 'epoch': 0.05} +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:10,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + + 5%|███▊ | 12/254 [04:32<1:29:00, 22.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:51:13,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:16,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:18,817 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:22,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:24,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:27,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 
07:51:30,144 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:32,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 5%|████ | 13/254 [04:54<1:28:43, 22.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:51:35,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:41,126 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:46,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:51:51,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▍ | 14/254 [05:16<1:27:42, 21.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:51:57,075 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:02,394 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:07,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:12,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|████▋ | 15/254 [05:37<1:26:22, 21.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:52:18,225 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:23,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:28,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:33,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 6%|█████ | 16/254 [05:58<1:25:09, 21.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:52:39,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:44,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:49,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:52:54,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▎ | 17/254 [06:19<1:24:20, 21.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:53:00,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:05,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:10,557 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:15,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▋ | 18/254 [06:40<1:23:21, 21.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:53:21,002 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:26,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:31,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:36,557 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 7%|█████▉ | 19/254 [07:00<1:22:29, 21.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:53:41,834 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:46,933 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:52,021 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:53:57,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▎ | 20/254 [07:21<1:21:35, 20.92s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:54:02,327 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:07,390 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:12,497 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:17,596 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 21/254 
[07:41<1:20:35, 20.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 8%|██████▌ | 21/254 [07:41<1:20:35, 20.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:27,747 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:27,747 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:32,783 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:32,783 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:37,735 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 22/254 [08:01<1:19:35, 20.59s/it]g-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 9%|██████▉ | 22/254 [08:01<1:19:35, 20.59s/it]g-point operations will not be computed-02 07:54:22,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 22/254 [08:01<1:19:35, 20.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|██████▉ | 22/254 [08:01<1:19:35, 20.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:47,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:47,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:52,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:52,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:54:57,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 9%|███████▏ | 23/254 [08:22<1:18:39, 20.43s/it]g-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▏ | 23/254 [08:22<1:18:39, 20.43s/it]g-point operations will not be computed-02 07:54:42,857 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▏ | 23/254 [08:22<1:18:39, 20.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▏ | 23/254 [08:22<1:18:39, 20.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:55:07,919 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:55:07,919 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:55:12,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:55:12,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:55:17,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:55:17,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it]g-point operations will not be computed-02 07:55:02,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 9%|███████▌ | 24/254 [08:42<1:17:53, 20.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4866, 'learning_rate': 4.800000000000001e-06, 'epoch': 0.1} + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 10%|███████▊ | 25/254 [09:02<1:17:19, 20.26s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 10%|████████▏ | 26/254 [09:22<1:16:32, 20.14s/it][WARNING|modeling_utils.py:388] 
2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▌ | 27/254 [09:41<1:15:25, 19.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4714, 'learning_rate': 5.4e-06, 'epoch': 0.11} + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|████████▊ | 28/254 [10:00<1:14:24, 19.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3393, 'learning_rate': 5.600000000000001e-06, 'epoch': 0.11} + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 11%|█████████▏ | 29/254 [10:20<1:13:35, 19.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.5015, 'learning_rate': 5.8e-06, 'epoch': 0.12} + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▍ | 30/254 [10:39<1:12:36, 19.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 
19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 12%|█████████▊ | 31/254 [10:57<1:11:28, 19.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████ | 32/254 [11:16<1:10:33, 19.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:55:22,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▍ | 33/254 [11:35<1:09:30, 18.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4072, 'learning_rate': 6.6e-06, 'epoch': 0.13} + 13%|██████████▋ | 34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 13%|██████████▋ | 
34/254 [11:52<1:08:10, 18.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3752, 'learning_rate': 6.800000000000001e-06, 'epoch': 0.14} +[WARNING|modeling_utils.py:388] 2022-03-02 07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 07:58:15,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 
07:58:48,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 14%|███████████▎ | 36/254 [12:27<1:05:26, 18.01s/it] + 15%|███████████▋ | 37/254 [12:44<1:04:03, 17.71s/it] +{'loss': 4.4767, 'learning_rate': 7.2e-06, 'epoch': 0.15} + 15%|███████████▉ | 38/254 [13:02<1:02:58, 17.49s/it] +{'loss': 4.403, 'learning_rate': 7.4e-06, 'epoch': 0.15} + 15%|████████████▎ | 39/254 [13:17<1:01:01, 17.03s/it] + 16%|████████████▉ | 40/254 [13:33<58:46, 16.48s/it] +[WARNING|modeling_utils.py:388] 2022-03-02 08:00:15,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed + 16%|█████████████▏ | 41/254 [13:47<56:13, 15.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:00:27,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:00:37,578 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▌ | 
42/254 [14:00<53:25, 15.12s/it] +{'loss': 4.4575, 'learning_rate': 8.200000000000001e-06, 'epoch': 0.16} +[WARNING|modeling_utils.py:388] 2022-03-02 08:00:43,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:00:50,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|█████████████▉ | 43/254 [14:13<50:20, 14.31s/it] +{'loss': 4.3135, 'learning_rate': 8.400000000000001e-06, 'epoch': 0.17} +[WARNING|modeling_utils.py:388] 2022-03-02 08:00:56,121 >> Could not estimate the number of tokens of the
input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:00,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 17%|██████████████▏ | 44/254 [14:24<47:03, 13.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:01:04,464 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:08,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:12,069 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 18%|██████████████▌ | 45/254 [14:35<43:31, 12.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.4926, 'learning_rate': 8.8e-06, 'epoch': 0.18} +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:18,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:20,394 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:22,576 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:24,778 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:26,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:28,766 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:30,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:32,624 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:34,415 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:36,147 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:37,827 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:41,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:42,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:44,035 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:46,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:47,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:49,566 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3864, 'learning_rate': 9.800000000000001e-06, 'epoch': 0.2} +[WARNING|modeling_utils.py:388] 2022-03-02 08:01:55,911 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:02:01,893 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed +{'loss': 4.311, 'learning_rate': 1e-05, 'epoch': 0.2} +[WARNING|modeling_utils.py:388] 2022-03-02 08:02:01,893 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed + 20%|████████████████▊ | 52/254 [15:58<53:13, 15.81s/it] + 21%|████████████████▋ | 53/254 [16:21<1:00:16, 17.99s/it] + 21%|█████████████████ | 54/254 [16:44<1:04:47, 19.44s/it] +{'loss': 4.2127, 'learning_rate': 1.06e-05, 'epoch': 0.21} + 22%|█████████████████▎ | 55/254 [17:07<1:07:48, 20.45s/it] +{'loss': 4.128, 'learning_rate': 1.08e-05, 'epoch': 0.22} + 22%|█████████████████▋ | 56/254 [17:30<1:09:38, 21.11s/it] +{'loss': 4.1871, 'learning_rate': 1.1000000000000001e-05, 'epoch': 0.22} + 22%|█████████████████▉ | 57/254 [17:52<1:10:26, 21.46s/it] +{'loss': 4.2409, 'learning_rate': 1.1200000000000001e-05, 'epoch': 0.22} + 23%|██████████████████▎ | 58/254 [18:14<1:11:06, 21.77s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:14<1:11:06, 21.77s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:14<1:11:06, 21.77s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:14<1:11:06, 21.77s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:14<1:11:06, 21.77s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▎ | 58/254 [18:14<1:11:06, 21.77s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 23%|██████████████████▌ | 59/254 [18:37<1:11:08, 21.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|██████████████████▉ | 60/254 [18:59<1:11:02, 21.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.278, 'learning_rate': 1.2e-05, 'epoch': 0.24} + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▏ | 61/254 [19:21<1:10:42, 21.98s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1953, 'learning_rate': 1.22e-05, 'epoch': 0.24} + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 
21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 24%|███████████████████▌ | 62/254 [19:42<1:10:09, 21.93s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 
22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|███████████████████▊ | 63/254 [20:05<1:10:09, 22.04s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 
[20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 25%|████████████████████▏ | 64/254 [20:27<1:09:31, 21.95s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 
65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▍ | 65/254 [20:48<1:08:39, 21.80s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2909, 'learning_rate': 1.3000000000000001e-05, 'epoch': 0.26} + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|████████████████████▊ | 66/254 [21:09<1:07:47, 21.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.224, 'learning_rate': 1.32e-05, 'epoch': 0.26} + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be 
computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 26%|█████████████████████ | 67/254 [21:30<1:06:57, 21.48s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2762, 'learning_rate': 1.3400000000000002e-05, 'epoch': 0.27} + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▍ | 68/254 [21:51<1:06:05, 21.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 27%|█████████████████████▋ | 69/254 [22:12<1:05:18, 21.18s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2109, 'learning_rate': 1.4000000000000001e-05, 'epoch': 0.28} + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be 
computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████ | 70/254 [22:33<1:04:25, 21.01s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2057, 'learning_rate': 1.42e-05, 'epoch': 0.28} + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 28%|██████████████████████▋ | 72/254 [23:14<1:02:45, 20.69s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|██████████████████████▉ | 73/254 [23:34<1:01:57, 20.54s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 29%|███████████████████████▎ | 74/254 [23:54<1:01:20, 20.44s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▌ | 75/254 [24:14<1:00:55, 20.42s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2807, 'learning_rate': 1.5e-05, 'epoch': 0.3} + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
30%|███████████████████████▉ | 76/254 [24:34<1:00:01, 20.23s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2235, 'learning_rate': 1.52e-05, 'epoch': 0.3} + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 30%|████████████████████████▊ | 77/254 [24:54<59:08, 20.05s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2436, 'learning_rate': 1.54e-05, 'epoch': 0.31} + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▏ | 78/254 [25:13<58:18, 19.88s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 
79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1341, 'learning_rate': 1.56e-05, 'epoch': 0.31} + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▌ | 79/254 [25:32<57:15, 19.63s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2765, 'learning_rate': 1.58e-05, 'epoch': 0.31} + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 31%|█████████████████████████▊ | 80/254 [25:51<56:11, 19.38s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 
19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▏ | 81/254 [26:10<55:19, 19.19s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 32%|██████████████████████████▍ | 82/254 [26:28<54:22, 18.97s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|██████████████████████████▊ | 83/254 [26:47<53:28, 18.76s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2277, 'learning_rate': 1.66e-05, 'epoch': 0.33} + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████ | 84/254 [27:04<52:23, 18.49s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3309, 'learning_rate': 1.6800000000000002e-05, 'epoch': 0.33} + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 33%|███████████████████████████▍ | 85/254 [27:22<51:16, 18.20s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be 
computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1866, 'learning_rate': 1.7000000000000003e-05, 'epoch': 0.34} + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|███████████████████████████▊ | 86/254 [27:39<50:04, 17.89s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed +{'loss': 4.2429, 'learning_rate': 1.7199999999999998e-05, 'epoch': 0.34} + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 34%|████████████████████████████ | 87/254 [27:56<48:42, 17.50s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2578, 'learning_rate': 1.74e-05, 'epoch': 0.35} + 
35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▍ | 88/254 [28:13<47:55, 17.32s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|████████████████████████████▋ | 89/254 [28:28<46:18, 16.84s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 35%|█████████████████████████████ | 90/254 [28:43<44:32, 16.29s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:25,765 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:25,765 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:25,765 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:25,765 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:25,765 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:36,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:36,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2666, 'learning_rate': 1.8e-05, 'epoch': 0.36} +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:36,063 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:36,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:44,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:44,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:44,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:44,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 36%|█████████████████████████████▋ | 92/254 [29:10<40:06, 14.85s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:52,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:52,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:52,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:58,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:58,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:15:58,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 37%|██████████████████████████████ | 93/254 [29:23<37:38, 14.03s/it]g-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:04,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:04,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:08,481 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:08,481 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:12,469 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:12,469 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3208, 'learning_rate': 1.86e-05, 'epoch': 0.37} +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:16,435 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:18,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:18,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:22,555 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:22,555 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:24,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:24,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:28,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:30,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:30,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:01:14,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 38%|██████████████████████████████▉ | 96/254 [29:53<29:35, 11.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:16:32,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:34,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:32,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:36,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:32,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:38,433 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:32,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:38,433 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:32,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 38%|███████████████████████████████▎ | 97/254 [30:01<26:47, 10.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:16:40,429 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:42,224 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed-02 08:16:40,429 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:43,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:40,429 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:43,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:40,429 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 39%|███████████████████████████████▋ | 98/254 [30:08<24:05, 9.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:16:47,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:48,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:47,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:50,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:47,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:50,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:47,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 39%|███████████████████████████████▉ | 99/254 [30:14<21:21, 8.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:16:53,083 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:55,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:53,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:56,737 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:53,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:16:56,737 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:16:53,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 39%|███████████████████████████████▉ | 100/254 [30:19<19:07, 7.45s/it]g-point operations will not be computed-02 08:16:53,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 39%|███████████████████████████████▉ | 100/254 [30:19<19:07, 7.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 39%|███████████████████████████████▉ | 100/254 [30:19<19:07, 7.45s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:07,222 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:07,222 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2785, 'learning_rate': 2e-05, 'epoch': 0.4} +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:17:13,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed +{'loss': 4.1837, 'learning_rate': 2.0200000000000003e-05, 'epoch': 0.4} + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 40%|████████████████████████████████▌ | 102/254 [31:06<39:44, 15.69s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|████████████████████████████████▊ | 103/254 [31:29<44:51, 17.83s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be 
computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2698, 'learning_rate': 2.08e-05, 'epoch': 0.41} + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 41%|█████████████████████████████████▏ | 
104/254 [31:52<48:10, 19.27s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|█████████████████████████████████▊ | 106/254 [32:37<51:35, 20.92s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1013, 'learning_rate': 2.12e-05, 'epoch': 0.42} + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 42%|██████████████████████████████████ | 107/254 [32:59<52:16, 21.34s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2109, 'learning_rate': 2.1400000000000002e-05, 'epoch': 0.42} + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▍ | 108/254 [33:21<52:37, 21.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2275, 'learning_rate': 2.16e-05, 'epoch': 0.43} + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|██████████████████████████████████▊ | 109/254 [33:44<52:34, 21.76s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 
[34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2312, 'learning_rate': 2.18e-05, 'epoch': 0.43} + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 43%|███████████████████████████████████ | 110/254 [34:06<52:25, 21.84s/it]g-point operations will not 
be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 44%|███████████████████████████████████▍ | 111/254 [34:27<51:50, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 
21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|███████████████████████████████████▋ | 112/254 [34:49<51:28, 21.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 44%|████████████████████████████████████ | 113/254 [35:11<51:25, 21.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2024, 'learning_rate': 2.26e-05, 'epoch': 0.45} + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▎ | 114/254 [35:33<50:54, 21.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 
[35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 45%|████████████████████████████████████▋ | 115/254 [35:54<50:10, 21.66s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1406, 'learning_rate': 2.3000000000000003e-05, 'epoch': 0.46} + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 
21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|████████████████████████████████████▉ | 116/254 [36:15<49:28, 21.51s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2267, 'learning_rate': 2.32e-05, 'epoch': 0.46} + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations 
will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▎ | 117/254 [36:36<48:52, 21.40s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the
input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1685, 'learning_rate': 2.36e-05, 'epoch': 0.47} + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2016, 'learning_rate': 2.38e-05, 'epoch': 0.47} + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 46%|█████████████████████████████████████▋ | 118/254 [36:57<48:05, 21.22s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1813, 'learning_rate': 2.4e-05, 'epoch': 0.47} + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▌ | 121/254 [37:59<46:08, 20.82s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.251, 'learning_rate': 2.4200000000000002e-05, 'epoch': 0.48} + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|██████████████████████████████████████▉ | 122/254 [38:19<45:23, 20.63s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1699, 'learning_rate': 2.44e-05, 'epoch': 0.48} + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 48%|███████████████████████████████████████▏ | 123/254 [38:39<44:39, 20.45s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 
20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▌ | 124/254 [38:59<43:59, 20.31s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 49%|███████████████████████████████████████▊ | 125/254 [39:20<43:40, 20.32s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 50%|████████████████████████████████████████▏ | 126/254 [39:39<42:52, 20.10s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 50%|████████████████████████████████████████▏ | 126/254 [39:39<42:52, 20.10s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1442, 'learning_rate': 2.5e-05, 'epoch': 0.49} + 50%|████████████████████████████████████████▏ | 126/254 [39:39<42:52, 20.10s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed + 50%|████████████████████████████████████████▌ | 127/254 [39:59<42:04, 19.88s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed + 50%|████████████████████████████████████████▊ | 128/254 [40:18<41:23, 19.71s/it] +{'loss': 4.1569,
'learning_rate': 2.54e-05, 'epoch': 0.5} + 50%|████████████████████████████████████████▊ | 128/254 [40:18<41:23, 19.71s/it] + 51%|█████████████████████████████████████████▏ | 129/254 [40:37<40:37, 19.50s/it] +{'loss': 4.2225,
'learning_rate': 2.5600000000000002e-05, 'epoch': 0.51} + 51%|█████████████████████████████████████████▏ | 129/254 [40:37<40:37, 19.50s/it] + 51%|█████████████████████████████████████████▍ | 130/254 [40:56<39:57, 19.33s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed +{'loss': 4.2638, 'learning_rate': 2.58e-05, 'epoch': 0.51} + 51%|█████████████████████████████████████████▍ | 130/254 [40:56<39:57, 19.33s/it] + 52%|█████████████████████████████████████████▊ | 131/254 [41:14<39:11, 19.12s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed +{'loss': 4.2403, 'learning_rate': 2.6000000000000002e-05, 'epoch': 0.51} + 52%|█████████████████████████████████████████▊ | 131/254 [41:14<39:11, 19.12s/it] + 52%|██████████████████████████████████████████ | 132/254 [41:33<38:27, 18.91s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed +{'loss': 4.1364, 'learning_rate': 2.6200000000000003e-05, 'epoch': 0.52} + 52%|██████████████████████████████████████████ | 132/254 [41:33<38:27, 18.91s/it] + 52%|██████████████████████████████████████████▍ | 133/254 [41:51<37:44, 18.72s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed +{'loss': 4.1834, 'learning_rate': 2.64e-05, 'epoch': 0.52} + 52%|██████████████████████████████████████████▍ | 133/254 [41:51<37:44, 18.72s/it] +[WARNING|modeling_utils.py:388] 2022-03-02 08:28:40,988 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 53%|██████████████████████████████████████████▋ | 134/254 [42:09<36:47, 18.39s/it] +{'loss': 4.0975, 'learning_rate':
2.6600000000000003e-05, 'epoch': 0.53} + 53%|██████████████████████████████████████████▋ | 134/254 [42:09<36:47, 18.39s/it] + 53%|███████████████████████████████████████████ | 135/254 [42:26<35:59, 18.14s/it]
+{'loss': 4.296, 'learning_rate': 2.6800000000000004e-05, 'epoch': 0.53} +[WARNING|modeling_utils.py:388] 2022-03-02 08:29:11,637 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 54%|███████████████████████████████████████████▎ | 136/254 [42:43<35:04, 17.84s/it] +{'loss': 4.2294,
'learning_rate': 2.7000000000000002e-05, 'epoch': 0.53} + 54%|███████████████████████████████████████████▎ | 136/254 [42:43<35:04, 17.84s/it] + 54%|███████████████████████████████████████████▋ | 137/254 [43:00<34:04, 17.47s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed +{'loss': 4.1809, 'learning_rate': 2.7200000000000004e-05, 'epoch': 0.54} + 54%|███████████████████████████████████████████▋ | 137/254 [43:00<34:04, 17.47s/it] + 54%|████████████████████████████████████████████ | 138/254 [43:17<33:20, 17.24s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not estimate the number of tokens of the
input, floating-point operations will not be computed +{'loss': 4.056, 'learning_rate': 2.7400000000000002e-05, 'epoch': 0.54} + 54%|████████████████████████████████████████████ | 138/254 [43:17<33:20, 17.24s/it] + 55%|████████████████████████████████████████████▎ | 139/254 [43:32<32:05, 16.75s/it]g-point operations will not be computed-02 08:17:01,191 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:30:22,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 55%|████████████████████████████████████████████▋ | 140/254 [43:47<30:40,
16.15s/it] +[WARNING|modeling_utils.py:388] 2022-03-02 08:30:36,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +
56%|████████████████████████████████████████████▉ | 141/254 [44:01<29:10, 15.49s/it] +[WARNING|modeling_utils.py:388] 2022-03-02 08:30:46,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:30:52,871 >> Could not estimate the number of tokens of the input, floating-point operations
will not be computed +{'loss': 4.2078, 'learning_rate': 2.8199999999999998e-05, 'epoch': 0.56} +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:00,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.3606, 'learning_rate': 2.84e-05, 'epoch': 0.56} +[WARNING|modeling_utils.py:388]
2022-03-02 08:31:04,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:10,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:14,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 57%|█████████████████████████████████████████████▉ | 144/254 [44:37<24:00, 13.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:31:17,437 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:21,203 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:24,810 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 57%|██████████████████████████████████████████████▏ | 145/254 [44:47<22:02, 12.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:29,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:31,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:33,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 57%|██████████████████████████████████████████████▌ | 146/254 [44:56<20:02, 11.13s/it] +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:36,868 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:38,870 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:40,823 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:42,664 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:44,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:46,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:49,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:51,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:52,722 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:55,492 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 08:31:56,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:31:59,290 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:32:00,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388]
2022-03-02 08:32:07,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:32:13,417 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:32:19,448 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 59%|████████████████████████████████████████████████▏ | 151/254 [45:46<21:23,
12.46s/it] + 60%|████████████████████████████████████████████████▍ | 152/254 [46:10<26:38,
15.68s/it] +{'loss': 4.2034, 'learning_rate': 3.02e-05, 'epoch': 0.6} + 60%|████████████████████████████████████████████████▍ | 152/254 [46:10<26:38, 15.68s/it] +
60%|████████████████████████████████████████████████▊ | 153/254 [46:33<30:03, 17.85s/it]
+ 61%|█████████████████████████████████████████████████ | 154/254 [46:55<32:09, 19.29s/it]
+ 61%|█████████████████████████████████████████████████▍ | 155/254 [47:18<33:23, 20.24s/it] + 61%|█████████████████████████████████████████████████▋ | 156/254 [47:40<34:06, 20.88s/it] + 62%|██████████████████████████████████████████████████ | 157/254 [48:02<34:23, 21.27s/it] +{'loss': 4.1791, 'learning_rate': 3.1400000000000004e-05, 'epoch': 0.62} + 62%|██████████████████████████████████████████████████ | 157/254 [48:02<34:23, 21.27s/it] + 63%|██████████████████████████████████████████████████▋ | 159/254 [48:46<34:15, 21.63s/it] + 63%|███████████████████████████████████████████████████ | 160/254 [49:08<33:57,
21.67s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ | 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 63%|███████████████████████████████████████████████████▎ 
| 161/254 [49:29<33:29, 21.61s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1827, 'learning_rate': 3.2200000000000003e-05, 'epoch': 0.64} + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▋ | 162/254 [49:51<33:02, 21.55s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.2159, 'learning_rate': 3.24e-05, 'epoch': 0.64} + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not 
be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 64%|███████████████████████████████████████████████████▉ | 163/254 [50:13<32:52, 21.68s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1203, 'learning_rate': 3.26e-05, 'epoch': 0.64} + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▎ | 164/254 [50:34<32:16, 21.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▌ | 165/254 [50:55<31:42, 21.38s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 65%|████████████████████████████████████████████████████▉ | 166/254 [51:16<31:07, 21.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▎ | 167/254 [51:37<30:33, 21.07s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point 
operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 66%|█████████████████████████████████████████████████████▌ | 168/254 [51:57<29:58, 20.91s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 
169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1565, 'learning_rate': 3.3600000000000004e-05, 'epoch': 0.66} + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|█████████████████████████████████████████████████████▉ | 169/254 [52:18<29:26, 20.78s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.1505, 'learning_rate': 3.38e-05, 'epoch': 0.67} + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▏ | 170/254 [52:38<28:54, 20.65s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 
20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0987, 'learning_rate': 3.4000000000000007e-05, 'epoch': 0.67} + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +{'loss': 4.0598, 'learning_rate': 3.4200000000000005e-05, 'epoch': 0.67} + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 
20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 67%|██████████████████████████████████████████████████████▌ | 171/254 [52:58<28:23, 20.52s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 
68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 68%|███████████████████████████████████████████████████████▏ | 173/254 [53:38<27:17, 20.22s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 69%|███████████████████████████████████████████████████████▍ | 174/254 [53:58<26:49, 20.11s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 69%|███████████████████████████████████████████████████████▍ | 174/254 [53:58<26:49, 20.11s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 69%|███████████████████████████████████████████████████████▍ | 174/254 [53:58<26:49, 20.11s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 69%|███████████████████████████████████████████████████████▍ | 174/254 [53:58<26:49, 20.11s/it]g-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed
+ 69%|███████████████████████████████████████████████████████▍ | 174/254 [53:58<26:49, 20.11s/it]
+{'loss': 4.1869, 'learning_rate': 3.48e-05, 'epoch': 0.69}
+ 69%|████████████████████████████████████████████████████████▏ | 176/254 [54:38<25:57, 19.96s/it]
+{'loss': 4.1173, 'learning_rate': 3.5e-05, 'epoch': 0.69}
+ 70%|████████████████████████████████████████████████████████▍ | 177/254 [54:57<25:22, 19.77s/it]
+{'loss': 4.1547, 'learning_rate': 3.52e-05, 'epoch': 0.69}
+ 70%|████████████████████████████████████████████████████████▊ | 178/254 [55:16<24:49, 19.60s/it]
+ 70%|█████████████████████████████████████████████████████████ | 179/254 [55:35<24:16, 19.43s/it]
+ 71%|█████████████████████████████████████████████████████████▍ | 180/254 [55:54<23:45, 19.27s/it]
+{'loss': 4.2217, 'learning_rate': 3.58e-05, 'epoch': 0.71}
+ 71%|█████████████████████████████████████████████████████████▋ | 181/254 [56:13<23:16, 19.13s/it]
+{'loss': 4.2581, 'learning_rate': 3.6e-05, 'epoch': 0.71}
+ 72%|██████████████████████████████████████████████████████████ | 182/254 [56:32<22:44, 18.95s/it]
+{'loss': 4.0438, 'learning_rate': 3.62e-05, 'epoch': 0.71}
+ 72%|██████████████████████████████████████████████████████████▎ | 183/254 [56:50<22:07, 18.69s/it]
+{'loss': 4.1206, 'learning_rate': 3.6400000000000004e-05, 'epoch': 0.72}
+ 72%|██████████████████████████████████████████████████████████▋ | 184/254 [57:08<21:35, 18.50s/it]
+ 73%|██████████████████████████████████████████████████████████▉ | 185/254 [57:25<21:00, 18.26s/it]
+[WARNING|modeling_utils.py:388] 2022-03-02 08:44:08,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 73%|███████████████████████████████████████████████████████████▎ | 186/254 [57:43<20:22, 17.97s/it]
+ 74%|███████████████████████████████████████████████████████████▋ | 187/254 [58:00<19:42, 17.66s/it]
+[WARNING|modeling_utils.py:388] 2022-03-02 08:44:49,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 74%|███████████████████████████████████████████████████████████▉ | 188/254 [58:17<19:14, 17.49s/it]
+{'loss': 4.3035, 'learning_rate': 3.74e-05, 'epoch': 0.74}
+ 74%|████████████████████████████████████████████████████████████▎ | 189/254 [58:33<18:29, 17.07s/it]
+{'loss': 4.114, 'learning_rate': 3.76e-05, 'epoch': 0.74}
+ 75%|████████████████████████████████████████████████████████████▌ | 190/254 [58:48<17:41, 16.58s/it]
+{'loss': 4.193, 'learning_rate': 3.7800000000000004e-05, 'epoch': 0.75}
+[WARNING|modeling_utils.py:388] 2022-03-02 08:45:34,527 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 75%|████████████████████████████████████████████████████████████▉ | 191/254 [59:03<16:48, 16.00s/it]
+[WARNING|modeling_utils.py:388] 2022-03-02 08:45:48,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-03-02 08:45:55,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.281, 'learning_rate': 3.82e-05, 'epoch': 0.75}
+[WARNING|modeling_utils.py:388] 2022-03-02 08:46:03,068 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:03,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:03,068 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:31:27,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 76%|█████████████████████████████████████████████████████████████▌ | 193/254 [59:29<14:38, 14.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 76%|█████████████████████████████████████████████████████████████▌ | 193/254 [59:29<14:38, 14.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 76%|█████████████████████████████████████████████████████████████▌ | 193/254 [59:29<14:38, 14.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:14,803 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:14,803 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
+[WARNING|modeling_utils.py:388] 2022-03-02 08:46:14,803 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:18,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:18,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:22,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:25,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:25,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:25,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 
2022-03-02 08:46:29,094 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:31,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:33,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:33,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:37,214 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:37,214 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:09,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 77%|████████████████████████████████████████████████████████████▉ | 196/254 [1:00:00<11:08, 11.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:46:39,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:41,590 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed-02 08:46:39,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:43,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:39,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:45,600 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:39,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:45,600 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:39,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 78%|█████████████████████████████████████████████████████████████▎ | 197/254 [1:00:08<10:00, 10.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 08:46:47,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:49,455 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:47,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:51,247 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:47,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 78%|█████████████████████████████████████████████████████████████▌ | 198/254 [1:00:15<08:53, 9.53s/it]g-point operations will not 
be computed-02 08:46:47,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 78%|█████████████████████████████████████████████████████████████▌ | 198/254 [1:00:15<08:53, 9.53s/it]g-point operations will not be computed-02 08:46:47,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:56,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:54,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:46:57,624 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:46:54,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 78%|█████████████████████████████████████████████████████████████▉ | 199/254 [1:00:21<07:45, 8.47s/it]g-point operations will not be computed-02 08:46:54,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 78%|█████████████████████████████████████████████████████████████▉ | 199/254 [1:00:21<07:45, 8.47s/it]g-point operations will not be computed-02 08:46:54,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:47:01,777 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:47:00,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed +[WARNING|modeling_utils.py:388] 2022-03-02 08:47:04,098 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 08:47:00,498 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed + 79%|██████████████████████████████████████████████████████████████▏ | 200/254 [1:00:26<06:48, 7.57s/it]g-point operations will not be computed-02 08:47:00,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + 79%|██████████████████████████████████████████████████████████████▏ | 200/254 [1:00:26<06:48, 7.57s/it]g-point operations will not be computed-02 08:47:00,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)6<06:48, 7.57s/it]Traceback (most recent call last):puted-02 08:47:00,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)6<06:48, 7.57s/it]Traceback (most recent call last):puted-02 08:47:00,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed + return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)6<06:48, 7.57s/it]Traceback (most recent call last):puted-02 08:47:00,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed \ No newline at end of file