0%| | 0/509 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:13,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:16,126 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8074, 'learning_rate': 0.0, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-03-02 22:26:19,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▏ | 1/509 [00:12<1:48:51, 12.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:26:22,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:25,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:28,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:31,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 2/509 [00:25<1:45:28, 12.48s/it] 0%|▎ | 2/509 [00:25<1:45:28, 12.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:26:34,361 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:37,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:40,301 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8886, 'learning_rate': 1.2e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 
22:26:43,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▍ | 3/509 [00:37<1:43:14, 12.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:26:46,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:49,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:26:52,127 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8226, 'learning_rate': 1.8e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 22:26:55,048 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▋ | 4/509 [00:48<1:41:38, 12.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:26:58,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:00,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:03,888 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7892, 'learning_rate': 2.4e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 22:27:06,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▊ | 5/509 [01:00<1:40:13, 11.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:27:09,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:12,657 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 
22:27:15,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8227, 'learning_rate': 2.9999999999999997e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 22:27:18,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 6/509 [01:12<1:39:22, 11.85s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:27:21,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:24,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:27,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7757, 'learning_rate': 3.6e-06, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-02 22:27:29,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 7/509 [01:23<1:38:10, 11.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:27:32,853 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:35,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:38,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:41,342 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 8/509 [01:35<1:37:09, 11.64s/it] 2%|█▎ | 8/509 [01:35<1:37:09, 11.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:27:44,291 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:47,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:49,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7576, 'learning_rate': 4.8e-06, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-02 22:27:52,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▍ | 9/509 [01:46<1:36:29, 11.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:27:55,737 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:27:58,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:01,422 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:04,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▌ | 10/509 [01:58<1:35:52, 11.53s/it] 2%|█▌ | 10/509 [01:58<1:35:52, 11.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:28:07,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:09,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:12,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:15,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▋ | 11/509 [02:09<1:34:50, 11.43s/it] 2%|█▋ | 
11/509 [02:09<1:34:50, 11.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:28:18,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:21,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:23,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:26,530 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▉ | 12/509 [02:20<1:33:53, 11.33s/it] 2%|█▉ | 12/509 [02:20<1:33:53, 11.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:28:29,367 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:32,135 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:34,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:37,523 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██ | 13/509 [02:31<1:32:50, 11.23s/it] 3%|██ | 13/509 [02:31<1:32:50, 11.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:28:40,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:43,066 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:45,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5102, 'learning_rate': 
7.799999999999998e-06, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-02 22:28:48,501 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 14/509 [02:42<1:32:01, 11.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:28:51,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:54,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:28:56,784 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.416, 'learning_rate': 8.4e-06, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-02 22:28:59,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 15/509 [02:53<1:31:16, 11.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:29:02,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:04,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:07,492 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5563, 'learning_rate': 8.999999999999999e-06, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-02 22:29:10,108 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▌ | 16/509 [03:03<1:30:05, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:29:12,847 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:15,514 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:18,184 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4879, 'learning_rate': 9.6e-06, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-02 22:29:20,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 17/509 [03:14<1:29:17, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:29:23,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:26,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:28,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:31,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▊ | 18/509 [03:25<1:28:33, 10.82s/it] 4%|██▊ | 18/509 [03:25<1:28:33, 10.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:29:34,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:36,871 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:39,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:42,064 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.351, 'learning_rate': 1.0799999999999998e-05, 'epoch': 0.04} 4%|██▉ | 19/509 [03:35<1:27:46, 10.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 
22:29:44,783 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:47,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:49,934 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:52,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 20/509 [03:46<1:26:55, 10.67s/it] 4%|███▏ | 20/509 [03:46<1:26:55, 10.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:29:55,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:29:57,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:00,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4766, 'learning_rate': 1.1999999999999999e-05, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-02 22:30:03,054 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 21/509 [03:56<1:26:23, 10.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:30:05,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:08,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:10,841 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:13,378 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed {'loss': 4.422, 'learning_rate': 1.26e-05, 'epoch': 0.04} 4%|███▍ | 22/509 [04:07<1:25:29, 10.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:30:15,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:18,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:21,059 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4017, 'learning_rate': 1.3199999999999997e-05, 'epoch': 0.05} [WARNING|modeling_utils.py:388] 2022-03-02 22:30:23,596 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▌ | 23/509 [04:17<1:24:32, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:30:26,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:28,871 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:31,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.473, 'learning_rate': 1.3799999999999998e-05, 'epoch': 0.05} [WARNING|modeling_utils.py:388] 2022-03-02 22:30:33,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 24/509 [04:27<1:24:11, 10.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:30:36,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:39,075 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:41,547 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:44,578 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 25/509 [04:38<1:24:30, 10.48s/it] 5%|███▉ | 25/509 [04:38<1:24:30, 10.48s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:30:47,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:49,752 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:52,203 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 26/509 [04:48<1:23:28, 10.37s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:30:57,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:30:59,762 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:02,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 27/509 [04:58<1:22:16, 10.24s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:31:07,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:09,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:12,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 28/509 [05:08<1:21:11, 10.13s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:31:17,056 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:19,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:21,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 29/509 [05:18<1:20:15, 10.03s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:31:26,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:29,286 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:31,680 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 30/509 [05:27<1:19:29, 9.96s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:31:36,604 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:38,947 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:41,304 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 31/509 [05:37<1:18:24, 9.84s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:31:46,148 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:48,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:50,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 32/509 [05:46<1:17:25, 9.74s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:31:55,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:31:57,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:00,193 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████▏ | 33/509 [05:56<1:16:22, 9.63s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:04,901 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:07,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:09,495 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 34/509 [06:05<1:15:15, 9.51s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:14,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:16,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:18,527 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:20,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 35/509 [06:14<1:13:53, 9.35s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:23,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:25,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:27,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 36/509 [06:23<1:12:26, 9.19s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:31,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:33,968 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:36,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 37/509 [06:31<1:10:49, 9.00s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:40,338 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:42,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:44,565 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▉ | 38/509 [06:40<1:09:30, 8.85s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:48,769 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:50,760 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:52,769 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 39/509 [06:48<1:07:38, 8.63s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:32:56,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:32:58,752 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:00,685 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 40/509 [06:56<1:05:24, 8.37s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:04,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:06,298 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:08,065 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▍ | 41/509 [07:03<1:02:50, 8.06s/it]
22:33:04,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:13,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:33:11,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:15,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:33:11,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:15,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:33:11,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 42/509 [07:10<1:00:11, 7.73s/it]g-point operations will not be computed-02 22:33:11,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 42/509 [07:10<1:00:11, 7.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:33:18,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:20,196 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:33:18,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:20,196 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:33:18,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:33:23,330 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed
8%|██████▉ | 43/509 [07:17<57:13, 7.37s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:24,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|███████ | 44/509 [07:22<53:35, 6.92s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:30,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:33,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|███████▏ | 45/509 [07:28<49:44, 6.43s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:35,834 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|███████▍ | 46/509 [07:33<45:45, 5.93s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:40,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|███████▌ | 47/509 [07:37<41:36, 5.40s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:44,460 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|███████▋ | 48/509 [07:40<37:31, 4.88s/it]
[WARNING|modeling_utils.py:388] 2022-03-02
22:33:48,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:49,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|███████▉ | 49/509 [07:44<33:35, 4.38s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:51,140 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|████████ | 50/509 [07:47<30:39, 4.01s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:33:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:02,826 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|████████▏ | 51/509 [07:59<49:46, 6.52s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:08,930 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:14,799 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|████████▏ | 52/509 [08:11<1:01:54, 8.13s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:20,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:26,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
10%|████████▎ | 53/509 [08:23<1:10:11, 9.24s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:32,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:38,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|████████▍ | 54/509 [08:35<1:15:44, 9.99s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:44,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:49,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|████████▋ | 55/509 [08:46<1:18:50, 10.42s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:34:55,603 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:01,257 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|████████▊ | 56/509 [08:57<1:20:43, 10.69s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:06,925 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:12,464 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|████████▉ | 57/509 [09:09<1:21:46, 10.85s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:18,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:23,780 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|█████████ | 58/509 [09:20<1:22:37, 10.99s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:29,485 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:34,991 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▎ | 59/509 [09:31<1:22:45, 11.03s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:40,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:46,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▍ | 60/509 [09:42<1:22:42, 11.05s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:51,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:35:56,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▌ | 61/509 [09:53<1:22:10, 11.01s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:02,452 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:07,842 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▋ | 62/509 [10:04<1:21:26, 10.93s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:13,247 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:18,570 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▉ | 63/509 [10:15<1:21:02, 10.90s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:24,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:29,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████ | 64/509 [10:25<1:20:24, 10.84s/it]
[WARNING|modeling_utils.py:388]
2022-03-02 22:36:34,781 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:40,069 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████▏ | 65/509 [10:36<1:19:49, 10.79s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:45,369 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:50,591 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████▎ | 66/509 [10:46<1:18:59, 10.70s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:36:55,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:01,147 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████▌ | 67/509 [10:57<1:18:38, 10.67s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:06,442 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:11,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████▋ | 68/509 [11:07<1:17:55, 10.60s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:16,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:22,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
14%|██████████▊ | 69/509 [11:18<1:17:41, 10.59s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:27,462 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:32,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
14%|███████████ | 70/509 [11:29<1:17:15, 10.56s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:37,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:43,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
14%|███████████▏ | 71/509 [11:39<1:16:55, 10.54s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:48,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:53,546 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
14%|███████████▎ | 72/509 [11:49<1:16:25, 10.49s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:37:58,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:38:03,861 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed-02 22:37:58,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [12:00<1:15:50, 10.44s/it]g-point operations will not be computed-02 22:37:58,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [12:00<1:15:50, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [12:00<1:15:50, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [12:00<1:15:50, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 73/509 [12:00<1:15:50, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:10<1:15:25, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:10<1:15:25, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:10<1:15:25, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:10<1:15:25, 10.40s/it][WARNING|modeling_utils.py:388] 
2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 74/509 [12:10<1:15:25, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:21<1:15:32, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:21<1:15:32, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:21<1:15:32, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:21<1:15:32, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 75/509 [12:21<1:15:32, 10.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:31<1:14:41, 10.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:31<1:14:41, 10.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:31<1:14:41, 10.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:31<1:14:41, 10.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 76/509 [12:31<1:14:41, 10.35s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:41<1:13:31, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:41<1:13:31, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:41<1:13:31, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 77/509 [12:41<1:13:31, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▎ | 78/509 [12:51<1:12:40, 10.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▎ | 78/509 [12:51<1:12:40, 10.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2623, 'learning_rate': 4.56e-05, 'epoch': 0.15} 15%|████████████▎ | 78/509 [12:51<1:12:40, 10.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 15%|████████████▎ | 78/509 [12:51<1:12:40, 10.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [13:00<1:11:33, 9.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [13:00<1:11:33, 9.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1395, 'learning_rate': 4.62e-05, 'epoch': 0.15} 16%|████████████▍ | 79/509 [13:00<1:11:33, 9.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [13:00<1:11:33, 9.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▍ | 79/509 [13:00<1:11:33, 9.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▌ | 80/509 [13:10<1:10:19, 9.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▌ | 80/509 [13:10<1:10:19, 9.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▌ | 80/509 [13:10<1:10:19, 9.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
16%|████████████▌ | 80/509 [13:10<1:10:19, 9.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:19<1:09:36, 9.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:19<1:09:36, 9.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2962, 'learning_rate': 4.7399999999999993e-05, 'epoch': 0.16} 16%|████████████▋ | 81/509 [13:19<1:09:36, 9.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▋ | 81/509 [13:19<1:09:36, 9.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2672, 'learning_rate': 4.7999999999999994e-05, 'epoch': 0.16} 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3444, 'learning_rate': 4.8599999999999995e-05, 'epoch': 0.16} 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 82/509 [13:29<1:08:27, 9.62s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 84/509 [13:47<1:06:37, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 84/509 [13:47<1:06:37, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2708, 'learning_rate': 4.9199999999999997e-05, 'epoch': 0.16} 17%|█████████████▏ | 84/509 [13:47<1:06:37, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▏ | 84/509 [13:47<1:06:37, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:56<1:05:22, 9.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 
22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:56<1:05:22, 9.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1827, 'learning_rate': 4.98e-05, 'epoch': 0.17} 17%|█████████████▎ | 85/509 [13:56<1:05:22, 9.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:56<1:05:22, 9.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▎ | 85/509 [13:56<1:05:22, 9.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▌ | 86/509 [14:05<1:04:28, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▌ | 86/509 [14:05<1:04:28, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:40:18,052 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:40:18,052 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed 17%|█████████████▋ | 87/509 [14:13<1:03:21, 9.01s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:13<1:03:21, 9.01s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:13<1:03:21, 9.01s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:13<1:03:21, 9.01s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 87/509 [14:13<1:03:21, 9.01s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:22<1:02:01, 8.84s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:22<1:02:01, 8.84s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:22<1:02:01, 8.84s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:22<1:02:01, 8.84s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 88/509 [14:22<1:02:01, 8.84s/it]g-point 
operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▉ | 89/509 [14:30<1:00:36, 8.66s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▉ | 89/509 [14:30<1:00:36, 8.66s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▉ | 89/509 [14:30<1:00:36, 8.66s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▉ | 89/509 [14:30<1:00:36, 8.66s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▉ | 89/509 [14:30<1:00:36, 8.66s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 90/509 [14:38<58:43, 8.41s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 90/509 [14:38<58:43, 8.41s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:40:50,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:40:50,187 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:45<56:10, 8.06s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:45<56:10, 8.06s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:45<56:10, 8.06s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:45<56:10, 8.06s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▋ | 91/509 [14:45<56:10, 8.06s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▊ | 92/509 [14:52<53:15, 7.66s/it]g-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:01,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:01,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:01,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:38:09,116 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▉ | 93/509 [14:58<50:06, 7.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▉ | 93/509 [14:58<50:06, 7.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:10,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:10,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4964, 'learning_rate': 5.519999999999999e-05, 'epoch': 0.18} [WARNING|modeling_utils.py:388] 2022-03-02 22:41:14,386 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:14,386 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▎ | 95/509 [15:09<43:10, 
6.26s/it]g-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:18,051 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:20,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:20,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:22,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:24,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:24,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:26,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:28,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:28,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:30,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:30,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:33,012 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:34,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:41:34,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2287, 'learning_rate': 5.88e-05, 'epoch': 0.2}
[WARNING|modeling_utils.py:388] 2022-03-02 22:41:40,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
20%|████████████████ | 101/509 [15:40<43:46, 6.44s/it]
{'loss': 4.2066, 'learning_rate': 5.94e-05, 'epoch': 0.2}
20%|████████████████▏ | 102/509 [15:52<54:26, 8.03s/it]
{'loss': 4.2299, 'learning_rate': 5.9999999999999995e-05, 'epoch': 0.2}
20%|███████████████▉ | 103/509 [16:04<1:01:45, 9.13s/it]
20%|████████████████▏ | 104/509 [16:15<1:06:28, 9.85s/it]
{'loss': 4.0859, 'learning_rate': 6.12e-05, 'epoch': 0.2}
21%|████████████████▎ | 105/509 [16:26<1:09:10, 10.27s/it]
{'loss': 4.0669, 'learning_rate': 6.18e-05, 'epoch': 0.21}
21%|████████████████▍ | 106/509 [16:38<1:11:11, 10.60s/it]
{'loss': 4.0643, 'learning_rate': 6.239999999999999e-05, 'epoch': 0.21}
21%|████████████████▌ | 107/509 [16:49<1:12:19, 10.79s/it]
{'loss': 4.1299, 'learning_rate': 6.299999999999999e-05, 'epoch': 0.21}
21%|████████████████▊ | 108/509 [17:00<1:12:58, 10.92s/it]
{'loss': 4.1738, 'learning_rate': 6.359999999999999e-05, 'epoch': 0.21}
21%|████████████████▉ | 109/509 [17:11<1:12:59, 10.95s/it]
{'loss': 4.2388, 'learning_rate': 6.419999999999999e-05, 'epoch': 0.21}
22%|█████████████████ | 110/509 [17:22<1:13:08, 11.00s/it]
{'loss': 4.1986, 'learning_rate': 6.479999999999999e-05, 'epoch': 0.22}
22%|█████████████████▏ | 111/509 [17:33<1:12:49, 10.98s/it]
22%|█████████████████▍ | 112/509 [17:44<1:12:30, 10.96s/it]
22%|█████████████████▌ | 113/509 [17:55<1:12:10, 10.93s/it]
22%|█████████████████▋ | 114/509 [18:06<1:11:35, 10.87s/it]
{'loss': 4.1904, 'learning_rate': 6.72e-05, 'epoch': 0.22}
{'loss': 4.1663, 'learning_rate': 6.78e-05, 'epoch': 0.23}
23%|██████████████████ | 116/509 [18:27<1:10:39, 10.79s/it]
23%|██████████████████▏ | 117/509 [18:38<1:10:15, 10.75s/it]
{'loss': 4.2877, 'learning_rate': 6.9e-05, 'epoch': 0.23}
23%|██████████████████▎ | 118/509 [18:48<1:09:46, 10.71s/it]
{'loss': 4.2288, 'learning_rate': 6.96e-05, 'epoch': 0.23}
23%|██████████████████▍ | 119/509 [18:59<1:09:16, 10.66s/it]
{'loss': 4.1526, 'learning_rate': 7.02e-05, 'epoch': 0.23}
24%|██████████████████▌ | 120/509 [19:09<1:08:41, 10.60s/it]
24%|██████████████████▊ | 121/509 [19:20<1:08:12, 10.55s/it]
24%|██████████████████▉ | 122/509 [19:30<1:07:38, 10.49s/it]
24%|███████████████████ | 123/509 [19:41<1:07:06, 10.43s/it]
24%|███████████████████▏ | 124/509 [19:51<1:06:12, 10.32s/it]
25%|███████████████████▍ | 125/509 [20:01<1:06:18, 10.36s/it]
25%|███████████████████▌ | 126/509 [20:11<1:05:31, 10.26s/it]
25%|███████████████████▋ | 127/509 [20:21<1:04:34, 10.14s/it]
{'loss': 4.162, 'learning_rate': 7.56e-05, 'epoch': 0.25}
25%|████████████████████ | 129/509 [20:41<1:02:59, 9.95s/it]
{'loss': 4.1419, 'learning_rate': 7.62e-05, 'epoch': 0.25}
26%|████████████████████▏ | 130/509 [20:50<1:02:03, 9.82s/it]
{'loss': 4.338, 'learning_rate': 7.68e-05, 'epoch': 0.26}
{'loss': 4.1503, 'learning_rate': 7.74e-05, 'epoch': 0.26}
26%|████████████████████▍ | 132/509 [21:09<1:00:22, 9.61s/it]
{'loss': 4.163, 'learning_rate': 7.8e-05, 'epoch': 0.26}
26%|█████████████████████▏ | 133/509 [21:18<59:22, 9.47s/it]
{'loss': 4.1842, 'learning_rate': 7.86e-05, 'epoch': 0.26}
[WARNING|modeling_utils.py:388] 2022-03-02 22:47:33,724 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3043, 'learning_rate': 7.92e-05, 'epoch': 0.26}
27%|█████████████████████▍ | 135/509 [21:36<57:15, 9.18s/it]
27%|█████████████████████▋ | 136/509 [21:45<56:27, 9.08s/it]
{'loss': 4.2666, 'learning_rate': 8.04e-05, 'epoch': 0.27}
27%|█████████████████████▊ | 137/509 [21:53<55:09, 8.90s/it]
{'loss': 4.2247, 'learning_rate': 8.1e-05, 'epoch': 0.27}
27%|█████████████████████▉ | 138/509 [22:01<53:49, 8.70s/it]
{'loss': 4.2554, 'learning_rate': 8.16e-05, 'epoch': 0.27}
27%|██████████████████████ | 139/509 [22:09<52:18, 8.48s/it]
28%|██████████████████████▎ | 140/509 [22:17<50:27, 8.20s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:48:30,811 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3292, 'learning_rate': 8.34e-05, 'epoch': 0.28} [WARNING|modeling_utils.py:388] 2022-03-02 22:48:30,811 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:30,811 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:30,811 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:41:06,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:31<46:04, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:48:39,190 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:31<46:04, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:48:39,190 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:31<46:04, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:48:39,190 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▌ | 142/509 [22:31<46:04, 7.53s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:48:39,190 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▊ | 143/509 [22:37<43:29, 7.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 28%|██████████████████████▊ | 143/509 [22:37<43:29, 7.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:49,359 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:49,359 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2392, 'learning_rate': 8.519999999999998e-05, 'epoch': 0.28} [WARNING|modeling_utils.py:388] 2022-03-02 22:48:53,299 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:53,299 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|███████████████████████ | 145/509 [22:48<37:48, 6.23s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:56,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:59,157 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:48:59,157 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:01,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:03,157 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:03,157 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:05,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:05,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:06,639 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:09,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:09,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:11,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:11,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:12,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:12,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:19,183 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 22:49:19,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 151/509 [23:18<38:16, 6.41s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 151/509 [23:18<38:16, 6.41s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3844, 'learning_rate': 8.939999999999999e-05, 'epoch': 0.3} 30%|████████████████████████ | 151/509 [23:18<38:16, 6.41s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 151/509 [23:18<38:16, 6.41s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 151/509 [23:18<38:16, 6.41s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▏ | 152/509 [23:30<47:37, 8.00s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▏ | 152/509 [23:30<47:37, 8.00s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 30%|████████████████████████▏ | 152/509 [23:30<47:37, 8.00s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▏ | 152/509 [23:30<47:37, 8.00s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2272, 'learning_rate': 9.059999999999999e-05, 'epoch': 0.3} g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▌ | 154/509 [23:53<58:04, 9.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▌ | 154/509 [23:53<58:04, 9.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2883, 'learning_rate': 9.12e-05, 'epoch': 0.3} 30%|████████████████████████▌ | 154/509 [23:53<58:04, 9.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▌ | 154/509 [23:53<58:04, 9.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▌ | 154/509 [23:53<58:04, 9.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 155/509 [24:05<1:00:53, 10.32s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 155/509 [24:05<1:00:53, 10.32s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 155/509 [24:05<1:00:53, 10.32s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████ | 155/509 [24:05<1:00:53, 10.32s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▏ | 156/509 [24:16<1:02:35, 10.64s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▏ | 156/509 [24:16<1:02:35, 10.64s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1388, 'learning_rate': 9.24e-05, 'epoch': 0.31} 31%|████████████████████████▏ | 156/509 
[24:16<1:02:35, 10.64s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▏ | 156/509 [24:16<1:02:35, 10.64s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▏ | 156/509 [24:16<1:02:35, 10.64s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▎ | 157/509 [24:27<1:03:24, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▎ | 157/509 [24:27<1:03:24, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▎ | 157/509 [24:27<1:03:24, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▎ | 157/509 [24:27<1:03:24, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▌ | 158/509 [24:39<1:03:52, 10.92s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▌ | 158/509 [24:39<1:03:52, 10.92s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
{'loss': 4.32, 'learning_rate': 9.36e-05, 'epoch': 0.31} 31%|████████████████████████▌ | 158/509 [24:39<1:03:52, 10.92s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▌ | 158/509 [24:39<1:03:52, 10.92s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▋ | 159/509 [24:50<1:04:17, 11.02s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▋ | 159/509 [24:50<1:04:17, 11.02s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2029, 'learning_rate': 9.419999999999999e-05, 'epoch': 0.31} 31%|████████████████████████▋ | 159/509 [24:50<1:04:17, 11.02s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▋ | 159/509 [24:50<1:04:17, 11.02s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 160/509 [25:01<1:04:26, 11.08s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 160/509 [25:01<1:04:26, 11.08s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3667, 'learning_rate': 9.479999999999999e-05, 
'epoch': 0.31} 31%|████████████████████████▊ | 160/509 [25:01<1:04:26, 11.08s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 160/509 [25:01<1:04:26, 11.08s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 160/509 [25:01<1:04:26, 11.08s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 161/509 [25:12<1:04:12, 11.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 161/509 [25:12<1:04:12, 11.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 161/509 [25:12<1:04:12, 11.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 161/509 [25:12<1:04:12, 11.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 162/509 [25:23<1:03:39, 11.01s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 162/509 [25:23<1:03:39, 11.01s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed {'loss': 4.2337, 'learning_rate': 9.599999999999999e-05, 'epoch': 0.32} 32%|█████████████████████████▏ | 162/509 [25:23<1:03:39, 11.01s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 162/509 [25:23<1:03:39, 11.01s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 162/509 [25:23<1:03:39, 11.01s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 163/509 [25:34<1:03:21, 10.99s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 163/509 [25:34<1:03:21, 10.99s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 163/509 [25:34<1:03:21, 10.99s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 163/509 [25:34<1:03:21, 10.99s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 163/509 [25:34<1:03:21, 10.99s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▍ | 164/509 [25:45<1:02:53, 
10.94s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▍ | 164/509 [25:45<1:02:53, 10.94s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▍ | 164/509 [25:45<1:02:53, 10.94s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▍ | 164/509 [25:45<1:02:53, 10.94s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▌ | 165/509 [25:56<1:02:32, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▌ | 165/509 [25:56<1:02:32, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1561, 'learning_rate': 9.779999999999999e-05, 'epoch': 0.32} 32%|█████████████████████████▌ | 165/509 [25:56<1:02:32, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▌ | 165/509 [25:56<1:02:32, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▌ | 165/509 [25:56<1:02:32, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 33%|█████████████████████████▊ | 166/509 [26:06<1:02:09, 10.87s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▊ | 166/509 [26:06<1:02:09, 10.87s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▊ | 166/509 [26:06<1:02:09, 10.87s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▊ | 166/509 [26:06<1:02:09, 10.87s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▊ | 166/509 [26:06<1:02:09, 10.87s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▉ | 167/509 [26:17<1:01:39, 10.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▉ | 167/509 [26:17<1:01:39, 10.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▉ | 167/509 [26:17<1:01:39, 10.82s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▉ | 167/509 [26:17<1:01:39, 10.82s/it]g-point operations will not be computed-02 22:48:45,255 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed
33%|█████████████████████████▉ | 167/509 [26:17<1:01:39, 10.82s/it]
33%|██████████████████████████ | 168/509 [26:28<1:01:02, 10.74s/it]
33%|██████████████████████████▏ | 169/509 [26:38<1:00:32, 10.68s/it]
{'loss': 4.2635, 'learning_rate': 0.0001002, 'epoch': 0.33}
33%|███████████████████████████ | 170/509 [26:48<59:44, 10.57s/it]
{'loss': 4.2645, 'learning_rate': 0.0001008, 'epoch': 0.33}
34%|███████████████████████████▏ | 171/509 [26:59<59:04, 10.49s/it]
{'loss': 4.2651, 'learning_rate': 0.0001014, 'epoch': 0.34}
34%|███████████████████████████▎ | 172/509 [27:09<58:22, 10.39s/it]
{'loss': 4.1348, 'learning_rate': 0.000102, 'epoch': 0.34}
34%|███████████████████████████▌ | 173/509 [27:19<57:47, 10.32s/it]
{'loss': 4.1364, 'learning_rate': 0.0001026, 'epoch': 0.34}
34%|███████████████████████████▋ | 174/509 [27:29<57:12, 10.25s/it]
34%|███████████████████████████▊ | 175/509 [27:40<57:21, 10.30s/it]
35%|████████████████████████████ | 176/509 [27:49<56:23, 10.16s/it]
35%|████████████████████████████▏ | 177/509 [27:59<55:25, 10.02s/it]
{'loss': 4.1352, 'learning_rate': 0.00010499999999999999, 'epoch': 0.35}
35%|████████████████████████████▎ | 178/509 [28:09<54:44, 9.92s/it]
{'loss': 4.2304, 'learning_rate': 0.00010559999999999998, 'epoch': 0.35}
35%|████████████████████████████▍ | 179/509 [28:18<54:01, 9.82s/it]
35%|████████████████████████████▋ | 180/509 [28:28<53:05, 9.68s/it]
{'loss': 4.3465, 'learning_rate': 0.00010679999999999998, 'epoch': 0.35}
36%|████████████████████████████▊ | 181/509 [28:37<52:35, 9.62s/it]
{'loss': 4.1655, 'learning_rate': 0.00010739999999999998, 'epoch': 0.36}
36%|████████████████████████████▉ | 182/509 [28:47<51:56, 9.53s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:54:57,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
36%|█████████████████████████████ | 183/509 [28:56<50:58, 9.38s/it]
{'loss': 4.1614, 'learning_rate': 0.00010859999999999998, 'epoch': 0.36}
36%|█████████████████████████████▎ | 184/509 [29:04<50:02, 9.24s/it]
{'loss': 4.2064, 'learning_rate': 0.00010919999999999998, 'epoch': 0.36}
{'loss': 4.326, 'learning_rate': 0.00010979999999999999, 'epoch': 0.36}
[WARNING|modeling_utils.py:388] 2022-03-02 22:55:26,531 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
37%|█████████████████████████████▌ | 186/509 [29:22<48:25, 8.99s/it]
{'loss': 4.1589, 'learning_rate': 0.00011039999999999999, 'epoch': 0.36}
[WARNING|modeling_utils.py:388] 2022-03-02 22:55:37,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2655, 'learning_rate': 0.00011099999999999999, 'epoch': 0.37}
37%|█████████████████████████████▉ | 188/509 [29:39<46:03, 8.61s/it]
{'loss': 4.2708, 'learning_rate': 0.00011159999999999999, 'epoch': 0.37}
37%|██████████████████████████████ | 189/509 [29:46<44:48, 8.40s/it]
{'loss': 4.3677, 'learning_rate': 0.00011219999999999999, 'epoch': 0.37}
37%|██████████████████████████████▏ | 190/509 [29:54<43:17, 8.14s/it]
{'loss': 4.2505, 'learning_rate': 0.00011279999999999999, 'epoch': 0.37}
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:05,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|██████████████████████████████▍ | 191/509 [30:01<41:21, 7.80s/it]
{'loss': 4.4483, 'learning_rate': 0.00011339999999999999, 'epoch': 0.37}
38%|██████████████████████████████▌ | 192/509 [30:08<39:21, 7.45s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:17,427 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|██████████████████████████████▋ | 193/509 [30:14<37:07, 7.05s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:23,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:26,076 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3805, 'learning_rate': 0.0001152, 'epoch': 0.38}
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:30,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|███████████████████████████████ | 195/509 [30:25<32:28, 6.21s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:33,678 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:35,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:37,925 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:39,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:41,734 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:43,445 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:45,090 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:47,888 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:49,568 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6075, 'learning_rate': 0.0001188, 'epoch': 0.39}
[WARNING|modeling_utils.py:388] 2022-03-02 22:56:55,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
39%|███████████████████████████████▉ | 201/509 [30:55<32:51, 6.40s/it]
{'loss': 4.3892, 'learning_rate': 0.0001194, 'epoch': 0.39}
40%|████████████████████████████████▏ | 202/509 [31:07<40:58, 8.01s/it]
{'loss': 4.3396, 'learning_rate': 0.00011999999999999999, 'epoch': 0.4}
40%|████████████████████████████████▎ | 203/509 [31:18<46:14, 9.07s/it]
{'loss': 4.3076, 'learning_rate': 0.00012059999999999999, 'epoch': 0.4}
40%|████████████████████████████████▍ | 204/509 [31:30<49:41, 9.78s/it]
{'loss': 4.3174, 'learning_rate': 0.00012119999999999999, 'epoch': 0.4}
40%|████████████████████████████████▌ | 205/509 [31:41<51:50, 10.23s/it]
40%|████████████████████████████████▊ | 206/509 [31:53<53:20, 10.56s/it]
{'loss': 4.1905, 'learning_rate': 0.00012299999999999998, 'epoch': 0.41}
41%|█████████████████████████████████ | 208/509 [32:15<54:26, 10.85s/it]
41%|█████████████████████████████████▎ | 209/509 [32:26<54:33, 10.91s/it]
{'loss': 4.2853, 'learning_rate': 0.00012419999999999998, 'epoch': 0.41}
Could not estimate the number of tokens of the input,
floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:26<54:33, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4103, 'learning_rate': 0.00012479999999999997, 'epoch': 0.41} 41%|█████████████████████████████████▎ | 209/509 [32:26<54:33, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:26<54:33, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:26<54:33, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 209/509 [32:26<54:33, 10.91s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:48<54:17, 10.93s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:48<54:17, 10.93s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 211/509 [32:48<54:17, 10.93s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
41%|█████████████████████████████████▌ | 211/509 [32:48<54:17, 10.93s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▋ | 212/509 [32:59<53:52, 10.88s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▋ | 212/509 [32:59<53:52, 10.88s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2824, 'learning_rate': 0.00012599999999999997, 'epoch': 0.42} 42%|█████████████████████████████████▋ | 212/509 [32:59<53:52, 10.88s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▋ | 212/509 [32:59<53:52, 10.88s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 213/509 [33:09<53:31, 10.85s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 213/509 [33:09<53:31, 10.85s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3676, 'learning_rate': 0.0001266, 'epoch': 0.42} 42%|█████████████████████████████████▉ | 213/509 [33:09<53:31, 10.85s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
42%|█████████████████████████████████▉ | 213/509 [33:09<53:31, 10.85s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3831, 'learning_rate': 0.00012719999999999997, 'epoch': 0.42} 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2179, 'learning_rate': 0.0001278, 'epoch': 0.42} 
42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 214/509 [33:20<53:09, 10.81s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 216/509 [33:41<52:14, 10.70s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 216/509 [33:41<52:14, 10.70s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 216/509 [33:41<52:14, 10.70s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 216/509 [33:41<52:14, 10.70s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 216/509 [33:41<52:14, 10.70s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 217/509 [33:52<51:45, 10.63s/it]g-point operations will not be computed-02 22:48:45,255 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 217/509 [33:52<51:45, 10.63s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 217/509 [33:52<51:45, 10.63s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 217/509 [33:52<51:45, 10.63s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 217/509 [33:52<51:45, 10.63s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [34:02<51:13, 10.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [34:02<51:13, 10.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [34:02<51:13, 10.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 218/509 [34:02<51:13, 10.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
43%|██████████████████████████████████▋ | 218/509 [34:02<51:13, 10.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:12<50:43, 10.50s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:12<50:43, 10.50s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:12<50:43, 10.50s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:12<50:43, 10.50s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 219/509 [34:12<50:43, 10.50s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:23<50:20, 10.45s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:23<50:20, 10.45s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:23<50:20, 10.45s/it]g-point operations will not be computed-02 
22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:23<50:20, 10.45s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 220/509 [34:23<50:20, 10.45s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:33<49:44, 10.36s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:33<49:44, 10.36s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:33<49:44, 10.36s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:33<49:44, 10.36s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 221/509 [34:33<49:44, 10.36s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:43<49:11, 10.28s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
44%|███████████████████████████████████▎ | 222/509 [34:43<49:11, 10.28s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:43<49:11, 10.28s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:43<49:11, 10.28s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 222/509 [34:43<49:11, 10.28s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:53<48:47, 10.24s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:53<48:47, 10.24s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:53<48:47, 10.24s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 223/509 [34:53<48:47, 10.24s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 224/509 [35:03<48:18, 10.17s/it]g-point operations will not be computed-02 
22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 224/509 [35:03<48:18, 10.17s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3202, 'learning_rate': 0.00013319999999999999, 'epoch': 0.44} 44%|███████████████████████████████████▋ | 224/509 [35:03<48:18, 10.17s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 224/509 [35:03<48:18, 10.17s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:14<48:42, 10.29s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:14<48:42, 10.29s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4132, 'learning_rate': 0.0001338, 'epoch': 0.44} 44%|███████████████████████████████████▊ | 225/509 [35:14<48:42, 10.29s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:14<48:42, 10.29s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:14<48:42, 10.29s/it]g-point operations will not 
be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 225/509 [35:14<48:42, 10.29s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:24<48:03, 10.19s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:24<48:03, 10.19s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:24<48:03, 10.19s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 226/509 [35:24<48:03, 10.19s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 227/509 [35:33<47:18, 10.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 227/509 [35:33<47:18, 10.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4208, 'learning_rate': 0.000135, 'epoch': 0.45} 45%|████████████████████████████████████ | 227/509 [35:33<47:18, 10.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 227/509 [35:33<47:18, 10.07s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 228/509 [35:43<46:40, 9.97s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 228/509 [35:43<46:40, 9.97s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1902, 'learning_rate': 0.0001356, 'epoch': 0.45} 45%|████████████████████████████████████▎ | 228/509 [35:43<46:40, 9.97s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 228/509 [35:43<46:40, 9.97s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 229/509 [35:53<46:02, 9.86s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 229/509 [35:53<46:02, 9.86s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4058, 'learning_rate': 0.0001362, 'epoch': 0.45} 45%|████████████████████████████████████▍ | 229/509 [35:53<46:02, 9.86s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 229/509 [35:53<46:02, 9.86s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [36:02<45:27, 9.78s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [36:02<45:27, 9.78s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4925, 'learning_rate': 0.0001368, 'epoch': 0.45} 45%|████████████████████████████████████▌ | 230/509 [36:02<45:27, 9.78s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 230/509 [36:02<45:27, 9.78s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:12<44:47, 9.67s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:12<44:47, 9.67s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3905, 'learning_rate': 0.0001374, 'epoch': 0.45} 45%|████████████████████████████████████▊ | 231/509 [36:12<44:47, 9.67s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:12<44:47, 9.67s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 231/509 [36:12<44:47, 9.67s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:21<44:07, 9.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:21<44:07, 9.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:21<44:07, 9.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:21<44:07, 9.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 232/509 [36:21<44:07, 9.56s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 233/509 [36:30<43:40, 9.50s/it]g-point operations will not be computed-02 22:48:45,255 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 
233/509 [36:30<43:40, 9.50s/it]
46%|█████████████████████████████████████▏ | 234/509 [36:40<42:58, 9.38s/it]
{'loss': 4.1011, 'learning_rate': 0.0001392, 'epoch': 0.46}
46%|█████████████████████████████████████▍ | 235/509 [36:49<42:19, 9.27s/it]
46%|█████████████████████████████████████▌ | 236/509 [36:57<41:38, 9.15s/it]
47%|█████████████████████████████████████▋ | 237/509 [37:06<40:46, 8.99s/it]
{'loss': 4.3442, 'learning_rate': 0.00014099999999999998, 'epoch': 0.46}
47%|█████████████████████████████████████▊ | 238/509 [37:14<39:43, 8.80s/it]
{'loss': 4.1006, 'learning_rate': 0.00014159999999999997, 'epoch': 0.47}
47%|██████████████████████████████████████ | 239/509 [37:23<38:39, 8.59s/it]
{'loss': 4.4731, 'learning_rate': 0.0001422, 'epoch': 0.47}
47%|██████████████████████████████████████▏ | 240/509 [37:30<37:22, 8.34s/it]
{'loss': 4.3852, 'learning_rate': 0.00014279999999999997, 'epoch': 0.47}
47%|██████████████████████████████████████▎ | 241/509 [37:38<35:55, 8.04s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:03:47,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▌ | 242/509 [37:45<34:20, 7.72s/it]
{'loss': 4.3491, 'learning_rate': 0.00014399999999999998, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-03-02 23:03:56,168 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▋ | 243/509 [37:51<32:32, 7.34s/it]
{'loss': 4.382, 'learning_rate': 0.0001446, 'epoch': 0.48}
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:03,824 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3693, 'learning_rate': 0.00014519999999999998, 'epoch': 0.48}
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:08,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|██████████████████████████████████████▉ | 245/509 [38:03<28:42, 6.52s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:11,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:14,261 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3659, 'learning_rate': 0.00014639999999999998, 'epoch': 0.48}
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:17,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▎ | 247/509 [38:12<24:12, 5.55s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:19,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:21,577 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▍ | 248/509 [38:16<21:51, 5.02s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:23,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▌ | 249/509 [38:19<19:24, 4.48s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:26,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:27,697 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▊ | 250/509 [38:22<17:37, 4.08s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▉ | 251/509 [38:34<28:05, 6.53s/it]
50%|████████████████████████████████████████ | 252/509 [38:46<34:46, 8.12s/it]
{'loss': 4.4688, 'learning_rate': 0.00015, 'epoch': 0.49}
50%|████████████████████████████████████████▎ | 253/509 [38:58<39:09, 9.18s/it]
{'loss': 4.4845, 'learning_rate': 0.00015059999999999997, 'epoch': 0.5}
50%|████████████████████████████████████████▍ | 254/509 [39:10<42:14, 9.94s/it]
50%|████████████████████████████████████████▌ | 255/509 [39:21<44:02, 10.40s/it]
{'loss': 4.4156, 'learning_rate': 0.00015179999999999998, 'epoch': 0.5}
50%|████████████████████████████████████████▋ | 256/509 [39:32<45:10, 10.71s/it]
50%|████████████████████████████████████████▉ | 257/509 [39:44<45:49, 10.91s/it]
{'loss': 4.4527, 'learning_rate': 0.0001536, 'epoch': 0.51}
51%|█████████████████████████████████████████▏ | 259/509 [40:06<46:03, 11.06s/it]
51%|█████████████████████████████████████████▍ | 260/509 [40:17<46:00, 11.09s/it]
{'loss': 4.4085, 'learning_rate': 0.0001548, 'epoch': 0.51}
51%|█████████████████████████████████████████▌ | 261/509 [40:28<45:36, 11.03s/it]
51%|█████████████████████████████████████████▋ | 262/509 [40:39<45:24, 11.03s/it]
{'loss': 4.2291, 'learning_rate': 0.000156, 'epoch': 0.51}
{'loss': 4.3193, 'learning_rate': 0.00015659999999999998, 'epoch': 0.52}
52%|██████████████████████████████████████████ | 264/509 [41:01<44:50, 10.98s/it]
52%|██████████████████████████████████████████▏ | 265/509 [41:12<44:24, 10.92s/it]
{'loss': 4.4207, 'learning_rate': 0.0001578, 'epoch': 0.52}
52%|██████████████████████████████████████████▎ | 266/509 [41:23<44:07, 10.89s/it]
52%|██████████████████████████████████████████▍ | 267/509 [41:34<43:40, 10.83s/it]
{'loss': 4.3651, 'learning_rate': 0.0001596, 'epoch': 0.53}
53%|██████████████████████████████████████████▊ | 269/509 [41:55<42:54, 10.73s/it]
53%|██████████████████████████████████████████▉ | 270/509 [42:05<42:24, 10.65s/it]
53%|███████████████████████████████████████████▏ | 271/509 [42:16<41:50, 10.55s/it]
53%|███████████████████████████████████████████▎ | 272/509 [42:26<41:22, 10.48s/it]
54%|███████████████████████████████████████████▍ | 273/509 [42:36<40:55, 10.40s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 54%|███████████████████████████████████████████▍ | 273/509 [42:36<40:55, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:36<40:55, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:36<40:55, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▍ | 273/509 [42:36<40:55, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:46<40:22, 10.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:46<40:22, 10.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:46<40:22, 10.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 274/509 [42:46<40:22, 10.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
54%|███████████████████████████████████████████▊ | 275/509 [42:57<40:37, 10.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:57<40:37, 10.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4815, 'learning_rate': 0.0001638, 'epoch': 0.54} 54%|███████████████████████████████████████████▊ | 275/509 [42:57<40:37, 10.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:57<40:37, 10.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 275/509 [42:57<40:37, 10.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 276/509 [43:07<40:01, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 276/509 [43:07<40:01, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 276/509 [43:07<40:01, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 54%|███████████████████████████████████████████▉ | 276/509 [43:07<40:01, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6488, 'learning_rate': 0.000165, 'epoch': 0.54} 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4677, 'learning_rate': 0.0001656, 'epoch': 0.55} 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 277/509 [43:17<39:28, 10.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:37<38:25, 10.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:37<38:25, 10.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:37<38:25, 10.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:37<38:25, 10.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 279/509 [43:37<38:25, 10.02s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:46<37:46, 9.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:46<37:46, 9.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:46<37:46, 9.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 280/509 [43:46<37:46, 9.90s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:56<37:17, 9.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:56<37:17, 9.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4123, 'learning_rate': 0.0001674, 'epoch': 0.55} 55%|████████████████████████████████████████████▋ | 281/509 [43:56<37:17, 9.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:56<37:17, 9.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 281/509 [43:56<37:17, 9.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [44:05<36:36, 9.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [44:05<36:36, 9.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [44:05<36:36, 9.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▉ | 282/509 [44:05<36:36, 9.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:15<35:56, 9.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:15<35:56, 9.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5083, 'learning_rate': 0.0001686, 'epoch': 0.56} 56%|█████████████████████████████████████████████ | 283/509 [44:15<35:56, 9.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:15<35:56, 9.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 283/509 [44:15<35:56, 9.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:24<35:18, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:24<35:18, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:24<35:18, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 284/509 [44:24<35:18, 9.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:33<34:41, 9.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:33<34:41, 9.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4569, 'learning_rate': 0.00016979999999999998, 'epoch': 0.56} 56%|█████████████████████████████████████████████▎ | 285/509 [44:33<34:41, 9.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 285/509 [44:33<34:41, 9.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:41<34:00, 9.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:41<34:00, 9.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4092, 'learning_rate': 0.00017039999999999997, 'epoch': 0.56} 56%|█████████████████████████████████████████████▌ | 286/509 [44:41<34:00, 9.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 286/509 [44:41<34:00, 9.15s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:50<33:16, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:50<33:16, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4716, 'learning_rate': 0.00017099999999999998, 'epoch': 0.56} 56%|█████████████████████████████████████████████▋ | 287/509 
[44:50<33:16, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:50<33:16, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 287/509 [44:50<33:16, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:58<32:24, 8.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:58<32:24, 8.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:58<32:24, 8.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:58<32:24, 8.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 288/509 [44:58<32:24, 8.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [45:06<31:19, 
8.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [45:06<31:19, 8.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [45:06<31:19, 8.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [45:06<31:19, 8.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 289/509 [45:06<31:19, 8.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▏ | 290/509 [45:14<29:58, 8.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▏ | 290/509 [45:14<29:58, 8.21s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:04:32,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:25,846 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▎ | 291/509 [45:21<28:33,
7.86s/it] 57%|██████████████████████████████████████████████▎ | 291/509 [45:21<28:33, 7.86s/it] {'loss': 4.2536, 'learning_rate': 0.00017339999999999996, 'epoch': 0.57} 57%|██████████████████████████████████████████████▎ | 291/509 [45:21<28:33, 7.86s/it] [WARNING|modeling_utils.py:388] 2022-03-02 23:11:34,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:34,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5786, 'learning_rate': 0.00017399999999999997, 'epoch': 0.57} [WARNING|modeling_utils.py:388] 2022-03-02 23:11:34,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:34,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:34,188 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 293/509 [45:34<25:27, 7.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:11:41,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 293/509 [45:34<25:27, 7.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:11:41,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 293/509 [45:34<25:27, 7.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:11:41,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:45,822 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:45,822 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:49,536 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:49,536 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
58%|██████████████████████████████████████████████▉ | 295/509 [45:44<21:43, 6.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:11:51,892 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:53,983 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:53,983 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|███████████████████████████████████████████████ | 296/509 [45:48<19:44, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:11:56,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:58,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:11:58,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|███████████████████████████████████████████████▎ | 297/509 [45:52<17:54, 5.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:11:59,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:12:01,588 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:12:01,588 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▍ | 298/509 [45:56<16:07, 4.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:03,253 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▍ | 298/509 [45:56<16:07, 4.59s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:03,253 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▌ | 299/509 [45:59<14:21, 4.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:06,144 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:12:07,345 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:12:07,345 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 300/509 [46:02<13:16, 3.81s/it]
59%|███████████████████████████████████████████████▋ | 300/509 [46:02<13:16, 3.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 300/509 [46:02<13:16, 3.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 300/509 [46:02<13:16, 3.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 301/509 [46:14<21:53, 6.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 301/509 [46:14<21:53, 6.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5427, 'learning_rate': 0.00017939999999999997, 'epoch': 0.59} 59%|███████████████████████████████████████████████▉ | 301/509 [46:14<21:53, 6.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 301/509 [46:14<21:53, 6.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 301/509 [46:14<21:53, 6.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed 59%|████████████████████████████████████████████████ | 302/509 [46:26<27:13, 7.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.543, 'learning_rate': 0.00018059999999999997, 'epoch': 0.59} 59%|████████████████████████████████████████████████ | 302/509 [46:26<27:13,
7.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▍ | 304/509 [46:48<33:10, 9.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4208, 'learning_rate': 0.00018119999999999999, 'epoch': 0.6} 60%|████████████████████████████████████████████████▌ | 305/509 [47:00<34:46, 10.23s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2905,
'learning_rate': 0.00018179999999999997, 'epoch': 0.6} 60%|████████████████████████████████████████████████▋ | 306/509 [47:11<35:39, 10.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed 60%|████████████████████████████████████████████████▊ | 307/509 [47:22<36:16, 10.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████ | 308/509 [47:34<36:25, 10.87s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed 61%|█████████████████████████████████████████████████▏ | 309/509 [47:45<36:31, 10.96s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3185, 'learning_rate': 0.00018419999999999998, 'epoch': 0.61} 61%|█████████████████████████████████████████████████▎ | 310/509 [47:56<36:27, 10.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4775, 'learning_rate': 0.0001848, 'epoch': 0.61} 61%|█████████████████████████████████████████████████▎ | 310/509 [47:56<36:27,
10.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3612, 'learning_rate': 0.00018539999999999998, 'epoch': 0.61} 61%|█████████████████████████████████████████████████▋ | 312/509 [48:18<35:53, 10.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.591, 'learning_rate': 0.000186, 'epoch': 0.61} 61%|█████████████████████████████████████████████████▊ | 313/509 [48:28<35:40, 10.92s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4811, 'learning_rate': 0.00018659999999999998, 'epoch': 0.61} 62%|█████████████████████████████████████████████████▉ | 314/509 [48:39<35:21, 10.88s/it][WARNING|modeling_utils.py:388]
2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.369, 'learning_rate': 0.0001872, 'epoch': 0.62} 62%|██████████████████████████████████████████████████▏ | 315/509 [48:50<35:02, 10.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.611, 'learning_rate': 0.00018779999999999998, 'epoch': 0.62} 62%|██████████████████████████████████████████████████▏ | 315/509 [48:50<35:02, 10.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 62%|██████████████████████████████████████████████████▎ | 316/509 [49:01<34:34, 10.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 317/509 [49:11<34:16, 10.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 62%|██████████████████████████████████████████████████▌ | 318/509 [49:22<33:57, 10.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 63%|██████████████████████████████████████████████████▊ | 319/509 [49:32<33:35, 10.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▉ | 320/509 [49:43<33:13, 10.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.79, 'learning_rate': 0.00019079999999999998, 'epoch': 0.63} 63%|██████████████████████████████████████████████████▉ | 320/509 [49:43<33:13, 10.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 321/509 [49:53<32:46, 10.46s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▏ | 322/509 [50:03<32:24, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▍ | 323/509 [50:13<32:07, 10.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▌ | 324/509 [50:24<31:43, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >>
Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7002, 'learning_rate': 0.00019319999999999998, 'epoch': 0.64} 64%|███████████████████████████████████████████████████▋ | 325/509 [50:34<31:51, 10.39s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6622, 'learning_rate': 0.0001938, 'epoch': 0.64}
64%|███████████████████████████████████████████████████▉ | 326/509 [50:44<31:24, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.502, 'learning_rate': 0.00019439999999999998, 'epoch': 0.64} 64%|████████████████████████████████████████████████████ | 327/509 [50:54<30:49, 10.16s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████▏ | 328/509 [51:04<30:11, 10.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 329/509 [51:14<29:50, 9.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could
not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 330/509 [51:23<29:21, 9.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5706, 'learning_rate': 0.00019679999999999999, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▌ | 330/509
[51:23<29:21, 9.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:33<28:52, 9.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:33<28:52, 9.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:33<28:52, 9.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 331/509 [51:33<28:52, 9.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:42<28:25, 9.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 332/509 [51:42<28:25, 9.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4544, 'learning_rate': 0.000198, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▊ | 332/509 [51:42<28:25, 9.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
65%|████████████████████████████████████████████████████▊ | 332/509 [51:42<28:25, 9.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 333/509 [51:51<27:54, 9.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 333/509 [51:51<27:54, 9.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.489, 'learning_rate': 0.0001986, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▉ | 333/509 [51:51<27:54, 9.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 333/509 [51:51<27:54, 9.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [52:00<27:20, 9.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [52:00<27:20, 9.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4184, 'learning_rate': 0.0001992, 'epoch': 0.66} 66%|█████████████████████████████████████████████████████▏ | 334/509 [52:00<27:20, 
9.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [52:00<27:20, 9.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 334/509 [52:00<27:20, 9.37s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [52:09<26:47, 9.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [52:09<26:47, 9.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [52:09<26:47, 9.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [52:09<26:47, 9.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 335/509 [52:09<26:47, 9.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 
336/509 [52:18<26:20, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:18<26:20, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:18<26:20, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:18<26:20, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 336/509 [52:18<26:20, 9.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:27<25:48, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:27<25:48, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:27<25:48, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
66%|█████████████████████████████████████████████████████▋ | 337/509 [52:27<25:48, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 337/509 [52:27<25:48, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:35<25:15, 8.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:35<25:15, 8.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:35<25:15, 8.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:35<25:15, 8.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 338/509 [52:35<25:15, 8.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:43<24:29, 8.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:43<24:29, 8.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:43<24:29, 8.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:43<24:29, 8.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|█████████████████████████████████████████████████████▉ | 339/509 [52:43<24:29, 8.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:51<23:43, 8.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:51<23:43, 8.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:51<23:43, 8.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 340/509 [52:51<23:43, 8.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed
67%|██████████████████████████████████████████████████████▎ | 341/509 [52:59<22:41, 8.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:12:11,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:19:08,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
67%|██████████████████████████████████████████████████████▍ | 342/509 [53:06<21:32, 7.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:17,118 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
67%|██████████████████████████████████████████████████████▌ | 343/509 [53:12<20:12, 7.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:23,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|██████████████████████████████████████████████████████▋ | 344/509 [53:18<18:48, 6.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:27,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:19:29,556 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8199, 'learning_rate': 0.0002058, 'epoch': 0.68}
[WARNING|modeling_utils.py:388] 2022-03-02 23:19:33,132 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|███████████████████████████████████████████████████████ | 346/509 [53:28<15:52, 5.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:35,374 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:19:37,382 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|███████████████████████████████████████████████████████▏ | 347/509 [53:32<14:22, 5.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:39,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:19:41,131 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|███████████████████████████████████████████████████████▍ | 348/509 [53:35<12:54, 4.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:42,871 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|███████████████████████████████████████████████████████▌ | 349/509 [53:38<11:28, 4.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:45,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:19:47,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|███████████████████████████████████████████████████████▋ | 350/509 [53:42<10:26, 3.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|███████████████████████████████████████████████████████▊ | 351/509 [53:54<17:08, 6.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|████████████████████████████████████████████████████████ | 352/509 [54:06<21:08, 8.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|████████████████████████████████████████████████████████▏ | 353/509 [54:17<23:39, 9.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
70%|████████████████████████████████████████████████████████▎ | 354/509 [54:29<25:17, 9.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
70%|████████████████████████████████████████████████████████▍ | 355/509 [54:40<26:28, 10.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5034, 'learning_rate': 0.00021179999999999997, 'epoch': 0.7}
70%|████████████████████████████████████████████████████████▋ | 356/509 [54:51<27:03, 10.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
70%|████████████████████████████████████████████████████████▊ | 357/509 [55:03<27:24, 10.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6016, 'learning_rate': 0.00021299999999999997, 'epoch': 0.7}
70%|████████████████████████████████████████████████████████▉ | 358/509 [55:14<27:33, 10.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3612, 'learning_rate': 0.00021359999999999996, 'epoch': 0.7}
71%|█████████████████████████████████████████████████████████▏ | 359/509 [55:25<27:26, 10.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6466, 'learning_rate': 0.00021419999999999998, 'epoch': 0.7}
71%|█████████████████████████████████████████████████████████▎ | 360/509 [55:36<27:26, 11.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6199, 'learning_rate': 0.00021479999999999996, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▍ | 361/509 [55:47<27:17, 11.07s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3133, 'learning_rate': 0.00021539999999999998, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▌ | 362/509 [55:58<27:00, 11.03s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5717, 'learning_rate': 0.00021599999999999996, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▊ | 363/509 [56:09<26:44, 10.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7201, 'learning_rate': 0.00021659999999999998, 'epoch': 0.71}
72%|█████████████████████████████████████████████████████████▉ | 364/509 [56:20<26:24, 10.93s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2984, 'learning_rate': 0.00021719999999999997, 'epoch': 0.71}
72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02
23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6171, 'learning_rate': 0.00021779999999999998, 'epoch': 0.72} 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4188, 'learning_rate': 0.00021839999999999997, 'epoch': 0.72} 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████ | 365/509 [56:31<26:07, 10.89s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:52<25:33, 10.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:52<25:33, 10.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:52<25:33, 10.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:52<25:33, 10.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 367/509 [56:52<25:33, 10.80s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [57:03<25:12, 
10.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [57:03<25:12, 10.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [57:03<25:12, 10.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [57:03<25:12, 10.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▌ | 368/509 [57:03<25:12, 10.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [57:13<24:53, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [57:13<24:53, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [57:13<24:53, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
72%|██████████████████████████████████████████████████████████▋ | 369/509 [57:13<24:53, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▋ | 369/509 [57:13<24:53, 10.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▉ | 370/509 [57:24<24:33, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▉ | 370/509 [57:24<24:33, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▉ | 370/509 [57:24<24:33, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▉ | 370/509 [57:24<24:33, 10.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████ | 371/509 [57:34<24:16, 10.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████ | 371/509 [57:34<24:16, 10.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed {'loss': 4.4769, 'learning_rate': 0.0002214, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████ | 371/509 [57:34<24:16, 10.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████ | 371/509 [57:34<24:16, 10.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:44<23:53, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:44<23:53, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2707, 'learning_rate': 0.00022199999999999998, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:44<23:53, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 372/509 [57:44<23:53, 10.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:55<23:34, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:55<23:34, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6927, 'learning_rate': 0.0002226, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:55<23:34, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▎ | 373/509 [57:55<23:34, 10.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 374/509 [58:05<23:11, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 374/509 [58:05<23:11, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4139, 'learning_rate': 0.00022319999999999998, 'epoch': 0.73} 73%|███████████████████████████████████████████████████████████▌ | 374/509 [58:05<23:11, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▌ | 374/509 [58:05<23:11, 10.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:15<23:14, 10.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:15<23:14, 10.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5554, 'learning_rate': 0.0002238, 'epoch': 0.74} 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:15<23:14, 10.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 375/509 [58:15<23:14, 10.41s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▊ | 376/509 [58:25<22:48, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▊ | 376/509 [58:25<22:48, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5681, 'learning_rate': 0.00022439999999999998, 'epoch': 0.74} 74%|███████████████████████████████████████████████████████████▊ | 376/509 [58:25<22:48, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
74%|███████████████████████████████████████████████████████████▊ | 376/509 [58:25<22:48, 10.29s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:35<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:35<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4412, 'learning_rate': 0.000225, 'epoch': 0.74} 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:35<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:35<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 377/509 [58:35<22:23, 10.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:46<22:27, 10.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:46<22:27, 
10.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:46<22:27, 10.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:46<22:27, 10.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 378/509 [58:46<22:27, 10.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:56<21:53, 10.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:56<21:53, 10.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:56<21:53, 10.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▎ | 379/509 [58:56<21:53, 10.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
75%|████████████████████████████████████████████████████████████▍ | 380/509 [59:05<21:28, 9.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▍ | 380/509 [59:05<21:28, 9.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5773, 'learning_rate': 0.00022679999999999998, 'epoch': 0.75} 75%|████████████████████████████████████████████████████████████▍ | 380/509 [59:05<21:28, 9.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▍ | 380/509 [59:05<21:28, 9.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▍ | 380/509 [59:05<21:28, 9.99s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [59:15<21:01, 9.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [59:15<21:01, 9.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [59:15<21:01, 
9.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [59:15<21:01, 9.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 381/509 [59:15<21:01, 9.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:24<20:40, 9.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:24<20:40, 9.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:24<20:40, 9.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▊ | 382/509 [59:24<20:40, 9.77s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:34<20:17, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:34<20:17, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7158, 'learning_rate': 0.00022859999999999997, 'epoch': 0.75} 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:34<20:17, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:34<20:17, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 383/509 [59:34<20:17, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:43<19:56, 9.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:43<19:56, 9.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:43<19:56, 9.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|█████████████████████████████████████████████████████████████ | 384/509 [59:43<19:56, 
9.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:52<19:36, 9.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:52<19:36, 9.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6043, 'learning_rate': 0.00022979999999999997, 'epoch': 0.76} 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:52<19:36, 9.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:52<19:36, 9.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▎ | 385/509 [59:52<19:36, 9.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▉ | 386/509 [1:00:01<19:05, 9.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▉ | 386/509 [1:00:01<19:05, 9.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▉ | 386/509 [1:00:01<19:05, 9.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▉ | 386/509 [1:00:01<19:05, 9.32s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████ | 387/509 [1:00:10<18:38, 9.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████ | 387/509 [1:00:10<18:38, 9.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4141, 'learning_rate': 0.00023099999999999998, 'epoch': 0.76} 76%|████████████████████████████████████████████████████████████ | 387/509 [1:00:10<18:38, 9.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████ | 387/509 [1:00:10<18:38, 9.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:19<18:08, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:19<18:08, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4335, 'learning_rate': 0.0002316, 'epoch': 0.76} 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:19<18:08, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▏ | 388/509 [1:00:19<18:08, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:27<17:37, 8.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:27<17:37, 8.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.469, 'learning_rate': 0.00023219999999999998, 'epoch': 0.76} 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:27<17:37, 8.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 389/509 [1:00:27<17:37, 8.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
77%|████████████████████████████████████████████████████████████▌ | 390/509 [1:00:35<17:00, 8.57s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5875, 'learning_rate': 0.0002328, 'epoch': 0.77}
77%|████████████████████████████████████████████████████████████▋ | 391/509 [1:00:43<16:19, 8.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:19:51,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3555, 'learning_rate': 0.00023339999999999998, 'epoch': 0.77}
77%|████████████████████████████████████████████████████████████▊ | 392/509 [1:00:50<15:32, 7.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:26:58,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
77%|████████████████████████████████████████████████████████████▉ | 393/509 [1:00:57<14:35, 7.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:26:58,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:06,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
77%|█████████████████████████████████████████████████████████████▏ | 394/509 [1:01:03<13:43, 7.16s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:12,442 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:15,089 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5762, 'learning_rate': 0.00023579999999999999, 'epoch': 0.77}
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:18,801 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
78%|█████████████████████████████████████████████████████████████▍ | 396/509 [1:01:13<11:31, 6.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.475, 'learning_rate': 0.0002364, 'epoch': 0.78}
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:24,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:26,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:28,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:29,999 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:32,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:34,530 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4873, 'learning_rate': 0.0002388, 'epoch': 0.78}
[WARNING|modeling_utils.py:388] 2022-03-02 23:27:40,991 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
79%|██████████████████████████████████████████████████████████████▏ | 401/509 [1:01:40<11:54, 6.62s/it] {'loss': 4.4953, 'learning_rate': 0.0002394, 'epoch': 0.79}
79%|██████████████████████████████████████████████████████████████▍ | 402/509 [1:01:52<14:33, 8.17s/it]
79%|██████████████████████████████████████████████████████████████▌ | 403/509 [1:02:04<16:16, 9.21s/it] {'loss': 4.3974, 'learning_rate': 0.0002406, 'epoch': 0.79}
79%|██████████████████████████████████████████████████████████████▋ | 404/509 [1:02:15<17:23, 9.94s/it]
80%|██████████████████████████████████████████████████████████████▊ | 405/509 [1:02:27<18:05, 10.43s/it] {'loss': 4.6551, 'learning_rate': 0.00024239999999999998, 'epoch': 0.8}
80%|███████████████████████████████████████████████████████████████▏ | 407/509 [1:02:50<18:31, 10.89s/it] {'loss': 4.6615, 'learning_rate': 0.000243, 'epoch': 0.8}
80%|███████████████████████████████████████████████████████████████▎ | 408/509 [1:03:01<18:30, 10.99s/it] {'loss': 4.5641, 'learning_rate': 0.00024359999999999999, 'epoch': 0.8}
80%|███████████████████████████████████████████████████████████████▍ | 409/509 [1:03:12<18:25, 11.05s/it] {'loss': 4.5003, 'learning_rate': 0.00024419999999999997, 'epoch': 0.8}
81%|███████████████████████████████████████████████████████████████▋ | 410/509 [1:03:23<18:15, 11.07s/it] {'loss': 4.5603, 'learning_rate': 0.0002448, 'epoch': 0.8}
81%|███████████████████████████████████████████████████████████████▊ | 411/509 [1:03:34<17:59, 11.01s/it]
81%|███████████████████████████████████████████████████████████████▉ | 412/509 [1:03:45<17:44, 10.98s/it]
81%|████████████████████████████████████████████████████████████████ | 413/509 [1:03:56<17:28, 10.93s/it]
81%|████████████████████████████████████████████████████████████████▎ | 414/509 [1:04:07<17:12, 10.86s/it] {'loss': 4.4854, 'learning_rate': 0.0002472, 'epoch': 0.81}
82%|████████████████████████████████████████████████████████████████▍ | 415/509 [1:04:17<16:56, 10.81s/it]
82%|████████████████████████████████████████████████████████████████▌ | 416/509 [1:04:28<16:42, 10.78s/it]
82%|████████████████████████████████████████████████████████████████▋ | 417/509 [1:04:39<16:27, 10.74s/it]
82%|████████████████████████████████████████████████████████████████▉ | 418/509 [1:04:49<16:12, 10.69s/it] {'loss': 4.5617, 'learning_rate': 0.00024959999999999994, 'epoch': 0.82}
82%|████████████████████████████████████████████████████████████████▉ | 418/509 [1:04:49<16:12, 10.69s/it] {'loss': 4.8485, 'learning_rate': 0.00025019999999999996, 'epoch': 0.82}
83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4427, 'learning_rate': 0.00025079999999999997, 'epoch': 0.82} 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3531, 'learning_rate': 0.0002514, 'epoch': 0.83} 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 83%|█████████████████████████████████████████████████████████████████▏ | 420/509 [1:05:10<15:41, 10.58s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▍ | 422/509 [1:05:31<15:10, 10.47s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▍ | 422/509 [1:05:31<15:10, 10.47s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4535, 'learning_rate': 0.00025199999999999995, 'epoch': 0.83} 83%|█████████████████████████████████████████████████████████████████▍ | 422/509 [1:05:31<15:10, 10.47s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▍ | 422/509 [1:05:31<15:10, 10.47s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▋ | 423/509 [1:05:41<14:54, 10.41s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▋ | 423/509 [1:05:41<14:54, 10.41s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5116, 'learning_rate': 
0.00025259999999999996, 'epoch': 0.83} 83%|█████████████████████████████████████████████████████████████████▋ | 423/509 [1:05:41<14:54, 10.41s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▋ | 423/509 [1:05:41<14:54, 10.41s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2711, 'learning_rate': 0.0002532, 'epoch': 0.83} 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4944, 'learning_rate': 0.0002538, 'epoch': 0.83} 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|█████████████████████████████████████████████████████████████████▊ | 424/509 [1:05:51<14:35, 10.30s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████ | 426/509 [1:06:12<14:20, 10.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████ | 426/509 [1:06:12<14:20, 10.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4421, 'learning_rate': 0.00025439999999999995, 'epoch': 0.84} 84%|██████████████████████████████████████████████████████████████████ | 426/509 [1:06:12<14:20, 10.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 84%|██████████████████████████████████████████████████████████████████ | 426/509 [1:06:12<14:20, 10.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▎ | 427/509 [1:06:22<13:58, 10.22s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▎ | 427/509 [1:06:22<13:58, 10.22s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4307, 'learning_rate': 0.00025499999999999996, 'epoch': 0.84} 84%|██████████████████████████████████████████████████████████████████▎ | 427/509 [1:06:22<13:58, 10.22s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▎ | 427/509 [1:06:22<13:58, 10.22s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4475, 'learning_rate': 
0.0002556, 'epoch': 0.84} 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4769, 'learning_rate': 0.0002562, 'epoch': 0.84} 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▍ | 428/509 [1:06:32<13:36, 10.08s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
84%|██████████████████████████████████████████████████████████████████▋ | 430/509 [1:06:51<12:59, 9.87s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▋ | 430/509 [1:06:51<12:59, 9.87s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4074, 'learning_rate': 0.00025679999999999995, 'epoch': 0.84} 84%|██████████████████████████████████████████████████████████████████▋ | 430/509 [1:06:51<12:59, 9.87s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▋ | 430/509 [1:06:51<12:59, 9.87s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 84%|██████████████████████████████████████████████████████████████████▋ | 430/509 [1:06:51<12:59, 9.87s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|██████████████████████████████████████████████████████████████████▉ | 431/509 [1:07:01<12:42, 9.78s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|██████████████████████████████████████████████████████████████████▉ | 431/509 [1:07:01<12:42, 9.78s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
85%|██████████████████████████████████████████████████████████████████▉ | 431/509 [1:07:01<12:42, 9.78s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|██████████████████████████████████████████████████████████████████▉ | 431/509 [1:07:01<12:42, 9.78s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|██████████████████████████████████████████████████████████████████▉ | 431/509 [1:07:01<12:42, 9.78s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████ | 432/509 [1:07:10<12:27, 9.71s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████ | 432/509 [1:07:10<12:27, 9.71s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████ | 432/509 [1:07:10<12:27, 9.71s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████ | 432/509 [1:07:10<12:27, 9.71s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▏ | 433/509 [1:07:20<12:09, 9.60s/it]g-point operations 
will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▏ | 433/509 [1:07:20<12:09, 9.60s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4432, 'learning_rate': 0.0002586, 'epoch': 0.85} 85%|███████████████████████████████████████████████████████████████████▏ | 433/509 [1:07:20<12:09, 9.60s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▏ | 433/509 [1:07:20<12:09, 9.60s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▎ | 434/509 [1:07:29<11:48, 9.45s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▎ | 434/509 [1:07:29<11:48, 9.45s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5023, 'learning_rate': 0.00025919999999999996, 'epoch': 0.85} 85%|███████████████████████████████████████████████████████████████████▎ | 434/509 [1:07:29<11:48, 9.45s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▎ | 434/509 [1:07:29<11:48, 9.45s/it]g-point 
operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▌ | 435/509 [1:07:38<11:29, 9.32s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▌ | 435/509 [1:07:38<11:29, 9.32s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4265, 'learning_rate': 0.00025979999999999997, 'epoch': 0.85} 85%|███████████████████████████████████████████████████████████████████▌ | 435/509 [1:07:38<11:29, 9.32s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▌ | 435/509 [1:07:38<11:29, 9.32s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▋ | 436/509 [1:07:46<11:08, 9.16s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▋ | 436/509 [1:07:46<11:08, 9.16s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3814, 'learning_rate': 0.0002604, 'epoch': 0.86} 86%|███████████████████████████████████████████████████████████████████▋ | 436/509 [1:07:46<11:08, 
9.16s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▋ | 436/509 [1:07:46<11:08, 9.16s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 437/509 [1:07:55<10:47, 8.99s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 437/509 [1:07:55<10:47, 8.99s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7153, 'learning_rate': 0.000261, 'epoch': 0.86} 86%|███████████████████████████████████████████████████████████████████▊ | 437/509 [1:07:55<10:47, 8.99s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 437/509 [1:07:55<10:47, 8.99s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 437/509 [1:07:55<10:47, 8.99s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 438/509 [1:08:03<10:25, 8.81s/it]g-point operations will not be computed-02 23:27:21,197 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 438/509 [1:08:03<10:25, 8.81s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 438/509 [1:08:03<10:25, 8.81s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 438/509 [1:08:03<10:25, 8.81s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 438/509 [1:08:03<10:25, 8.81s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 439/509 [1:08:12<10:02, 8.61s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 439/509 [1:08:12<10:02, 8.61s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 439/509 [1:08:12<10:02, 8.61s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
86%|████████████████████████████████████████████████████████████████████▏ | 439/509 [1:08:12<10:02, 8.61s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 439/509 [1:08:12<10:02, 8.61s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▎ | 440/509 [1:08:19<09:37, 8.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▎ | 440/509 [1:08:19<09:37, 8.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▎ | 440/509 [1:08:19<09:37, 8.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▎ | 440/509 [1:08:19<09:37, 8.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▎ | 440/509 [1:08:19<09:37, 8.37s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▍ | 441/509 [1:08:27<09:10, 8.10s/it]g-point 
operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▍ | 441/509 [1:08:27<09:10, 8.10s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:38,914 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:38,914 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▌ | 442/509 [1:08:34<08:42, 7.79s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▌ | 442/509 [1:08:34<08:42, 7.79s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▌ | 442/509 [1:08:34<08:42, 7.79s/it]g-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:47,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:47,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2913, 'learning_rate': 0.0002646, 'epoch': 0.87} [WARNING|modeling_utils.py:388] 2022-03-02 23:34:47,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:47,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:47,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:27:21,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 444/509 [1:08:47<07:42, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:34:55,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 444/509 [1:08:47<07:42, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:34:55,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:59,199 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:34:55,179 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:34:59,199 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:34:55,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6385, 'learning_rate': 0.00026579999999999996, 'epoch': 0.87} [WARNING|modeling_utils.py:388] 2022-03-02 23:35:02,993 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:34:55,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:02,993 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:34:55,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 446/509 [1:08:57<06:26, 6.14s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:35:05,335 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:07,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:35:05,335 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:07,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:35:05,335 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 447/509 [1:09:02<05:45, 5.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:35:09,499 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:11,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:35:09,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:11,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:35:09,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▌ | 448/509 [1:09:06<05:07, 5.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 23:35:13,155 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 449/509 [1:09:09<04:30, 4.51s/it]g-point operations will not be computed-02 23:35:13,155 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 449/509 [1:09:09<04:30, 4.51s/it]g-point operations will not be computed-02 23:35:13,155 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:17,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:35:16,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 23:35:17,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 23:35:16,324 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 88%|█████████████████████████████████████████████████████████████████████▊ | 450/509 [1:09:12<04:04, 4.15s/it]g-point operations will not be computed-02 23:35:16,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)2<04:04, 4.15s/it]Traceback (most recent call last):puted-02 23:35:16,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)2<04:04, 4.15s/it]Traceback (most recent call last):puted-02 23:35:16,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)2<04:04, 4.15s/it]Traceback (most recent call last):puted-02 23:35:16,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed