0%| | 0/594 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:31,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:33,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.8225, 'learning_rate': 0.0, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:36,332 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
0%|▏ | 1/594 [00:11<1:49:58, 11.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:16:38,992 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:41,595 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:44,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 5.0072, 'learning_rate': 0.0, 'epoch': 0.0}
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:46,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
0%|▎ | 2/594 [00:21<1:45:58, 10.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:16:49,354 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:51,880 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:54,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.8898, 'learning_rate': 2.0000000000000002e-07, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 02:16:56,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|▍ | 3/594 [00:31<1:43:40, 10.52s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:16:59,660 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:02,163 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:04,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.839, 'learning_rate': 4.0000000000000003e-07, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:07,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|▌ | 4/594 [00:41<1:41:43, 10.34s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:17:09,679 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:12,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:14,657 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.7915, 'learning_rate': 6.000000000000001e-07, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:17,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|▋ | 5/594 [00:52<1:40:37, 10.25s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:17:19,747 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:22,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:24,715 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.8052, 'learning_rate': 8.000000000000001e-07, 'epoch': 0.01}
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:27,180 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|▊ | 6/594 [01:02<1:39:39, 10.17s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:17:29,775 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:32,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:34,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:37,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.8753, 'learning_rate': 1.0000000000000002e-06, 'epoch': 0.01}
1%|▉ | 7/594 [01:11<1:38:37, 10.08s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:17:39,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:42,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:44,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:47,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|█ | 8/594 [01:21<1:38:05, 10.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:17:49,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:52,065 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:54,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:17:56,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
2%|█▏ | 9/594 [01:31<1:37:34, 10.01s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:17:59,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:01,963 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:04,440 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:06,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
2%|█▎ | 10/594 [01:41<1:37:04, 9.97s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:18:09,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:11,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:14,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.7885, 'learning_rate': 1.8e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:16,548 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
2%|█▍ | 11/594 [01:51<1:36:01, 9.88s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:18:19,060 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:21,440 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:23,860 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:26,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
2%|█▌ | 12/594 [02:01<1:35:13, 9.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:18:28,700 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:31,157 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:33,509 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:35,853 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
2%|█▊ | 13/594 [02:10<1:34:32, 9.76s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:18:38,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:40,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:43,005 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6684, 'learning_rate': 2.4000000000000003e-06, 'epoch': 0.02}
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:45,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
2%|█▉ | 14/594 [02:20<1:33:43, 9.70s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:18:47,824 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:50,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:52,519 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.5584, 'learning_rate': 2.6e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:54,913 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
3%|██ | 15/594 [02:29<1:33:03, 9.64s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:18:57,358 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:18:59,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:02,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.7161, 'learning_rate': 2.8000000000000003e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:04,284 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
3%|██▏ | 16/594 [02:39<1:32:06, 9.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:19:06,700 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:08,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:11,323 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:13,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
3%|██▎ | 17/594 [02:48<1:31:20, 9.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:19:16,066 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:18,331 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:20,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.7672, 'learning_rate': 3.2000000000000003e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:22,930 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
3%|██▍ | 18/594 [02:57<1:30:35, 9.44s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:19:25,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:27,615 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:29,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.5319, 'learning_rate': 3.4000000000000005e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:32,126 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
3%|██▌ | 19/594 [03:06<1:29:44, 9.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:19:34,486 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:36,708 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:38,934 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.7361, 'learning_rate': 3.6e-06, 'epoch': 0.03}
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:41,169 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
3%|██▋ | 20/594 [03:16<1:28:39, 9.27s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:19:43,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:45,724 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:47,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6663, 'learning_rate': 3.8e-06, 'epoch': 0.04}
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:50,166 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
4%|██▊ | 21/594 [03:25<1:27:44, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:19:52,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:54,663 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:56,886 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:19:59,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
4%|██▉ | 22/594 [03:33<1:26:43, 9.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:20:01,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:03,610 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:05,823 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:08,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
4%|███ | 23/594 [03:42<1:26:11, 9.06s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:20:10,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:12,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:14,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:16,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
4%|███▏ | 24/594 [03:51<1:25:30, 9.00s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:20:19,192 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:21,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:23,522 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:26,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.4311, 'learning_rate': 4.6e-06, 'epoch': 0.04}
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:28,509 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:30,653 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:32,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:34,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
4%|███▌ | 26/594 [04:09<1:25:00, 8.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:20:37,210 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:41,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|███▋ | 27/594 [04:18<1:24:16, 8.92s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:20:45,866 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:50,141 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|███▊ | 28/594 [04:27<1:23:13, 8.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:20:54,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:20:58,697 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|███▉ | 29/594 [04:35<1:22:13, 8.73s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:03,016 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:07,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|████ | 30/594 [04:44<1:21:24, 8.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:11,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:15,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|████▏ | 31/594 [04:52<1:20:30, 8.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:19,886 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:23,930 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
5%|████▎ | 32/594 [05:00<1:19:27, 8.48s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:28,118 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:32,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|████▍ | 33/594 [05:09<1:18:41, 8.42s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:36,247 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:40,200 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|████▌ | 34/594 [05:17<1:17:14, 8.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:44,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:48,054 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|████▋ | 35/594 [05:24<1:15:46, 8.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:51,993 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:21:55,745 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|████▊ | 36/594 [05:32<1:14:15, 7.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:21:59,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:22:03,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|████▉ | 37/594 [05:39<1:12:37, 7.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:06,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:22:10,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
6%|█████ | 38/594 [05:47<1:10:51, 7.65s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:14,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▎ | 39/594 [05:54<1:09:05, 7.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:21,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:22:24,578 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▍ | 40/594 [06:01<1:07:26, 7.30s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:27,966 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:22:31,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▌ | 41/594 [06:07<1:04:50, 7.04s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:34,323 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:22:37,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▋ | 42/594 [06:13<1:02:25, 6.79s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:40,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:22:43,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▉ | 43/594 [06:19<59:26, 6.47s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:46,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▉ | 43/594 [06:19<59:26, 6.47s/it][WARNING|modeling_utils.py:388] 
2022-03-02 02:22:46,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|██████ | 44/594 [06:24<55:47, 6.09s/it]g-point operations will not be computed-02 02:22:46,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|██████ | 44/594 [06:24<55:47, 6.09s/it]g-point operations will not be computed-02 02:22:46,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|██████ | 44/594 [06:24<55:47, 6.09s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:51,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:22:53,363 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:22:51,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:22:53,363 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:22:51,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 45/594 [06:29<51:46, 5.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:55,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:22:57,632 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:22:55,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:22:57,632 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:22:55,609 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 8%|██████▎ | 46/594 [06:33<47:31, 5.20s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:22:59,612 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:01,415 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:22:59,612 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:01,415 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:22:59,612 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▍ | 47/594 [06:37<43:18, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:03,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▍ | 47/594 [06:37<43:18, 4.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:03,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 48/594 [06:40<39:02, 4.29s/it]g-point operations will not be computed-02 02:23:03,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:07,651 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:06,309 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:07,651 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:06,309 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-02 02:23:10,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:09,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:10,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:09,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▉ | 50/594 [06:46<32:25, 3.58s/it]g-point operations will not be computed-02 02:23:09,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▉ | 50/594 [06:46<32:25, 3.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:14,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▉ | 50/594 [06:46<32:25, 3.58s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:14,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:19,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:14,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 51/594 [06:56<51:55, 5.74s/it]g-point operations will not be computed-02 02:23:14,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 51/594 [06:56<51:55, 5.74s/it]g-point operations will not be computed-02 02:23:14,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 51/594 [06:56<51:55, 5.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:24,672 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 9%|███████ | 51/594 [06:56<51:55, 5.74s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:24,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:29,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:24,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 52/594 [07:07<1:04:07, 7.10s/it]g-point operations will not be computed-02 02:23:24,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 52/594 [07:07<1:04:07, 7.10s/it]g-point operations will not be computed-02 02:23:24,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 52/594 [07:07<1:04:07, 7.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:34,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 52/594 [07:07<1:04:07, 7.10s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:34,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:40,037 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:34,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 53/594 [07:17<1:12:35, 8.05s/it]g-point operations will not be computed-02 02:23:34,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 53/594 [07:17<1:12:35, 8.05s/it]g-point operations will not be computed-02 02:23:34,969 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 9%|███████▏ | 53/594 [07:17<1:12:35, 8.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 53/594 [07:17<1:12:35, 8.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:23:50,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▎ | 54/594 [07:27<1:17:59, 8.67s/it]g-point operations will not be computed-02 02:23:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▎ | 54/594 [07:27<1:17:59, 8.67s/it]g-point operations will not be computed-02 02:23:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▎ | 54/594 [07:27<1:17:59, 8.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:55,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▎ | 54/594 [07:27<1:17:59, 8.67s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:23:55,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:00,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:23:55,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:00,187 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed-02 02:23:55,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▍ | 55/594 [07:37<1:21:17, 9.05s/it]g-point operations will not be computed-02 02:23:55,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▍ | 55/594 [07:37<1:21:17, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:05,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▍ | 55/594 [07:37<1:21:17, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:05,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:10,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:05,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:10,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:05,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:10,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:05,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▌ | 56/594 [07:47<1:23:38, 9.33s/it]g-point operations will not be computed-02 02:24:05,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▌ | 56/594 [07:47<1:23:38, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:15,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 9%|███████▌ | 56/594 [07:47<1:23:38, 9.33s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:15,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:20,066 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:15,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▋ | 57/594 [07:57<1:25:01, 9.50s/it]g-point operations will not be computed-02 02:24:15,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▋ | 57/594 [07:57<1:25:01, 9.50s/it]g-point operations will not be computed-02 02:24:15,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▋ | 57/594 [07:57<1:25:01, 9.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:25,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▋ | 57/594 [07:57<1:25:01, 9.50s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:25,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:29,948 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:25,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▊ | 58/594 [08:07<1:25:53, 9.61s/it]g-point operations will not be computed-02 02:24:25,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▊ | 58/594 [08:07<1:25:53, 9.61s/it]g-point operations will not be computed-02 02:24:25,070 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 10%|███████▊ | 58/594 [08:07<1:25:53, 9.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:34,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▊ | 58/594 [08:07<1:25:53, 9.61s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:34,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:39,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:34,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:39,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:34,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 59/594 [08:16<1:26:07, 9.66s/it]g-point operations will not be computed-02 02:24:34,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 59/594 [08:16<1:26:07, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 59/594 [08:16<1:26:07, 9.66s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:49,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:49,524 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:24:49,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 60/594 [08:26<1:26:14, 9.69s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 60/594 [08:26<1:26:14, 9.69s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 60/594 [08:26<1:26:14, 9.69s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 60/594 [08:26<1:26:14, 9.69s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▏ | 61/594 [08:36<1:25:55, 9.67s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▏ | 61/594 [08:36<1:25:55, 9.67s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3305, 'learning_rate': 1.18e-05, 'epoch': 0.1} 10%|████████▏ | 61/594 [08:36<1:25:55, 9.67s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▏ | 61/594 
[08:36<1:25:55, 9.67s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▎ | 62/594 [08:45<1:25:32, 9.65s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▎ | 62/594 [08:45<1:25:32, 9.65s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3873, 'learning_rate': 1.2e-05, 'epoch': 0.1} 10%|████████▎ | 62/594 [08:45<1:25:32, 9.65s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▎ | 62/594 [08:45<1:25:32, 9.65s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 63/594 [08:55<1:24:50, 9.59s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 63/594 [08:55<1:24:50, 9.59s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2663, 'learning_rate': 1.22e-05, 'epoch': 0.11} 11%|████████▍ | 63/594 [08:55<1:24:50, 9.59s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 63/594 [08:55<1:24:50, 9.59s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▌ | 64/594 
[09:04<1:24:14, 9.54s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▌ | 64/594 [09:04<1:24:14, 9.54s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2255, 'learning_rate': 1.24e-05, 'epoch': 0.11} 11%|████████▌ | 64/594 [09:04<1:24:14, 9.54s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▌ | 64/594 [09:04<1:24:14, 9.54s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▌ | 64/594 [09:04<1:24:14, 9.54s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 65/594 [09:14<1:23:35, 9.48s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 65/594 [09:14<1:23:35, 9.48s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 65/594 [09:14<1:23:35, 9.48s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 65/594 [09:14<1:23:35, 9.48s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 65/594 [09:14<1:23:35, 9.48s/it]g-point operations will not be computed-02 
02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 66/594 [09:23<1:23:10, 9.45s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 66/594 [09:23<1:23:10, 9.45s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 66/594 [09:23<1:23:10, 9.45s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 66/594 [09:23<1:23:10, 9.45s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████ | 67/594 [09:32<1:22:49, 9.43s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████ | 67/594 [09:32<1:22:49, 9.43s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1834, 'learning_rate': 1.3000000000000001e-05, 'epoch': 0.11} 11%|█████████ | 67/594 [09:32<1:22:49, 9.43s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████ | 67/594 [09:32<1:22:49, 9.43s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████ | 67/594 [09:32<1:22:49, 9.43s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 11%|█████████▏ | 68/594 [09:42<1:22:23, 9.40s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████▏ | 68/594 [09:42<1:22:23, 9.40s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████▏ | 68/594 [09:42<1:22:23, 9.40s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████▏ | 68/594 [09:42<1:22:23, 9.40s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 69/594 [09:51<1:21:45, 9.34s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 69/594 [09:51<1:21:45, 9.34s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2772, 'learning_rate': 1.3400000000000002e-05, 'epoch': 0.12} 12%|█████████▎ | 69/594 [09:51<1:21:45, 9.34s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 69/594 [09:51<1:21:45, 9.34s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▎ | 69/594 [09:51<1:21:45, 9.34s/it]g-point operations will not be computed-02 02:24:44,674 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed
12%|█████████▍ | 70/594 [10:00<1:21:21, 9.32s/it]
{'loss': 4.3441, 'learning_rate': 1.3800000000000002e-05, 'epoch': 0.12}
12%|█████████▋ | 72/594 [10:18<1:20:09, 9.21s/it]
12%|█████████▊ | 73/594 [10:27<1:19:18, 9.13s/it]
{'loss': 4.1117, 'learning_rate': 1.42e-05, 'epoch': 0.12}
12%|█████████▉ | 74/594 [10:36<1:18:25, 9.05s/it]
{'loss': 4.3191, 'learning_rate': 1.44e-05, 'epoch': 0.12}
13%|██████████ | 75/594 [10:46<1:18:58, 9.13s/it]
{'loss': 4.2973, 'learning_rate': 1.4599999999999999e-05, 'epoch': 0.13}
13%|██████████▏ | 76/594 [10:54<1:17:58, 9.03s/it]
{'loss': 4.273, 'learning_rate': 1.48e-05, 'epoch': 0.13}
13%|██████████▎ | 77/594 [11:03<1:17:03, 8.94s/it]
13%|██████████▌ | 78/594 [11:12<1:15:49, 8.82s/it]
{'loss': 4.2333, 'learning_rate': 1.52e-05, 'epoch': 0.13}
13%|██████████▋ | 79/594 [11:20<1:14:44, 8.71s/it]
{'loss': 4.1605, 'learning_rate': 1.54e-05, 'epoch': 0.13}
13%|██████████▊ | 80/594 [11:28<1:13:46, 8.61s/it]
{'loss': 4.3374, 'learning_rate': 1.56e-05, 'epoch': 0.13}
14%|██████████▉ | 81/594 [11:37<1:12:54, 8.53s/it]
{'loss': 4.2885, 'learning_rate': 1.58e-05, 'epoch': 0.14}
14%|███████████ | 82/594 [11:45<1:11:55, 8.43s/it]
{'loss': 4.2615, 'learning_rate': 1.6000000000000003e-05, 'epoch': 0.14}
14%|███████████▏ | 83/594 [11:53<1:11:06, 8.35s/it]
{'loss': 4.342, 'learning_rate': 1.62e-05, 'epoch': 0.14}
14%|███████████▎ | 84/594 [12:01<1:09:52, 8.22s/it]
{'loss': 4.3158, 'learning_rate': 1.6400000000000002e-05, 'epoch': 0.14}
14%|███████████▍ | 85/594 [12:09<1:08:48, 8.11s/it]
{'loss': 4.2401, 'learning_rate': 1.66e-05, 'epoch': 0.14}
14%|███████████▌ | 86/594 [12:17<1:07:26, 7.97s/it]
{'loss': 4.1713, 'learning_rate': 1.6800000000000002e-05, 'epoch': 0.14}
15%|███████████▋ | 87/594 [12:24<1:05:56, 7.80s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:28:51,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
15%|███████████▊ | 88/594 [12:31<1:04:14, 7.62s/it]
{'loss': 4.2933, 'learning_rate': 1.7199999999999998e-05, 'epoch': 0.15}
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:03,780 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2668, 'learning_rate': 1.74e-05, 'epoch': 0.15}
15%|████████████ | 90/594 [12:45<1:00:19, 7.18s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:12,081 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3194, 'learning_rate': 1.76e-05, 'epoch': 0.15}
15%|████████████▌ | 91/594 [12:51<57:28, 6.86s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:18,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2625, 'learning_rate': 1.78e-05, 'epoch': 0.15}
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:22,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3701, 'learning_rate': 1.8e-05, 'epoch': 0.15}
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:26,266 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|████████████▊ | 93/594 [13:02<51:24, 6.16s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:30,133 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:32,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.4905, 'learning_rate': 1.84e-05, 'epoch': 0.16}
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:35,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|█████████████ | 95/594 [13:11<44:46, 5.38s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:38,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:40,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|█████████████▎ | 96/594 [13:15<41:24, 4.99s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:42,020 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|█████████████▍ | 97/594 [13:19<38:05, 4.60s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:45,598 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|█████████████▌ | 98/594 [13:22<34:39, 4.19s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:48,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
17%|█████████████▋ | 99/594 [13:25<31:13, 3.79s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:51,464 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:52,559 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
17%|█████████████▋ | 100/594 [13:28<28:58, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:30:01,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
17%|█████████████▊ | 101/594 [13:39<46:37, 5.67s/it]
17%|█████████████▉ | 102/594 [13:49<57:49, 7.05s/it]
17%|█████████████▋ | 103/594 [13:59<1:04:59, 7.94s/it]
18%|█████████████▊ | 104/594 [14:09<1:09:59, 8.57s/it]
18%|█████████████▊ | 104/594 [14:09<1:09:59, 8.57s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▊ | 104/594 [14:09<1:09:59, 8.57s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▊ | 104/594 [14:09<1:09:59, 8.57s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▊ | 104/594 [14:09<1:09:59, 8.57s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▉ | 105/594 [14:19<1:13:11, 8.98s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▉ | 105/594 [14:19<1:13:11, 8.98s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▉ | 105/594 [14:19<1:13:11, 8.98s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|█████████████▉ | 105/594 [14:19<1:13:11, 8.98s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed {'loss': 4.1543, 'learning_rate': 2.08e-05, 'epoch': 0.18} g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 107/594 [14:39<1:16:36, 9.44s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 107/594 [14:39<1:16:36, 9.44s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2942, 'learning_rate': 2.1e-05, 'epoch': 0.18} 18%|██████████████▏ | 107/594 [14:39<1:16:36, 9.44s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▏ | 107/594 [14:39<1:16:36, 9.44s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 108/594 [14:49<1:17:21, 9.55s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 108/594 [14:49<1:17:21, 9.55s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3168, 'learning_rate': 2.12e-05, 'epoch': 0.18} 18%|██████████████▎ | 
108/594 [14:49<1:17:21, 9.55s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 108/594 [14:49<1:17:21, 9.55s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 109/594 [14:58<1:17:31, 9.59s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 109/594 [14:58<1:17:31, 9.59s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3971, 'learning_rate': 2.1400000000000002e-05, 'epoch': 0.18} 18%|██████████████▍ | 109/594 [14:58<1:17:31, 9.59s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 109/594 [14:58<1:17:31, 9.59s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▋ | 110/594 [15:08<1:17:42, 9.63s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▋ | 110/594 [15:08<1:17:42, 9.63s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.206, 'learning_rate': 2.16e-05, 'epoch': 0.18} 19%|██████████████▋ | 110/594 [15:08<1:17:42, 9.63s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 19%|██████████████▋ | 110/594 [15:08<1:17:42, 9.63s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 111/594 [15:18<1:17:45, 9.66s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 111/594 [15:18<1:17:45, 9.66s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2, 'learning_rate': 2.18e-05, 'epoch': 0.19} 19%|██████████████▊ | 111/594 [15:18<1:17:45, 9.66s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 111/594 [15:18<1:17:45, 9.66s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 112/594 [15:27<1:17:07, 9.60s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 112/594 [15:27<1:17:07, 9.60s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2552, 'learning_rate': 2.2000000000000003e-05, 'epoch': 0.19} 19%|██████████████▉ | 112/594 [15:27<1:17:07, 9.60s/it]g-point operations will not be computed-02 02:29:56,473 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 112/594 [15:27<1:17:07, 9.60s/it]g-point operations will not be computed-02 
19%|███████████████ | 113/594 [15:37<1:16:39, 9.56s/it]
19%|███████████████▏ | 114/594 [15:46<1:16:11, 9.52s/it]
19%|███████████████▎ | 115/594 [15:55<1:15:42, 9.48s/it]
{'loss': 4.2514, 'learning_rate': 2.26e-05, 'epoch': 0.19}
20%|███████████████▍ | 116/594 [16:05<1:14:58, 9.41s/it]
20%|███████████████▌ | 117/594 [16:14<1:14:13, 9.34s/it]
{'loss': 4.3366, 'learning_rate': 2.3000000000000003e-05, 'epoch': 0.2}
20%|███████████████▋ | 118/594 [16:23<1:13:31, 9.27s/it]
20%|███████████████▊ | 119/594 [16:32<1:13:02, 9.23s/it]
{'loss': 4.1132, 'learning_rate': 2.3400000000000003e-05, 'epoch': 0.2}
20%|███████████████▉ | 120/594 [16:41<1:12:16, 9.15s/it]
{'loss': 4.1045, 'learning_rate': 2.36e-05, 'epoch': 0.2}
20%|████████████████ | 121/594 [16:50<1:11:36, 9.08s/it]
{'loss': 4.3175, 'learning_rate': 2.38e-05, 'epoch': 0.2}
21%|████████████████▏ | 122/594 [16:59<1:10:57, 9.02s/it]
{'loss': 4.4242, 'learning_rate': 2.4e-05, 'epoch': 0.21}
21%|████████████████▎ | 123/594 [17:08<1:10:19, 8.96s/it]
{'loss': 4.1832, 'learning_rate': 2.4200000000000002e-05, 'epoch': 0.21}
{'loss': 4.2404, 'learning_rate': 2.44e-05, 'epoch': 0.21}
[WARNING|modeling_utils.py:388] 2022-03-02 02:33:48,530 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
21%|████████████████▌ | 125/594 [17:26<1:10:03, 8.96s/it]
21%|████████████████▊ | 126/594 [17:34<1:09:06, 8.86s/it]
21%|████████████████▉ | 127/594 [17:43<1:08:28, 8.80s/it]
{'loss': 4.1251, 'learning_rate': 2.5e-05, 'epoch': 0.21}
22%|█████████████████ | 128/594 [17:51<1:07:22, 8.67s/it]
22%|█████████████████▏ | 129/594 [18:00<1:06:38, 8.60s/it]
22%|█████████████████▎ | 130/594 [18:08<1:06:02, 8.54s/it]
22%|█████████████████▍ | 131/594 [18:16<1:05:18, 8.46s/it]
22%|█████████████████▌ | 132/594 [18:25<1:04:37, 8.39s/it]
{'loss': 4.1872, 'learning_rate': 2.6000000000000002e-05, 'epoch': 0.22}
22%|█████████████████▋ | 133/594 [18:32<1:03:16, 8.24s/it]
23%|█████████████████▊ | 134/594 [18:40<1:02:08, 8.11s/it]
23%|█████████████████▉ | 135/594 [18:48<1:01:12, 8.00s/it]
23%|██████████████████ | 136/594 [18:55<1:00:02, 7.86s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:35:24,908 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
23%|██████████████████▋ | 137/594 [19:03<58:55, 7.74s/it]
23%|██████████████████▊ | 138/594 [19:10<57:44, 7.60s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:35:39,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:35:42,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2113, 'learning_rate': 2.7400000000000002e-05, 'epoch': 0.23}
24%|███████████████████ | 140/594 [19:24<55:02, 7.27s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:35:53,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
24%|███████████████████▏ | 141/594 [19:31<52:54, 7.01s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:35:59,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
24%|███████████████████▎ | 142/594 [19:37<50:28, 6.70s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:05,090 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
24%|███████████████████▌ | 143/594 [19:42<47:48, 6.36s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:11,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
24%|███████████████████▋ | 144/594 [19:47<44:36, 5.95s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:15,156 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:17,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:19,557 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:21,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:23,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:25,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:26,846 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:29,745 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:31,036 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:33,970 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.5584, 'learning_rate': 2.96e-05, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:39,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:36:44,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
26%|████████████████████▋ | 152/594 [20:29<51:29, 6.99s/it]
{'loss': 4.1459, 'learning_rate': 3e-05, 'epoch': 0.26}
{'loss': 4.1111, 'learning_rate': 3.02e-05, 'epoch': 0.26}
26%|████████████████████▍ | 154/594 [20:49<1:02:31, 8.53s/it]
{'loss': 4.0712, 'learning_rate': 3.04e-05, 'epoch': 0.26}
26%|████████████████████▌ | 155/594 [20:59<1:05:19, 8.93s/it]
{'loss': 4.0859, 'learning_rate': 3.06e-05, 'epoch': 0.26}
26%|████████████████████▋ | 156/594 [21:09<1:07:10, 9.20s/it]
26%|████████████████████▉ | 157/594 [21:19<1:08:18, 9.38s/it]
27%|█████████████████████ | 158/594 [21:28<1:08:56, 9.49s/it]
27%|█████████████████████▏ | 159/594 [21:38<1:09:05, 9.53s/it]
{'loss': 4.1515, 'learning_rate': 3.16e-05, 'epoch': 0.27}
27%|█████████████████████▍ | 161/594 [21:57<1:08:43, 9.52s/it]
27%|█████████████████████▌ | 162/594 [22:06<1:08:15, 9.48s/it]
{'loss': 4.0791, 'learning_rate': 3.2000000000000005e-05, 'epoch': 0.27}
27%|█████████████████████▋ | 163/594 [22:16<1:07:51, 9.45s/it]
{'loss': 4.0986, 'learning_rate': 3.2200000000000003e-05, 'epoch': 0.27}
28%|█████████████████████▊ | 164/594 [22:25<1:07:28, 9.42s/it]
28%|█████████████████████▉ | 165/594 [22:34<1:07:05, 9.38s/it]
{'loss': 4.1626, 'learning_rate': 3.26e-05, 'epoch': 0.28}
[WARNING|modeling_utils.py:388] 2022-03-02 02:39:09,261 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2766, 'learning_rate': 3.2800000000000004e-05, 'epoch': 0.28}
28%|██████████████████████▏ | 167/594 [22:53<1:06:03, 9.28s/it]
{'loss': 4.1187, 'learning_rate': 3.3e-05, 'epoch': 0.28}
28%|██████████████████████▎ | 168/594 [23:02<1:05:34, 9.24s/it]
28%|██████████████████████▍ | 169/594 [23:11<1:05:01, 9.18s/it]
{'loss': 4.1639, 'learning_rate': 3.3400000000000005e-05, 'epoch': 0.28}
29%|██████████████████████▌ | 170/594 [23:20<1:04:33, 9.14s/it]
{'loss': 4.2129, 'learning_rate': 3.3600000000000004e-05, 'epoch': 0.29}
29%|██████████████████████▋ | 171/594 [23:29<1:04:09, 9.10s/it]
29%|██████████████████████▉ | 172/594 [23:38<1:03:23, 9.01s/it]
29%|███████████████████████ | 173/594 [23:47<1:02:46, 8.95s/it]
{'loss': 4.1, 'learning_rate': 3.4200000000000005e-05, 'epoch': 0.29}
29%|███████████████████████▏ | 174/594 [23:55<1:02:03, 8.87s/it]
{'loss': 4.1168, 'learning_rate': 3.4399999999999996e-05, 'epoch': 0.29}
29%|███████████████████████▎ | 175/594 [24:04<1:02:32, 8.96s/it]
30%|███████████████████████▍ | 176/594 [24:13<1:01:45, 8.86s/it]
30%|███████████████████████▌ | 177/594 [24:22<1:00:50, 8.75s/it]
{'loss': 4.1915, 'learning_rate': 3.5e-05, 'epoch': 0.3}
30%|████████████████████████▎ | 178/594 [24:30<59:54, 8.64s/it]
{'loss': 4.0929, 'learning_rate': 3.52e-05, 'epoch': 0.3}
30%|████████████████████████▍ | 179/594 [24:38<59:20, 8.58s/it]
{'loss': 4.2164, 'learning_rate': 3.54e-05, 'epoch': 0.3}
30%|████████████████████████▌ | 180/594 [24:47<58:39, 8.50s/it]
{'loss': 4.0857, 'learning_rate': 3.56e-05, 'epoch': 0.3}
30%|████████████████████████▋ | 181/594 [24:55<57:49, 8.40s/it]
31%|████████████████████████▊ | 182/594 [25:03<57:08, 8.32s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 182/594 [25:03<57:08, 8.32s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 182/594 [25:03<57:08, 8.32s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▊ | 182/594 [25:03<57:08, 8.32s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▉ | 183/594 [25:11<56:26, 8.24s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▉ | 183/594 [25:11<56:26, 8.24s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▉ | 183/594 [25:11<56:26, 8.24s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▉ | 183/594 [25:11<56:26, 8.24s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▉ | 183/594 [25:11<56:26, 8.24s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████ | 184/594 [25:19<55:24, 8.11s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 31%|█████████████████████████ | 184/594 [25:19<55:24, 8.11s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████ | 184/594 [25:19<55:24, 8.11s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████ | 184/594 [25:19<55:24, 8.11s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████ | 184/594 [25:19<55:24, 8.11s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▏ | 185/594 [25:27<54:24, 7.98s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▏ | 185/594 [25:27<54:24, 7.98s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▏ | 185/594 [25:27<54:24, 7.98s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:41:59,662 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:41:59,662 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1857, 'learning_rate': 3.68e-05, 'epoch': 0.31} [WARNING|modeling_utils.py:388] 2022-03-02 02:42:03,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:42:03,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:42:03,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▌ | 187/594 [25:41<52:13, 7.70s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▌ | 187/594 [25:41<52:13, 7.70s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▌ | 187/594 [25:41<52:13, 7.70s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▌ | 187/594 [25:41<52:13, 7.70s/it]g-point operations will not be computed-02 02:36:09,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
32%|█████████████████████████▋ | 188/594 [25:49<51:05, 7.55s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:19,491 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
32%|█████████████████████████▊ | 189/594 [25:56<49:42, 7.36s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:27,797 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1092, 'learning_rate': 3.76e-05, 'epoch': 0.32}
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:34,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.4034, 'learning_rate': 3.7800000000000004e-05, 'epoch': 0.32}
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:40,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3196, 'learning_rate': 3.8e-05, 'epoch': 0.32}
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:44,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
32%|██████████████████████████▎ | 193/594 [26:20<41:46, 6.25s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:48,094 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:50,536 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3066, 'learning_rate': 3.8400000000000005e-05, 'epoch': 0.33}
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:54,127 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
33%|██████████████████████████▌ | 195/594 [26:30<36:51, 5.54s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:56,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:42:58,401 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
33%|██████████████████████████▋ | 196/594 [26:34<33:58, 5.12s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:00,417 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:02,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
33%|██████████████████████████▊ | 197/594 [26:37<30:58, 4.68s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:03,959 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
33%|███████████████████████████ | 198/594 [26:41<28:00, 4.24s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:07,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:08,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:09,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:10,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
34%|███████████████████████████▎ | 200/594 [26:46<23:12, 3.53s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:43:20,036 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
34%|███████████████████████████▍ | 201/594 [26:57<37:08, 5.67s/it]
{'loss': 4.035, 'learning_rate': 3.9800000000000005e-05, 'epoch': 0.34}
34%|███████████████████████████▌ | 202/594 [27:07<46:00, 7.04s/it]
{'loss': 4.1883, 'learning_rate': 4e-05, 'epoch': 0.34}
34%|███████████████████████████▋ | 203/594 [27:17<51:54, 7.97s/it]
{'loss': 4.2433, 'learning_rate': 4.02e-05, 'epoch': 0.34}
34%|███████████████████████████▊ | 204/594 [27:27<55:46, 8.58s/it]
{'loss': 4.0817, 'learning_rate': 4.0400000000000006e-05, 'epoch': 0.34}
35%|███████████████████████████▉ | 205/594 [27:37<58:09, 8.97s/it]
{'loss': 4.1543, 'learning_rate': 4.0600000000000004e-05, 'epoch': 0.34}
35%|████████████████████████████ | 206/594 [27:47<59:31, 9.21s/it]
35%|███████████████████████████▌ | 207/594 [27:57<1:00:19, 9.35s/it]
35%|███████████████████████████▋ | 208/594 [28:06<1:00:53, 9.46s/it]
35%|███████████████████████████▊ | 209/594 [28:16<1:00:59, 9.51s/it]
{'loss': 4.175, 'learning_rate': 4.14e-05, 'epoch': 0.35}
35%|███████████████████████████▉ | 210/594 [28:26<1:00:58, 9.53s/it]
36%|████████████████████████████ | 211/594 [28:35<1:00:42, 9.51s/it]
{'loss': 4.203, 'learning_rate': 4.2e-05, 'epoch': 0.36}
36%|█████████████████████████████ | 213/594 [28:54<59:42, 9.40s/it]
36%|█████████████████████████████▏ | 214/594 [29:03<59:25, 9.38s/it]
{'loss': 4.2543, 'learning_rate': 4.24e-05, 'epoch': 0.36}
36%|█████████████████████████████▎ | 215/594 [29:12<59:01, 9.34s/it]
36%|█████████████████████████████▍ | 216/594 [29:22<58:43, 9.32s/it]
{'loss': 4.149, 'learning_rate': 4.2800000000000004e-05, 'epoch': 0.36}
37%|█████████████████████████████▌ | 217/594 [29:31<58:21, 9.29s/it]
{'loss': 4.0823, 'learning_rate': 4.3e-05, 'epoch': 0.36}
37%|█████████████████████████████▋ | 218/594 [29:40<57:46, 9.22s/it]
{'loss': 4.1403, 'learning_rate': 4.32e-05, 'epoch': 0.37}
37%|█████████████████████████████▊ | 219/594 [29:49<57:09, 9.14s/it]
{'loss': 4.124, 'learning_rate': 4.3400000000000005e-05, 'epoch': 0.37}
37%|█████████████████████████████▊ | 219/594 [29:49<57:09, 9.14s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▊ | 219/594 [29:49<57:09, 9.14s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▊ | 219/594 [29:49<57:09, 9.14s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 220/594 [29:58<56:40, 9.09s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 220/594 [29:58<56:40, 9.09s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 220/594 [29:58<56:40, 9.09s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 220/594 [29:58<56:40, 9.09s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 220/594 [29:58<56:40, 9.09s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▏ | 221/594 [30:07<56:20, 9.06s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▏ | 221/594 [30:07<56:20, 9.06s/it]g-point 
operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▏ | 221/594 [30:07<56:20, 9.06s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▏ | 221/594 [30:07<56:20, 9.06s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▏ | 221/594 [30:07<56:20, 9.06s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2563, 'learning_rate': 4.4000000000000006e-05, 'epoch': 0.37} 37%|██████████████████████████████▏ | 221/594 [30:07<56:20, 9.06s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:46:45,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:46:45,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:46:45,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-02 02:46:45,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1024, 'learning_rate': 4.4200000000000004e-05, 'epoch': 0.37} [WARNING|modeling_utils.py:388] 2022-03-02 02:46:54,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:46:54,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:46:54,524 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▌ | 224/594 [30:33<54:55, 8.91s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▌ | 224/594 [30:33<54:55, 8.91s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:47:05,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▋ | 225/594 [30:42<55:19, 
9.00s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▋ | 225/594 [30:42<55:19, 9.00s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3883, 'learning_rate': 4.46e-05, 'epoch': 0.38} 38%|██████████████████████████████▋ | 225/594 [30:42<55:19, 9.00s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▋ | 225/594 [30:42<55:19, 9.00s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 226/594 [30:51<54:32, 8.89s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 226/594 [30:51<54:32, 8.89s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1431, 'learning_rate': 4.4800000000000005e-05, 'epoch': 0.38} 38%|██████████████████████████████▊ | 226/594 [30:51<54:32, 8.89s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 226/594 [30:51<54:32, 8.89s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 226/594 [30:51<54:32, 8.89s/it]g-point operations will 
not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▉ | 227/594 [31:00<53:35, 8.76s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▉ | 227/594 [31:00<53:35, 8.76s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▉ | 227/594 [31:00<53:35, 8.76s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▉ | 227/594 [31:00<53:35, 8.76s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▉ | 227/594 [31:00<53:35, 8.76s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|███████████████████████████████ | 228/594 [31:08<52:55, 8.68s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|███████████████████████████████ | 228/594 [31:08<52:55, 8.68s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|███████████████████████████████ | 228/594 [31:08<52:55, 8.68s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
38%|███████████████████████████████ | 228/594 [31:08<52:55, 8.68s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▏ | 229/594 [31:16<52:17, 8.60s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▏ | 229/594 [31:16<52:17, 8.60s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2887, 'learning_rate': 4.5400000000000006e-05, 'epoch': 0.39} 39%|███████████████████████████████▏ | 229/594 [31:16<52:17, 8.60s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▏ | 229/594 [31:16<52:17, 8.60s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▎ | 230/594 [31:25<51:33, 8.50s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▎ | 230/594 [31:25<51:33, 8.50s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2302, 'learning_rate': 4.5600000000000004e-05, 'epoch': 0.39} 39%|███████████████████████████████▎ | 230/594 [31:25<51:33, 8.50s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
39%|███████████████████████████████▎ | 230/594 [31:25<51:33, 8.50s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▌ | 231/594 [31:33<50:41, 8.38s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▌ | 231/594 [31:33<50:41, 8.38s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1176, 'learning_rate': 4.58e-05, 'epoch': 0.39} 39%|███████████████████████████████▌ | 231/594 [31:33<50:41, 8.38s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▌ | 231/594 [31:33<50:41, 8.38s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▌ | 231/594 [31:33<50:41, 8.38s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▋ | 232/594 [31:41<50:06, 8.31s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▋ | 232/594 [31:41<50:06, 8.31s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▋ | 232/594 [31:41<50:06, 8.31s/it]g-point operations will not be 
computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▋ | 232/594 [31:41<50:06, 8.31s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▋ | 232/594 [31:41<50:06, 8.31s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▊ | 233/594 [31:49<49:27, 8.22s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▊ | 233/594 [31:49<49:27, 8.22s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▊ | 233/594 [31:49<49:27, 8.22s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▊ | 233/594 [31:49<49:27, 8.22s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▊ | 233/594 [31:49<49:27, 8.22s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▉ | 234/594 [31:57<48:42, 8.12s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
39%|███████████████████████████████▉ | 234/594 [31:57<48:42, 8.12s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▉ | 234/594 [31:57<48:42, 8.12s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▉ | 234/594 [31:57<48:42, 8.12s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▉ | 234/594 [31:57<48:42, 8.12s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 235/594 [32:05<47:49, 7.99s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 235/594 [32:05<47:49, 7.99s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 235/594 [32:05<47:49, 7.99s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 235/594 [32:05<47:49, 7.99s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 235/594 [32:05<47:49, 7.99s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▏ | 236/594 [32:12<46:48, 7.85s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▏ | 236/594 [32:12<46:48, 7.85s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:48:43,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:48:43,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 237/594 [32:19<45:44, 7.69s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 237/594 [32:19<45:44, 7.69s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 237/594 [32:19<45:44, 7.69s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 237/594 [32:19<45:44, 7.69s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 40%|████████████████████████████████▎ | 237/594 [32:19<45:44, 7.69s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▍ | 238/594 [32:26<44:31, 7.51s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▍ | 238/594 [32:26<44:31, 7.51s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:48:57,343 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:48:57,343 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▌ | 239/594 [32:33<43:26, 7.34s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▌ | 239/594 [32:33<43:26, 7.34s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▌ | 239/594 [32:33<43:26, 7.34s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-03-02 02:49:05,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:05,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2226, 'learning_rate': 4.76e-05, 'epoch': 0.4} [WARNING|modeling_utils.py:388] 2022-03-02 02:49:05,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:11,932 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:11,932 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1383, 'learning_rate': 4.78e-05, 'epoch': 0.41} [WARNING|modeling_utils.py:388] 2022-03-02 02:49:11,932 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:17,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:17,874 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1834, 'learning_rate': 4.8e-05, 'epoch': 0.41} [WARNING|modeling_utils.py:388] 2022-03-02 02:49:22,149 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:22,149 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▏ | 243/594 [32:58<36:48, 6.29s/it]g-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:26,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:26,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:26,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:43:14,808 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 41%|█████████████████████████████████▎ | 244/594 [33:03<34:43, 5.95s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:29,920 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:32,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:29,920 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:32,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:29,920 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▍ | 245/594 [33:08<32:18, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:34,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:36,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:34,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:36,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:34,406 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 246/594 [33:12<29:43, 5.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:38,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:40,176 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 
02:49:38,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:40,176 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:38,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▋ | 247/594 [33:15<27:03, 4.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:41,922 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▊ | 248/594 [33:19<24:28, 4.24s/it]g-point operations will not be computed-02 02:49:41,922 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▊ | 248/594 [33:19<24:28, 4.24s/it]g-point operations will not be computed-02 02:49:41,922 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:46,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:45,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:46,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:45,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 249/594 [33:21<21:55, 3.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:47,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 249/594 [33:21<21:55, 3.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 
02:49:47,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 250/594 [33:24<20:17, 3.54s/it]g-point operations will not be computed-02 02:49:47,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 250/594 [33:24<20:17, 3.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 250/594 [33:24<20:17, 3.54s/it][WARNING|modeling_utils.py:388] 2022-03-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 02:49:57,998 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 251/594 [33:35<32:12, 5.63s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 251/594 [33:35<32:12, 5.63s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2267, 'learning_rate': 4.9800000000000004e-05, 'epoch': 0.42} 42%|██████████████████████████████████▏ | 251/594 [33:35<32:12, 5.63s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 251/594 [33:35<32:12, 5.63s/it]g-point operations will not be computed-02 02:49:52,807 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 252/594 [33:45<39:45, 6.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 252/594 [33:45<39:45, 6.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1948, 'learning_rate': 5e-05, 'epoch': 0.42} 42%|██████████████████████████████████▎ | 252/594 [33:45<39:45, 6.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 252/594 [33:45<39:45, 6.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 252/594 [33:45<39:45, 6.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 253/594 [33:55<44:33, 7.84s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 253/594 [33:55<44:33, 7.84s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 253/594 [33:55<44:33, 7.84s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 43%|██████████████████████████████████▌ | 253/594 [33:55<44:33, 7.84s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 253/594 [33:55<44:33, 7.84s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 254/594 [34:05<47:59, 8.47s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 254/594 [34:05<47:59, 8.47s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 254/594 [34:05<47:59, 8.47s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 254/594 [34:05<47:59, 8.47s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 255/594 [34:14<50:03, 8.86s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 255/594 [34:14<50:03, 8.86s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2676, 'learning_rate': 5.0600000000000003e-05, 'epoch': 0.43} 43%|██████████████████████████████████▊ | 
255/594 [34:14<50:03, 8.86s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 255/594 [34:14<50:03, 8.86s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▉ | 256/594 [34:24<51:25, 9.13s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▉ | 256/594 [34:24<51:25, 9.13s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1532, 'learning_rate': 5.08e-05, 'epoch': 0.43} 43%|██████████████████████████████████▉ | 256/594 [34:24<51:25, 9.13s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▉ | 256/594 [34:24<51:25, 9.13s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 257/594 [34:34<52:24, 9.33s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 257/594 [34:34<52:24, 9.33s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1267, 'learning_rate': 5.1000000000000006e-05, 'epoch': 0.43} 43%|███████████████████████████████████ | 
257/594 [34:34<52:24, 9.33s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 257/594 [34:34<52:24, 9.33s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████ | 257/594 [34:34<52:24, 9.33s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 258/594 [34:44<52:52, 9.44s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 258/594 [34:44<52:52, 9.44s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 258/594 [34:44<52:52, 9.44s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 258/594 [34:44<52:52, 9.44s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 259/594 [34:54<53:17, 9.54s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 259/594 [34:54<53:17, 9.54s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed {'loss': 4.2355, 'learning_rate': 5.14e-05, 'epoch': 0.44} 44%|███████████████████████████████████▎ | 259/594 [34:54<53:17, 9.54s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 259/594 [34:54<53:17, 9.54s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 260/594 [35:03<53:13, 9.56s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 260/594 [35:03<53:13, 9.56s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2747, 'learning_rate': 5.16e-05, 'epoch': 0.44} 44%|███████████████████████████████████▍ | 260/594 [35:03<53:13, 9.56s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 260/594 [35:03<53:13, 9.56s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 260/594 [35:03<53:13, 9.56s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 261/594 [35:13<52:49, 9.52s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 261/594 [35:13<52:49, 9.52s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 261/594 [35:13<52:49, 9.52s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 261/594 [35:13<52:49, 9.52s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 262/594 [35:22<52:30, 9.49s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 262/594 [35:22<52:30, 9.49s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1059, 'learning_rate': 5.2000000000000004e-05, 'epoch': 0.44} 44%|███████████████████████████████████▋ | 262/594 [35:22<52:30, 9.49s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 262/594 [35:22<52:30, 9.49s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 263/594 [35:31<52:08, 9.45s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
44%|███████████████████████████████████▊ | 263/594 [35:31<52:08, 9.45s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2589, 'learning_rate': 5.22e-05, 'epoch': 0.44} 44%|███████████████████████████████████▊ | 263/594 [35:31<52:08, 9.45s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▊ | 263/594 [35:31<52:08, 9.45s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████ | 264/594 [35:41<51:50, 9.42s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████ | 264/594 [35:41<51:50, 9.42s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9687, 'learning_rate': 5.2400000000000007e-05, 'epoch': 0.44} 44%|████████████████████████████████████ | 264/594 [35:41<51:50, 9.42s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████ | 264/594 [35:41<51:50, 9.42s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▏ | 265/594 [35:50<51:24, 9.38s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 45%|████████████████████████████████████▏ | 265/594 [35:50<51:24, 9.38s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0608, 'learning_rate': 5.2600000000000005e-05, 'epoch': 0.45} 45%|████████████████████████████████████▏ | 265/594 [35:50<51:24, 9.38s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▏ | 265/594 [35:50<51:24, 9.38s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▏ | 265/594 [35:50<51:24, 9.38s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 267/594 [36:08<50:29, 9.27s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 267/594 [36:08<50:29, 9.27s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 267/594 [36:08<50:29, 9.27s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 267/594 [36:08<50:29, 9.27s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 267/594 [36:08<50:29, 9.27s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 
[36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2123, 'learning_rate': 5.3400000000000004e-05, 'epoch': 0.45} 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 268/594 [36:17<49:57, 9.19s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 270/594 [36:35<49:04, 9.09s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 270/594 [36:35<49:04, 9.09s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 270/594 [36:35<49:04, 9.09s/it]g-point operations will 
not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 270/594 [36:35<49:04, 9.09s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 271/594 [36:44<48:37, 9.03s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 271/594 [36:44<48:37, 9.03s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1167, 'learning_rate': 5.380000000000001e-05, 'epoch': 0.46} 46%|████████████████████████████████████▉ | 271/594 [36:44<48:37, 9.03s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 271/594 [36:44<48:37, 9.03s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 272/594 [36:53<48:11, 8.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 272/594 [36:53<48:11, 8.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.291, 'learning_rate': 5.4000000000000005e-05, 'epoch': 0.46} 46%|█████████████████████████████████████ | 272/594 [36:53<48:11, 
8.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 272/594 [36:53<48:11, 8.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 272/594 [36:53<48:11, 8.98s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 273/594 [37:02<47:40, 8.91s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 273/594 [37:02<47:40, 8.91s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 273/594 [37:02<47:40, 8.91s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 273/594 [37:02<47:40, 8.91s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 274/594 [37:11<47:20, 8.88s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 274/594 [37:11<47:20, 8.88s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed {'loss': 4.1663, 'learning_rate': 5.440000000000001e-05, 'epoch': 0.46} 46%|█████████████████████████████████████▎ | 274/594 [37:11<47:20, 8.88s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 274/594 [37:11<47:20, 8.88s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 274/594 [37:11<47:20, 8.88s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 275/594 [37:20<47:46, 8.99s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 275/594 [37:20<47:46, 8.99s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 275/594 [37:20<47:46, 8.99s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 275/594 [37:20<47:46, 8.99s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▌ | 275/594 [37:20<47:46, 8.99s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 46%|█████████████████████████████████████▋ | 276/594 [37:29<47:09, 8.90s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▋ | 276/594 [37:29<47:09, 8.90s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▋ | 276/594 [37:29<47:09, 8.90s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▋ | 276/594 [37:29<47:09, 8.90s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▊ | 277/594 [37:37<46:27, 8.79s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▊ | 277/594 [37:37<46:27, 8.79s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2805, 'learning_rate': 5.500000000000001e-05, 'epoch': 0.47} 47%|█████████████████████████████████████▊ | 277/594 [37:37<46:27, 8.79s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▊ | 277/594 [37:37<46:27, 8.79s/it]g-point operations will not be computed-02 02:49:52,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
47%|█████████████████████████████████████▉ | 278/594 [37:46<45:54, 8.72s/it]
{'loss': 4.1222, 'learning_rate': 5.520000000000001e-05, 'epoch': 0.47}
47%|██████████████████████████████████████ | 279/594 [37:54<45:12, 8.61s/it]
{'loss': 4.3026, 'learning_rate': 5.5400000000000005e-05, 'epoch': 0.47}
47%|██████████████████████████████████████▏ | 280/594 [38:02<44:38, 8.53s/it]
{'loss': 4.3612, 'learning_rate': 5.560000000000001e-05, 'epoch': 0.47}
47%|██████████████████████████████████████▎ | 281/594 [38:11<44:01, 8.44s/it]
{'loss': 4.3457, 'learning_rate': 5.580000000000001e-05, 'epoch': 0.47}
47%|██████████████████████████████████████▍ | 282/594 [38:19<43:28, 8.36s/it]
48%|██████████████████████████████████████▌ | 283/594 [38:27<42:48, 8.26s/it]
48%|██████████████████████████████████████▋ | 284/594 [38:35<42:08, 8.16s/it]
48%|██████████████████████████████████████▊ | 285/594 [38:42<41:22, 8.03s/it]
48%|███████████████████████████████████████ | 286/594 [38:50<40:46, 7.94s/it]
48%|███████████████████████████████████████▏ | 287/594 [38:58<40:00, 7.82s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:55:27,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|███████████████████████████████████████▎ | 288/594 [39:05<39:00, 7.65s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:55:37,506 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2049, 'learning_rate': 5.74e-05, 'epoch': 0.49}
49%|███████████████████████████████████████▌ | 290/594 [39:19<36:36, 7.23s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:55:47,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▋ | 291/594 [39:25<35:21, 7.00s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:55:53,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▊ | 292/594 [39:31<33:41, 6.69s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:55:59,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|███████████████████████████████████████▉ | 293/594 [39:37<32:01, 6.38s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:03,797 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:07,623 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:10,123 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
50%|████████████████████████████████████████▏ | 295/594 [39:47<28:13, 5.66s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:13,622 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:15,700 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
50%|████████████████████████████████████████▎ | 296/594 [39:51<26:05, 5.25s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:17,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:19,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
50%|████████████████████████████████████████▌ | 297/594 [39:55<23:51, 4.82s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:21,468 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
50%|████████████████████████████████████████▋ | 298/594 [39:58<21:32, 4.37s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:24,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:25,973 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:27,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:28,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
51%|████████████████████████████████████████▉ | 300/594 [40:04<17:32, 3.58s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 02:56:37,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
51%|█████████████████████████████████████████ | 301/594 [40:14<27:31, 5.64s/it]
{'loss': 4.3368, 'learning_rate': 5.9800000000000003e-05, 'epoch': 0.51}
51%|█████████████████████████████████████████▏ | 302/594 [40:24<33:51, 6.96s/it]
51%|█████████████████████████████████████████▎ | 303/594 [40:34<38:06, 7.86s/it]
51%|█████████████████████████████████████████▍ | 304/594 [40:44<40:50, 8.45s/it]
51%|█████████████████████████████████████████▌ | 305/594 [40:54<42:41, 8.86s/it]
{'loss': 4.3066, 'learning_rate': 6.06e-05, 'epoch': 0.51}
52%|█████████████████████████████████████████▋ | 306/594 [41:04<43:55, 9.15s/it]
{'loss': 4.2898, 'learning_rate': 6.08e-05, 'epoch': 0.51}
52%|█████████████████████████████████████████▊ | 307/594 [41:13<44:35, 9.32s/it]
52%|██████████████████████████████████████████ | 308/594 [41:23<44:59, 9.44s/it]
{'loss': 4.2885, 'learning_rate': 6.12e-05, 'epoch': 0.52}
52%|██████████████████████████████████████████▏ | 309/594 [41:33<45:01, 9.48s/it]
{'loss': 4.2159, 'learning_rate': 6.14e-05, 'epoch': 0.52}
52%|██████████████████████████████████████████▎ | 310/594 [41:42<44:56, 9.50s/it]
{'loss': 4.2852, 'learning_rate': 6.16e-05, 'epoch': 0.52}
52%|██████████████████████████████████████████▍ | 311/594 [41:52<44:39, 9.47s/it]
53%|██████████████████████████████████████████▌ | 312/594 [42:01<44:20, 9.44s/it]
53%|██████████████████████████████████████████▋ | 313/594 [42:10<44:04, 9.41s/it]
{'loss': 4.196, 'learning_rate': 6.220000000000001e-05, 'epoch': 0.53}
53%|██████████████████████████████████████████▊ | 314/594 [42:20<43:43, 9.37s/it]
53%|██████████████████████████████████████████▉ | 315/594 [42:29<43:20, 9.32s/it]
53%|███████████████████████████████████████████ | 316/594 [42:38<42:53, 9.26s/it]
53%|███████████████████████████████████████████▏ | 317/594 [42:47<42:31, 9.21s/it]
{'loss': 4.1273, 'learning_rate': 6.3e-05, 'epoch': 0.53}
53%|███████████████████████████████████████████▏ | 317/594 
[42:47<42:31, 9.21s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▎ | 318/594 [42:56<42:03, 9.14s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▎ | 318/594 [42:56<42:03, 9.14s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2596, 'learning_rate': 6.32e-05, 'epoch': 0.53} 54%|███████████████████████████████████████████▎ | 318/594 [42:56<42:03, 9.14s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▎ | 318/594 [42:56<42:03, 9.14s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▎ | 318/594 [42:56<42:03, 9.14s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 319/594 [43:05<41:40, 9.09s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 319/594 [43:05<41:40, 9.09s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 319/594 
[43:05<41:40, 9.09s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 319/594 [43:05<41:40, 9.09s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▋ | 320/594 [43:14<41:16, 9.04s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▋ | 320/594 [43:14<41:16, 9.04s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2921, 'learning_rate': 6.36e-05, 'epoch': 0.54} 54%|███████████████████████████████████████████▋ | 320/594 [43:14<41:16, 9.04s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▋ | 320/594 [43:14<41:16, 9.04s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 321/594 [43:23<40:57, 9.00s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 321/594 [43:23<40:57, 9.00s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0708, 'learning_rate': 6.38e-05, 'epoch': 0.54} 
54%|███████████████████████████████████████████▊ | 321/594 [43:23<40:57, 9.00s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 321/594 [43:23<40:57, 9.00s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 321/594 [43:23<40:57, 9.00s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 322/594 [43:32<40:35, 8.96s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 322/594 [43:32<40:35, 8.96s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 322/594 [43:32<40:35, 8.96s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 322/594 [43:32<40:35, 8.96s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 323/594 [43:40<40:05, 8.88s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 323/594 
[43:40<40:05, 8.88s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.274, 'learning_rate': 6.42e-05, 'epoch': 0.54} 54%|████████████████████████████████████████████ | 323/594 [43:40<40:05, 8.88s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 323/594 [43:40<40:05, 8.88s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 324/594 [43:49<39:42, 8.83s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 324/594 [43:49<39:42, 8.83s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2403, 'learning_rate': 6.440000000000001e-05, 'epoch': 0.54} 55%|████████████████████████████████████████████▏ | 324/594 [43:49<39:42, 8.83s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▏ | 324/594 [43:49<39:42, 8.83s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▎ | 325/594 [43:58<39:58, 8.92s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 55%|████████████████████████████████████████████▎ | 325/594 [43:58<39:58, 8.92s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1817, 'learning_rate': 6.460000000000001e-05, 'epoch': 0.55} 55%|████████████████████████████████████████████▎ | 325/594 [43:58<39:58, 8.92s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▎ | 325/594 [43:58<39:58, 8.92s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 326/594 [44:07<39:20, 8.81s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 326/594 [44:07<39:20, 8.81s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1604, 'learning_rate': 6.48e-05, 'epoch': 0.55} 55%|████████████████████████████████████████████▍ | 326/594 [44:07<39:20, 8.81s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▍ | 326/594 [44:07<39:20, 8.81s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 327/594 [44:15<38:37, 8.68s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 327/594 [44:15<38:37, 8.68s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.413, 'learning_rate': 6.500000000000001e-05, 'epoch': 0.55} 55%|████████████████████████████████████████████▌ | 327/594 [44:15<38:37, 8.68s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 327/594 [44:15<38:37, 8.68s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 328/594 [44:24<38:05, 8.59s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 328/594 [44:24<38:05, 8.59s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1049, 'learning_rate': 6.52e-05, 'epoch': 0.55} 55%|████████████████████████████████████████████▋ | 328/594 [44:24<38:05, 8.59s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 328/594 [44:24<38:05, 8.59s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 328/594 [44:24<38:05, 
8.59s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 329/594 [44:32<37:37, 8.52s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 329/594 [44:32<37:37, 8.52s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 329/594 [44:32<37:37, 8.52s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 329/594 [44:32<37:37, 8.52s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 329/594 [44:32<37:37, 8.52s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 330/594 [44:40<37:12, 8.46s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 330/594 [44:40<37:12, 8.46s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 330/594 [44:40<37:12, 8.46s/it]g-point operations will not be computed-02 
02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 330/594 [44:40<37:12, 8.46s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 330/594 [44:40<37:12, 8.46s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 331/594 [44:48<36:34, 8.34s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 331/594 [44:48<36:34, 8.34s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 331/594 [44:48<36:34, 8.34s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 331/594 [44:48<36:34, 8.34s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▏ | 331/594 [44:48<36:34, 8.34s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 332/594 [44:56<36:02, 8.25s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 332/594 [44:56<36:02, 8.25s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 332/594 [44:56<36:02, 8.25s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 332/594 [44:56<36:02, 8.25s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 332/594 [44:56<36:02, 8.25s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▍ | 333/594 [45:04<35:26, 8.15s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▍ | 333/594 [45:04<35:26, 8.15s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▍ | 333/594 [45:04<35:26, 8.15s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▍ | 333/594 [45:04<35:26, 8.15s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 56%|█████████████████████████████████████████████▍ | 333/594 [45:04<35:26, 8.15s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 334/594 [45:12<34:47, 8.03s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 334/594 [45:12<34:47, 8.03s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 334/594 [45:12<34:47, 8.03s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 334/594 [45:12<34:47, 8.03s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 334/594 [45:12<34:47, 8.03s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 335/594 [45:20<34:13, 7.93s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▋ | 335/594 [45:20<34:13, 7.93s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
56%|█████████████████████████████████████████████▋ | 335/594 [45:20<34:13, 7.93s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:01:52,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:01:52,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1627, 'learning_rate': 6.680000000000001e-05, 'epoch': 0.56} [WARNING|modeling_utils.py:388] 2022-03-02 03:01:52,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:01:52,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:01:52,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 337/594 [45:35<32:58, 7.70s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
57%|█████████████████████████████████████████████▉ | 337/594 [45:35<32:58, 7.70s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 337/594 [45:35<32:58, 7.70s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 337/594 [45:35<32:58, 7.70s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▉ | 337/594 [45:35<32:58, 7.70s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████ | 338/594 [45:42<32:13, 7.55s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:11,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:11,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:11,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▏ | 339/594 [45:49<31:13, 7.35s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▏ | 339/594 [45:49<31:13, 7.35s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:19,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▎ | 340/594 [45:55<29:51, 7.05s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▎ | 340/594 [45:55<29:51, 7.05s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2993, 'learning_rate': 6.76e-05, 'epoch': 0.57} [WARNING|modeling_utils.py:388] 2022-03-02 03:02:25,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:25,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▌ | 
341/594 [46:01<28:29, 6.76s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▌ | 341/594 [46:01<28:29, 6.76s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:31,131 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:31,131 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 342/594 [46:07<27:04, 6.44s/it]g-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:35,210 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:37,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:02:37,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 02:56:32,292 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.4184, 'learning_rate': 6.82e-05, 'epoch': 0.58}
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:41,215 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
58%|██████████████████████████████████████████████▉ | 344/594 [46:17<23:27, 5.63s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:43,460 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:45,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
58%|███████████████████████████████████████████████ | 345/594 [46:21<21:30, 5.18s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:47,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:49,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
58%|███████████████████████████████████████████████▏ | 346/594 [46:25<19:41, 4.77s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:51,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
58%|███████████████████████████████████████████████▎ | 347/594 [46:28<17:58, 4.37s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:55,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
59%|███████████████████████████████████████████████▍ | 348/594 [46:31<16:15, 3.97s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:02:57,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
59%|███████████████████████████████████████████████▌ | 349/594 [46:34<14:34, 3.57s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:03:00,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:03:01,101 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
59%|███████████████████████████████████████████████▋ | 350/594 [46:37<13:37, 3.35s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:03:04,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:03:10,013 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
59%|███████████████████████████████████████████████▊ | 351/594 [46:47<22:03, 5.44s/it]
59%|████████████████████████████████████████████████ | 352/594 [46:57<27:32, 6.83s/it]
{'loss': 4.2327, 'learning_rate': 7e-05, 'epoch': 0.59}
59%|████████████████████████████████████████████████▏ | 353/594 [47:07<31:09, 7.76s/it]
{'loss': 4.3054, 'learning_rate': 7.02e-05, 'epoch': 0.59}
60%|████████████████████████████████████████████████▎ | 354/594 [47:17<33:27, 8.37s/it]
{'loss': 4.1521, 'learning_rate': 7.04e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▍ | 355/594 [47:26<34:49, 8.74s/it]
{'loss': 4.2491, 'learning_rate': 7.06e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▌ | 356/594 [47:36<35:53, 9.05s/it]
{'loss': 4.2289, 'learning_rate': 7.1e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▊ | 358/594 [47:55<36:56, 9.39s/it]
{'loss': 4.2476, 'learning_rate': 7.12e-05, 'epoch': 0.6}
60%|████████████████████████████████████████████████▉ | 359/594 [48:05<37:00, 9.45s/it]
{'loss': 4.1769, 'learning_rate': 7.14e-05, 'epoch': 0.6}
61%|█████████████████████████████████████████████████ | 360/594 [48:15<36:59, 9.49s/it]
{'loss': 4.226, 'learning_rate': 7.16e-05, 'epoch': 0.61}
61%|█████████████████████████████████████████████████▏ | 361/594 [48:24<36:38, 9.43s/it]
61%|█████████████████████████████████████████████████▎ | 362/594 [48:33<36:23, 9.41s/it]
{'loss': 4.1832, 'learning_rate': 7.2e-05, 'epoch': 0.61}
61%|█████████████████████████████████████████████████▌ | 363/594 [48:43<36:09, 9.39s/it]
61%|█████████████████████████████████████████████████▋ | 364/594 [48:52<35:58, 9.39s/it]
{'loss': 4.1442, 'learning_rate': 7.24e-05, 'epoch': 0.61}
61%|█████████████████████████████████████████████████▊ | 365/594 [49:01<35:35, 9.32s/it]
{'loss': 4.2575, 'learning_rate': 7.26e-05, 'epoch': 0.61}
62%|█████████████████████████████████████████████████▉ | 366/594 [49:10<35:15, 9.28s/it]
{'loss': 4.1256, 'learning_rate': 7.280000000000001e-05, 'epoch': 0.62}
{'loss': 4.1973, 'learning_rate': 7.3e-05, 'epoch': 0.62}
62%|██████████████████████████████████████████████████▏ | 368/594 [49:29<34:37, 9.19s/it]
{'loss': 4.1011, 'learning_rate': 7.32e-05, 'epoch': 0.62}
62%|██████████████████████████████████████████████████▎ | 369/594 [49:38<34:10, 9.11s/it]
{'loss': 4.2649, 'learning_rate': 7.340000000000001e-05, 'epoch': 0.62}
62%|██████████████████████████████████████████████████▍ | 370/594 [49:46<33:48, 9.06s/it]
{'loss': 4.2652, 'learning_rate': 7.36e-05, 'epoch': 0.62}
62%|██████████████████████████████████████████████████▌ | 371/594 [49:55<33:33, 9.03s/it]
63%|██████████████████████████████████████████████████▋ | 372/594 [50:04<33:22, 9.02s/it]
{'loss': 4.1874, 'learning_rate': 7.4e-05, 'epoch': 0.63}
63%|██████████████████████████████████████████████████▊ | 373/594 [50:13<33:04, 8.98s/it]
{'loss': 4.0624, 'learning_rate': 7.42e-05, 'epoch': 0.63}
63%|███████████████████████████████████████████████████ | 374/594 [50:22<32:49, 8.95s/it]
63%|███████████████████████████████████████████████████▏ | 375/594 [50:32<33:05, 9.07s/it]
{'loss': 4.2729, 'learning_rate': 7.46e-05, 'epoch': 0.63}
63%|███████████████████████████████████████████████████▎ | 376/594 [50:40<32:29, 8.94s/it]
{'loss': 4.3062, 'learning_rate': 7.48e-05, 'epoch': 0.63}
63%|███████████████████████████████████████████████████▍ | 377/594 [50:49<32:02, 8.86s/it]
{'loss': 4.2091, 'learning_rate': 7.500000000000001e-05, 'epoch': 0.63}
{'loss': 4.2752, 'learning_rate': 7.52e-05, 'epoch': 0.64}
64%|███████████████████████████████████████████████████▋ | 379/594 [51:06<31:11, 8.71s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
64%|███████████████████████████████████████████████████▋ | 
379/594 [51:06<31:11, 8.71s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▊ | 380/594 [51:14<30:41, 8.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▊ | 380/594 [51:14<30:41, 8.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3214, 'learning_rate': 7.560000000000001e-05, 'epoch': 0.64} 64%|███████████████████████████████████████████████████▊ | 380/594 [51:14<30:41, 8.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▊ | 380/594 [51:14<30:41, 8.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▉ | 381/594 [51:23<30:12, 8.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▉ | 381/594 [51:23<30:12, 8.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3059, 'learning_rate': 7.58e-05, 'epoch': 0.64} 64%|███████████████████████████████████████████████████▉ | 381/594 [51:23<30:12, 8.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▉ | 381/594 [51:23<30:12, 8.51s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 382/594 [51:31<29:46, 8.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 382/594 [51:31<29:46, 8.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1338, 'learning_rate': 7.6e-05, 'epoch': 0.64} 64%|████████████████████████████████████████████████████ | 382/594 [51:31<29:46, 8.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 382/594 [51:31<29:46, 8.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████▏ | 383/594 [51:39<29:23, 8.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████▏ | 383/594 [51:39<29:23, 8.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1773, 'learning_rate': 7.620000000000001e-05, 'epoch': 0.64} 
64%|████████████████████████████████████████████████████▏ | 383/594 [51:39<29:23, 8.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████▏ | 383/594 [51:39<29:23, 8.36s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 384/594 [51:47<28:50, 8.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 384/594 [51:47<28:50, 8.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2126, 'learning_rate': 7.64e-05, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▎ | 384/594 [51:47<28:50, 8.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 384/594 [51:47<28:50, 8.24s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 385/594 [51:55<28:19, 8.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 385/594 [51:55<28:19, 8.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed {'loss': 4.4449, 'learning_rate': 7.66e-05, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▌ | 385/594 [51:55<28:19, 8.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 385/594 [51:55<28:19, 8.13s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 386/594 [52:03<27:40, 7.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 386/594 [52:03<27:40, 7.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2583, 'learning_rate': 7.680000000000001e-05, 'epoch': 0.65} 65%|████████████████████████████████████████████████████▋ | 386/594 [52:03<27:40, 7.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▋ | 386/594 [52:03<27:40, 7.98s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 387/594 [52:10<27:02, 7.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
65%|████████████████████████████████████████████████████▊ | 387/594 [52:10<27:02, 7.84s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:08:39,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:08:39,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:08:39,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 388/594 [52:17<26:24, 7.69s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 388/594 [52:17<26:24, 7.69s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 388/594 [52:17<26:24, 7.69s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 388/594 [52:17<26:24, 7.69s/it]g-point operations will not be computed-02 03:07:33,789 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 388/594 [52:17<26:24, 7.69s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|█████████████████████████████████████████████████████ | 389/594 [52:24<25:39, 7.51s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:08:53,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:08:53,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:08:53,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 390/594 [52:31<24:51, 7.31s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 390/594 [52:31<24:51, 7.31s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ 
| 390/594 [52:31<24:51, 7.31s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▏ | 390/594 [52:31<24:51, 7.31s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:03,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:03,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:03,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:03,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:09,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:09,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:13,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:13,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▌ | 393/594 [52:49<21:21, 6.38s/it]g-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:17,611 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:17,611 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:19,983 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:19,983 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:23,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:23,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:07:33,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 395/594 [52:59<18:26, 5.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:25,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:27,694 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:25,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:27,694 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:25,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 396/594 [53:03<16:52, 5.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:29,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▏ | 397/594 [53:07<15:24, 4.69s/it]g-point operations will not be computed-02 03:09:29,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▏ | 
397/594 [53:07<15:24, 4.69s/it]g-point operations will not be computed-02 03:09:29,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▏ | 397/594 [53:07<15:24, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:33,297 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▏ | 397/594 [53:07<15:24, 4.69s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:33,297 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▎ | 398/594 [53:10<13:58, 4.28s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:36,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 03:09:36,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-02 03:09:36,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:40,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:39,245 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:40,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:39,245 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 400/594 [53:16<11:38, 3.60s/it]g-point operations will not be computed-02 03:09:39,245 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 400/594 [53:16<11:38, 3.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 400/594 [53:16<11:38, 3.60s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:49,447 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:09:49,447 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 401/594 [53:26<18:13, 5.67s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 401/594 [53:26<18:13, 5.67s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 401/594 [53:26<18:13, 5.67s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 401/594 [53:26<18:13, 
5.67s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▋ | 401/594 [53:26<18:13, 5.67s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 402/594 [53:36<22:23, 7.00s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 402/594 [53:36<22:23, 7.00s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 402/594 [53:36<22:23, 7.00s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 402/594 [53:36<22:23, 7.00s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 402/594 [53:36<22:23, 7.00s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▉ | 403/594 [53:46<25:07, 7.90s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▉ 
| 403/594 [53:46<25:07, 7.90s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▉ | 403/594 [53:46<25:07, 7.90s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▉ | 403/594 [53:46<25:07, 7.90s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████ | 404/594 [53:56<26:53, 8.49s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████ | 404/594 [53:56<26:53, 8.49s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2763, 'learning_rate': 8.04e-05, 'epoch': 0.68} 68%|███████████████████████████████████████████████████████ | 404/594 [53:56<26:53, 8.49s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████ | 404/594 [53:56<26:53, 8.49s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▏ | 405/594 [54:06<27:59, 8.89s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 68%|███████████████████████████████████████████████████████▏ | 405/594 [54:06<27:59, 8.89s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2155, 'learning_rate': 8.060000000000001e-05, 'epoch': 0.68} 68%|███████████████████████████████████████████████████████▏ | 405/594 [54:06<27:59, 8.89s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▏ | 405/594 [54:06<27:59, 8.89s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▎ | 406/594 [54:16<28:41, 9.16s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▎ | 406/594 [54:16<28:41, 9.16s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2131, 'learning_rate': 8.080000000000001e-05, 'epoch': 0.68} 68%|███████████████████████████████████████████████████████▎ | 406/594 [54:16<28:41, 9.16s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▎ | 406/594 [54:16<28:41, 9.16s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
68%|███████████████████████████████████████████████████████▎ | 406/594 [54:16<28:41, 9.16s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 407/594 [54:26<29:06, 9.34s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 407/594 [54:26<29:06, 9.34s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 407/594 [54:26<29:06, 9.34s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 407/594 [54:26<29:06, 9.34s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 408/594 [54:35<29:16, 9.44s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▋ | 408/594 [54:35<29:16, 9.44s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0873, 'learning_rate': 8.120000000000001e-05, 'epoch': 0.69} 69%|███████████████████████████████████████████████████████▋ | 408/594 [54:35<29:16, 9.44s/it]g-point operations will not be computed-02 03:09:44,312 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|███████████████████████████████████████████████████████▋ | 408/594 [54:35<29:16, 9.44s/it]
69%|███████████████████████████████████████████████████████▊ | 409/594 [54:45<29:16, 9.49s/it]
{'loss': 4.1883, 'learning_rate': 8.14e-05, 'epoch': 0.69}
69%|███████████████████████████████████████████████████████▉ | 410/594 [54:55<29:13, 9.53s/it]
{'loss': 4.1759, 'learning_rate': 8.16e-05, 'epoch': 0.69}
69%|████████████████████████████████████████████████████████ | 411/594 [55:04<28:57, 9.50s/it]
{'loss': 4.1658, 'learning_rate': 8.18e-05, 'epoch': 0.69}
69%|████████████████████████████████████████████████████████▏ | 412/594 [55:14<28:53, 9.52s/it]
70%|████████████████████████████████████████████████████████▎ | 413/594 [55:23<28:36, 9.48s/it]
70%|████████████████████████████████████████████████████████▍ | 414/594 [55:32<28:13, 9.41s/it]
70%|████████████████████████████████████████████████████████▌ | 415/594 [55:41<27:54, 9.35s/it]
{'loss': 4.235, 'learning_rate': 8.26e-05, 'epoch': 0.7}
70%|████████████████████████████████████████████████████████▋ | 416/594 [55:51<27:35, 9.30s/it]
70%|████████████████████████████████████████████████████████▊ | 417/594 [56:00<27:14, 9.24s/it]
{'loss': 4.2348, 'learning_rate': 8.3e-05, 'epoch': 0.7}
70%|█████████████████████████████████████████████████████████ | 418/594 [56:09<26:55, 9.18s/it]
71%|█████████████████████████████████████████████████████████▏ | 419/594 [56:18<26:36, 9.12s/it]
{'loss': 4.3757, 'learning_rate': 8.34e-05, 'epoch': 0.7}
71%|█████████████████████████████████████████████████████████▎ | 420/594 [56:27<26:19, 9.08s/it]
{'loss': 4.2326, 'learning_rate': 8.36e-05, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▍ | 421/594 [56:36<26:00, 9.02s/it]
{'loss': 3.9055, 'learning_rate': 8.38e-05, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▌ | 422/594 [56:44<25:43, 8.98s/it]
71%|█████████████████████████████████████████████████████████▋ | 423/594 [56:53<25:23, 8.91s/it]
71%|█████████████████████████████████████████████████████████▊ | 424/594 [57:02<25:07, 8.87s/it]
{'loss': 4.2111, 'learning_rate': 8.44e-05, 'epoch': 0.71}
{'loss': 4.2131, 'learning_rate': 8.46e-05, 'epoch': 0.71}
72%|██████████████████████████████████████████████████████████ | 426/594 [57:20<24:48, 8.86s/it]
72%|██████████████████████████████████████████████████████████▏ | 427/594 [57:28<24:22, 8.76s/it]
72%|██████████████████████████████████████████████████████████▎ | 428/594 [57:37<23:57, 8.66s/it]
72%|██████████████████████████████████████████████████████████▌ | 429/594 [57:45<23:33, 8.57s/it]
72%|██████████████████████████████████████████████████████████▋ | 430/594 [57:54<23:17, 8.52s/it]
{'loss': 4.2203, 'learning_rate': 8.560000000000001e-05, 'epoch': 0.72}
73%|██████████████████████████████████████████████████████████▊ | 431/594 [58:02<22:55, 8.44s/it]
{'loss': 4.3555, 'learning_rate': 8.58e-05, 'epoch': 0.72}
73%|██████████████████████████████████████████████████████████▉ | 432/594 [58:10<22:35, 8.37s/it]
{'loss': 4.0412, 'learning_rate': 8.6e-05, 'epoch': 0.73}
73%|███████████████████████████████████████████████████████████ | 433/594 [58:18<22:15, 8.29s/it]
{'loss': 4.2636, 'learning_rate': 8.620000000000001e-05, 'epoch': 0.73}
73%|███████████████████████████████████████████████████████████▏ | 434/594 [58:26<21:53, 8.21s/it]
73%|███████████████████████████████████████████████████████████▎ | 435/594 [58:34<21:23, 8.07s/it]
73%|███████████████████████████████████████████████████████████▍ | 436/594 [58:42<20:58, 7.97s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:15:14,757 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1347, 'learning_rate': 8.7e-05, 'epoch': 0.73}
74%|███████████████████████████████████████████████████████████▋ | 438/594 [58:57<20:01, 7.70s/it]
74%|███████████████████████████████████████████████████████████▊ | 439/594 [59:04<19:26, 7.53s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:15:32,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
74%|████████████████████████████████████████████████████████████ | 440/594 [59:11<18:49, 7.34s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:15:41,214 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
74%|████████████████████████████████████████████████████████████▏ | 441/594 [59:17<18:08, 7.12s/it]
{'loss': 4.3363, 'learning_rate': 8.78e-05, 'epoch': 0.74}
[WARNING|modeling_utils.py:388] 2022-03-02 03:15:47,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
74%|████████████████████████████████████████████████████████████▎ | 442/594 [59:23<17:18, 6.83s/it]
{'loss': 4.6421, 'learning_rate': 8.800000000000001e-05, 'epoch': 0.74}
[WARNING|modeling_utils.py:388] 2022-03-02 03:15:53,224 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed [WARNING|modeling_utils.py:388] 2022-03-02 03:15:53,224 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▍ | 443/594 [59:29<16:14, 6.45s/it]g-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:15:57,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:15:59,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:15:59,599 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3105, 'learning_rate': 8.840000000000001e-05, 'epoch': 0.75} [WARNING|modeling_utils.py:388] 2022-03-02 03:16:03,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:16:03,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:09:44,312 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed
75%|████████████████████████████████████████████████████████████▋ | 445/594 [59:39<13:57, 5.62s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:08,469 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:10,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:12,302 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:14,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:17,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:18,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:21,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2745, 'learning_rate': 8.960000000000001e-05, 'epoch': 0.76}
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:26,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2774, 'learning_rate': 8.98e-05, 'epoch': 0.76}
[WARNING|modeling_utils.py:388] 2022-03-02 03:16:31,843 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed
{'loss': 4.2983, 'learning_rate': 9e-05, 'epoch': 0.76}
76%|████████████████████████████████████████████████████████████▏ | 453/594 [1:00:26<18:33, 7.90s/it]
{'loss': 4.1945, 'learning_rate': 9.020000000000001e-05, 'epoch': 0.76}
76%|████████████████████████████████████████████████████████████▍ | 454/594 [1:00:36<19:46, 8.48s/it]
{'loss': 4.182, 'learning_rate': 9.04e-05, 'epoch': 0.76}
{'loss': 4.3015, 'learning_rate': 9.06e-05, 'epoch': 0.77}
{'loss': 4.216, 'learning_rate': 9.080000000000001e-05, 'epoch': 0.77}
77%|████████████████████████████████████████████████████████████▊ | 457/594 [1:01:05<21:18, 9.33s/it]
{'loss': 4.4956, 'learning_rate': 9.1e-05, 'epoch': 0.77}
77%|████████████████████████████████████████████████████████████▉ | 458/594 [1:01:15<21:23, 9.44s/it]
77%|█████████████████████████████████████████████████████████████ | 459/594 [1:01:25<21:23, 9.51s/it]
{'loss': 4.3564, 'learning_rate': 9.140000000000001e-05, 'epoch': 0.77}
77%|█████████████████████████████████████████████████████████████▏ | 460/594 [1:01:34<21:20, 9.56s/it]
78%|█████████████████████████████████████████████████████████████▎ | 461/594 [1:01:44<21:09, 9.54s/it]
78%|█████████████████████████████████████████████████████████████▍ | 462/594 [1:01:53<20:54, 9.50s/it]
{'loss': 4.1673, 'learning_rate': 9.200000000000001e-05,
'epoch': 0.78}
78%|█████████████████████████████████████████████████████████████▌ | 463/594 [1:02:03<20:41, 9.48s/it]
78%|█████████████████████████████████████████████████████████████▋ | 464/594 [1:02:12<20:26, 9.43s/it]
{'loss': 4.3931, 'learning_rate': 9.240000000000001e-05, 'epoch': 0.78}
78%|█████████████████████████████████████████████████████████████▊ | 465/594 [1:02:21<20:09, 9.38s/it]
{'loss': 4.3984, 'learning_rate': 9.260000000000001e-05, 'epoch': 0.78}
78%|█████████████████████████████████████████████████████████████▉ | 466/594 [1:02:31<19:56, 9.35s/it]
{'loss': 4.2236, 'learning_rate': 9.28e-05, 'epoch': 0.78}
79%|██████████████████████████████████████████████████████████████ | 467/594 [1:02:40<19:37, 9.27s/it]
79%|██████████████████████████████████████████████████████████████▏ | 468/594 [1:02:49<19:26, 9.25s/it]
{'loss': 4.188, 'learning_rate': 9.320000000000002e-05, 'epoch': 0.79}
79%|██████████████████████████████████████████████████████████████▍ | 469/594 [1:02:58<19:12, 9.22s/it]
{'loss': 4.2623, 'learning_rate': 9.340000000000001e-05, 'epoch': 0.79}
79%|██████████████████████████████████████████████████████████████▌ | 470/594 [1:03:07<18:57, 9.18s/it]
{'loss': 4.1637, 'learning_rate': 9.360000000000001e-05, 'epoch': 0.79}
79%|██████████████████████████████████████████████████████████████▋ | 471/594 [1:03:16<18:41, 9.12s/it]
79%|██████████████████████████████████████████████████████████████▊ | 472/594 [1:03:25<18:25, 9.06s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:19:57,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|██████████████████████████████████████████████████████████████▉ | 473/594 [1:03:34<18:08, 9.00s/it]
80%|███████████████████████████████████████████████████████████████ | 474/594 
[1:03:43<17:53, 8.94s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.111, 'learning_rate': 9.44e-05, 'epoch': 0.8} 80%|███████████████████████████████████████████████████████████████ | 474/594 [1:03:43<17:53, 8.94s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████ | 474/594 [1:03:43<17:53, 8.94s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████ | 474/594 [1:03:43<17:53, 8.94s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▏ | 475/594 [1:03:52<17:59, 9.07s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▏ | 475/594 [1:03:52<17:59, 9.07s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▏ | 475/594 [1:03:52<17:59, 9.07s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▏ | 475/594 [1:03:52<17:59, 9.07s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▏ | 475/594 [1:03:52<17:59, 9.07s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 476/594 [1:04:01<17:38, 8.97s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 476/594 [1:04:01<17:38, 8.97s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 476/594 [1:04:01<17:38, 8.97s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▎ | 476/594 [1:04:01<17:38, 8.97s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▍ | 477/594 [1:04:09<17:14, 8.84s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|███████████████████████████████████████████████████████████████▍ | 477/594 [1:04:09<17:14, 8.84s/it]g-point operations will not be computed-02 03:16:05,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.237, 'learning_rate': 9.5e-05, 'epoch': 0.8} 
80%|███████████████████████████████████████████████████████████████▌ | 478/594 [1:04:18<16:53, 8.74s/it]
{'loss': 4.2458, 'learning_rate': 9.52e-05, 'epoch': 0.8}
81%|███████████████████████████████████████████████████████████████▋ | 479/594 [1:04:27<16:38, 8.69s/it]
{'loss': 4.1149, 'learning_rate': 9.54e-05, 'epoch': 0.81}
81%|███████████████████████████████████████████████████████████████▊ | 480/594 [1:04:35<16:21, 8.61s/it]
{'loss': 4.067, 'learning_rate': 9.56e-05, 'epoch': 0.81}
81%|███████████████████████████████████████████████████████████████▉ | 481/594 [1:04:43<16:00, 8.50s/it]
{'loss': 4.4475, 'learning_rate': 9.58e-05, 'epoch': 0.81}
81%|████████████████████████████████████████████████████████████████ | 482/594 [1:04:51<15:41, 8.41s/it]
81%|████████████████████████████████████████████████████████████████▏ | 483/594 [1:05:00<15:26, 8.35s/it]
81%|████████████████████████████████████████████████████████████████▎ | 484/594 [1:05:07<15:03, 8.21s/it]
82%|████████████████████████████████████████████████████████████████▌ | 485/594 [1:05:15<14:45, 8.12s/it]
82%|████████████████████████████████████████████████████████████████▋ | 486/594 [1:05:23<14:23, 8.00s/it]
82%|████████████████████████████████████████████████████████████████▊ | 487/594 [1:05:31<14:02, 7.87s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:00,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
82%|████████████████████████████████████████████████████████████████▉ | 488/594 [1:05:38<13:35, 7.70s/it]
82%|█████████████████████████████████████████████████████████████████ | 489/594 [1:05:45<13:07, 7.50s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:14,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
82%|█████████████████████████████████████████████████████████████████▏ | 490/594 [1:05:52<12:41, 7.32s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:22,538 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▎ | 491/594 [1:05:58<12:10, 7.09s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:28,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▍ | 492/594 [1:06:05<11:33, 6.80s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:34,665 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▌ | 493/594 [1:06:10<10:56, 6.50s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:38,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▋ | 494/594 [1:06:16<10:16, 6.16s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:45,090 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▊ | 495/594 [1:06:21<09:30, 5.76s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:48,564 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:50,601 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:52,640 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:54,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:56,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:22:59,316 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:23:00,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:23:01,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|trainer.py:2369] 2022-03-02 03:23:03,714 >> Batch size = 12
0%| | 0/221 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|▊ | 2/221 [00:02<04:34, 1.25s/it]
1%|█▏ | 3/221 [00:05<07:12, 1.99s/it]
2%|█▉ | 5/221 [00:11<09:31, 2.64s/it]
3%|██▎ | 6/221 [00:15<10:16, 2.87s/it]
3%|██▋ | 7/221 [00:18<11:15, 3.16s/it]
4%|███ | 8/221 [00:21<11:00, 3.10s/it]
4%|███▍ | 9/221 [00:24<10:54, 3.09s/it]
5%|████ | 11/221 [00:33<12:46, 3.65s/it]
5%|████▍ | 12/221 [00:36<11:56, 3.43s/it]
6%|████▊ | 13/221 [00:39<11:32, 3.33s/it]
6%|█████▏ | 14/221 [00:42<11:33, 3.35s/it]
7%|█████▌ | 15/221 [00:47<13:00, 3.79s/it]
7%|█████▉ | 16/221 [00:52<13:54, 4.07s/it]
8%|██████▎ | 17/221 [00:55<13:12, 3.89s/it]
8%|██████▋ | 18/221 [00:59<12:59, 3.84s/it]
9%|███████ | 19/221 [01:02<12:17, 3.65s/it]
9%|███████▍ | 20/221 [01:05<11:41, 3.49s/it]
10%|███████▊ | 21/221 [01:08<10:58, 3.29s/it]
10%|████████▏ | 22/221 [01:11<10:44, 3.24s/it]
10%|████████▌ | 23/221 [01:14<10:30, 3.19s/it]
11%|████████▉ | 24/221 [01:18<11:00, 3.35s/it]
11%|█████████▎ | 25/221 [01:22<11:37, 3.56s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 12%|█████████▋ | 26/221 [01:26<11:46, 3.63s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|██████████ | 27/221 [01:28<10:46, 3.33s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 28/221 [01:32<11:20, 3.52s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▊ | 29/221 [01:37<12:03, 3.77s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 30/221 [01:40<11:17, 3.55s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▌ | 31/221 [01:42<10:17, 3.25s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▊ | 32/221 [01:46<10:16, 3.26s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▏ | 33/221 [01:49<10:43, 3.42s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▌ | 34/221 [01:53<10:49, 3.47s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 35/221 [01:56<10:21, 3.34s/it]g-point operations will 
not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▎ | 36/221 [01:59<10:12, 3.31s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 37/221 [02:03<11:02, 3.60s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 38/221 [02:06<10:25, 3.42s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▍ | 39/221 [02:10<10:42, 3.53s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▊ | 40/221 [02:13<10:08, 3.36s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▏ | 41/221 [02:17<10:21, 3.45s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▌ | 42/221 [02:21<11:17, 3.78s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▉ | 43/221 [02:25<10:48, 3.65s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████▎ | 44/221 [02:30<11:47, 4.00s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 20%|████████████████▋ | 45/221 [02:34<12:18, 4.19s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|█████████████████ | 46/221 [02:38<12:09, 4.17s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|█████████████████▍ | 47/221 [02:42<11:52, 4.10s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▊ | 48/221 [02:46<11:45, 4.08s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|██████████████████▏ | 49/221 [02:50<11:20, 3.95s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▌ | 50/221 [02:54<11:19, 3.97s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▉ | 51/221 [02:57<10:37, 3.75s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▎ | 52/221 [03:00<09:56, 3.53s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▋ | 53/221 [03:03<09:23, 3.36s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
24%|████████████████████ | 54/221 [03:07<09:40, 3.48s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▍ | 55/221 [03:11<09:43, 3.52s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▊ | 56/221 [03:15<10:30, 3.82s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|█████████████████████▏ | 57/221 [03:19<10:32, 3.86s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|█████████████████████▌ | 58/221 [03:22<10:07, 3.73s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▉ | 59/221 [03:26<09:34, 3.54s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|██████████████████████▎ | 60/221 [03:28<08:47, 3.28s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▋ | 61/221 [03:32<08:58, 3.37s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|███████████████████████ | 62/221 [03:35<08:45, 3.31s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|███████████████████████▍ 
| 63/221 [03:38<08:50, 3.36s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|███████████████████████▋ | 64/221 [03:42<08:47, 3.36s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|████████████████████████ | 65/221 [03:45<08:42, 3.35s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▍ | 66/221 [03:49<08:49, 3.41s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▊ | 67/221 [03:51<08:13, 3.21s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▏ | 68/221 [03:56<08:51, 3.47s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▌ | 69/221 [03:59<08:29, 3.35s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▉ | 70/221 [04:02<08:22, 3.33s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|██████████████████████████▎ | 71/221 [04:05<08:15, 3.30s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
33%|██████████████████████████▋ | 72/221 [04:08<07:47, 3.14s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|███████████████████████████ | 73/221 [04:11<08:05, 3.28s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|███████████████████████████▍ | 74/221 [04:15<07:59, 3.26s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 75/221 [04:18<07:53, 3.25s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|████████████████████████████▏ | 76/221 [04:21<07:47, 3.23s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▌ | 77/221 [04:24<07:45, 3.23s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▉ | 78/221 [04:28<07:53, 3.31s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████▎ | 79/221 [04:31<07:36, 3.21s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████▎ | 79/221 [04:31<07:36, 3.21s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 36%|█████████████████████████████▎ | 79/221 [04:31<07:36, 3.21s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 81/221 [04:38<07:57, 3.41s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▍ | 82/221 [04:42<08:28, 3.66s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 83/221 [04:46<08:51, 3.85s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|███████████████████████████████▏ | 84/221 [04:50<08:53, 3.89s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|███████████████████████████████▌ | 85/221 [04:55<09:13, 4.07s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▉ | 86/221 [04:59<08:55, 3.97s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|████████████████████████████████▎ | 87/221 [05:03<09:14, 4.14s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▋ | 88/221 [05:07<08:39, 3.90s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 40%|█████████████████████████████████ | 89/221 [05:10<08:08, 3.70s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▍ | 90/221 [05:13<08:06, 3.71s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▊ | 91/221 [05:18<08:18, 3.84s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▏ | 92/221 [05:22<08:33, 3.98s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▌ | 93/221 [05:26<08:38, 4.05s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▉ | 94/221 [05:30<08:22, 3.96s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▏ | 95/221 [05:34<08:20, 3.97s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|███████████████████████████████████▌ | 96/221 [05:38<08:07, 3.90s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 97/221 
[05:42<08:20, 4.03s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████▎ | 98/221 [05:46<08:08, 3.97s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▋ | 99/221 [05:49<07:20, 3.61s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▋ | 100/221 [05:52<07:18, 3.62s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 101/221 [05:55<06:57, 3.48s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 102/221 [05:58<06:35, 3.32s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▊ | 103/221 [06:02<06:52, 3.49s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████ | 104/221 [06:06<07:06, 3.64s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▍ | 105/221 [06:11<07:27, 3.86s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▊ | 106/221 [06:14<07:25, 3.87s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████▏ | 107/221 [06:18<06:55, 3.64s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▌ | 108/221 [06:22<07:12, 3.82s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▉ | 109/221 [06:26<07:12, 3.86s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▎ | 110/221 [06:29<06:50, 3.70s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▋ | 111/221 [06:32<06:31, 3.55s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████ | 112/221 [06:36<06:35, 3.63s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████▍ | 113/221 [06:40<06:29, 3.61s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
52%|█████████████████████████████████████████▊ | 114/221 [06:43<06:14, 3.50s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████▏ | 115/221 [06:46<06:10, 3.49s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████▌ | 116/221 [06:50<05:54, 3.38s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▉ | 117/221 [06:53<05:47, 3.34s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▏ | 118/221 [06:57<05:57, 3.47s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 119/221 [07:01<06:18, 3.71s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▉ | 120/221 [07:05<06:33, 3.90s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▎ | 121/221 [07:09<06:21, 3.82s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▋ | 122/221 
[07:11<05:35, 3.39s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 123/221 [07:14<05:03, 3.09s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 123/221 [07:14<05:03, 3.09s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 123/221 [07:14<05:03, 3.09s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 125/221 [07:20<05:16, 3.30s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▏ | 126/221 [07:23<04:57, 3.14s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▌ | 127/221 [07:26<04:44, 3.02s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▉ | 128/221 [07:29<04:28, 2.89s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|███████████████████████████████████████████████▎ | 129/221 [07:32<04:48, 3.14s/it]g-point operations 
will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 130/221 [07:35<04:33, 3.01s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|████████████████████████████████████████████████ | 131/221 [07:39<04:48, 3.21s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▍ | 132/221 [07:41<04:29, 3.03s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▋ | 133/221 [07:44<04:29, 3.07s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████ | 134/221 [07:47<04:19, 2.98s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▍ | 135/221 [07:50<04:22, 3.06s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|█████████████████████████████████████████████████▊ | 136/221 [07:54<04:37, 3.27s/it]g-point operations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▏ | 137/221 [07:58<04:40, 3.34s/it]g-point operations will not be 
62%|██████████████████████████████████████████████████▌ | 138/221 [08:02<04:50, 3.50s/it]
63%|██████████████████████████████████████████████████▉ | 139/221 [08:05<04:52, 3.57s/it]
63%|███████████████████████████████████████████████████▎ | 140/221 [08:08<04:22, 3.24s/it]
64%|███████████████████████████████████████████████████▋ | 141/221 [08:11<04:17, 3.22s/it]
64%|████████████████████████████████████████████████████ | 142/221 [08:14<04:09, 3.16s/it]
65%|████████████████████████████████████████████████████▍ | 143/221 [08:16<03:49, 2.95s/it]
65%|████████████████████████████████████████████████████▊ | 144/221 [08:20<04:06, 3.20s/it]
66%|█████████████████████████████████████████████████████▏ | 145/221 [08:23<04:00, 3.17s/it]
66%|█████████████████████████████████████████████████████▌ | 146/221 [08:27<04:09, 3.33s/it]
67%|█████████████████████████████████████████████████████▉ | 147/221 [08:30<03:54, 3.18s/it]
67%|██████████████████████████████████████████████████████▏ | 148/221 [08:33<03:54, 3.22s/it]
67%|██████████████████████████████████████████████████████▌ | 149/221 [08:36<03:54, 3.26s/it]
68%|██████████████████████████████████████████████████████▉ | 150/221 [08:40<03:51, 3.26s/it]
68%|███████████████████████████████████████████████████████▎ | 151/221 [08:43<03:58, 3.40s/it]
69%|███████████████████████████████████████████████████████▋ | 152/221 [08:47<03:50, 3.34s/it]
69%|████████████████████████████████████████████████████████ | 153/221 [08:50<03:43, 3.29s/it]
70%|████████████████████████████████████████████████████████▍ | 154/221 [08:53<03:44, 3.36s/it]
70%|████████████████████████████████████████████████████████▊ | 155/221 [08:57<03:42, 3.37s/it]
71%|█████████████████████████████████████████████████████████▏ | 156/221 [09:00<03:42, 3.42s/it]
71%|█████████████████████████████████████████████████████████▌ | 157/221 [09:03<03:24, 3.20s/it]
71%|█████████████████████████████████████████████████████████▉ | 158/221 [09:08<03:48, 3.62s/it]
72%|██████████████████████████████████████████████████████████▎ | 159/221 [09:11<03:47, 3.67s/it]
72%|██████████████████████████████████████████████████████████▋ | 160/221 [09:15<03:51, 3.80s/it]
73%|███████████████████████████████████████████████████████████ | 161/221 [09:20<03:54, 3.91s/it]
73%|███████████████████████████████████████████████████████████▍ | 162/221 [09:24<03:50, 3.91s/it]
74%|███████████████████████████████████████████████████████████▋ | 163/221 [09:28<03:54, 4.04s/it]
74%|████████████████████████████████████████████████████████████ | 164/221 [09:32<03:59, 4.20s/it]
75%|████████████████████████████████████████████████████████████▍ | 165/221 [09:36<03:45, 4.02s/it]
75%|████████████████████████████████████████████████████████████▊ | 166/221 [09:39<03:25, 3.74s/it]
76%|█████████████████████████████████████████████████████████████▏ | 167/221 [09:43<03:18, 3.67s/it]
76%|█████████████████████████████████████████████████████████████▌ | 168/221 [09:46<03:05, 3.50s/it]
76%|█████████████████████████████████████████████████████████████▉ | 169/221 [09:50<03:07, 3.61s/it]
77%|██████████████████████████████████████████████████████████████▎ | 170/221 [09:53<03:07, 3.67s/it]
77%|██████████████████████████████████████████████████████████████▋ | 171/221 [09:57<03:04, 3.69s/it]
78%|███████████████████████████████████████████████████████████████ | 172/221 [10:00<02:54, 3.57s/it]
78%|███████████████████████████████████████████████████████████████▍ | 173/221 [10:04<02:50, 3.55s/it]
79%|███████████████████████████████████████████████████████████████▊ | 174/221 [10:07<02:42, 3.45s/it]
79%|████████████████████████████████████████████████████████████████▏ | 175/221 [10:10<02:34, 3.37s/it]
80%|████████████████████████████████████████████████████████████████▌ | 176/221 [10:14<02:37, 3.51s/it]
80%|████████████████████████████████████████████████████████████████▊ | 177/221 [10:17<02:27, 3.35s/it]
81%|█████████████████████████████████████████████████████████████████▏ | 178/221 [10:21<02:29, 3.48s/it]
81%|█████████████████████████████████████████████████████████████████▉ | 180/221 [10:28<02:29, 3.66s/it]
82%|██████████████████████████████████████████████████████████████████▎ | 181/221 [10:32<02:29, 3.75s/it]
82%|██████████████████████████████████████████████████████████████████▋ | 182/221 [10:36<02:24, 3.72s/it]
83%|███████████████████████████████████████████████████████████████████ | 183/221 [10:40<02:27, 3.88s/it]
83%|███████████████████████████████████████████████████████████████████▍ | 184/221 [10:44<02:22, 3.84s/it]
84%|███████████████████████████████████████████████████████████████████▊ | 185/221 [10:47<02:11, 3.66s/it]
84%|████████████████████████████████████████████████████████████████████▏ | 186/221 [10:52<02:16, 3.89s/it]
85%|████████████████████████████████████████████████████████████████████▌ | 187/221 [10:55<02:07, 3.75s/it]
85%|████████████████████████████████████████████████████████████████████▉ | 188/221 [10:59<02:07, 3.87s/it]
86%|█████████████████████████████████████████████████████████████████████▎ | 189/221 [11:03<02:06, 3.94s/it]
86%|█████████████████████████████████████████████████████████████████████▋ | 190/221 [11:08<02:06, 4.08s/it]
86%|██████████████████████████████████████████████████████████████████████ | 191/221 [11:12<02:07, 4.24s/it]
87%|██████████████████████████████████████████████████████████████████████▎ | 192/221 [11:17<02:03, 4.25s/it]
87%|██████████████████████████████████████████████████████████████████████▋ | 193/221 [11:20<01:49, 3.93s/it]
88%|███████████████████████████████████████████████████████████████████████ | 194/221 [11:23<01:39, 3.67s/it]
88%|███████████████████████████████████████████████████████████████████████▍ | 195/221 [11:26<01:30, 3.48s/it]
89%|███████████████████████████████████████████████████████████████████████▊ | 196/221 [11:29<01:25, 3.42s/it]
89%|████████████████████████████████████████████████████████████████████████▏ | 197/221 [11:32<01:17, 3.25s/it]
90%|████████████████████████████████████████████████████████████████████████▌ | 198/221 [11:36<01:20, 3.52s/it]
90%|████████████████████████████████████████████████████████████████████████▉ | 199/221 [11:41<01:23, 3.78s/it]
90%|█████████████████████████████████████████████████████████████████████████▎ | 200/221 [11:44<01:16, 3.66s/it]
91%|█████████████████████████████████████████████████████████████████████████▋ | 201/221 [11:47<01:11, 3.57s/it]
91%|██████████████████████████████████████████████████████████████████████████ | 202/221 [11:50<01:03, 3.36s/it]
92%|██████████████████████████████████████████████████████████████████████████▍ | 203/221 [11:54<01:02, 3.45s/it]
92%|██████████████████████████████████████████████████████████████████████████▊ | 204/221 [11:58<01:03, 3.75s/it]
93%|███████████████████████████████████████████████████████████████████████████▏ | 205/221 [12:03<01:05, 4.06s/it]
93%|███████████████████████████████████████████████████████████████████████████▌ | 206/221 [12:08<01:03, 4.23s/it]
94%|███████████████████████████████████████████████████████████████████████████▊ | 207/221 [12:11<00:55, 3.98s/it]
94%|████████████████████████████████████████████████████████████████████████████▏ | 208/221 [12:15<00:50, 3.92s/it]
95%|████████████████████████████████████████████████████████████████████████████▌ | 209/221 [12:18<00:44, 3.72s/it]
95%|████████████████████████████████████████████████████████████████████████████▉ | 210/221 [12:22<00:41, 3.79s/it]
95%|█████████████████████████████████████████████████████████████████████████████▎ | 211/221 [12:27<00:39, 3.98s/it]
96%|█████████████████████████████████████████████████████████████████████████████▋ | 212/221 [12:30<00:35, 3.89s/it]
96%|██████████████████████████████████████████████████████████████████████████████ | 213/221 [12:33<00:29, 3.63s/it]
97%|██████████████████████████████████████████████████████████████████████████████▍ | 214/221 [12:37<00:25, 3.60s/it]
97%|██████████████████████████████████████████████████████████████████████████████▊ | 215/221 [12:41<00:22, 3.78s/it]
98%|███████████████████████████████████████████████████████████████████████████████▏ | 216/221 [12:45<00:19, 3.89s/it]
98%|███████████████████████████████████████████████████████████████████████████████▌ | 217/221 [12:49<00:15, 3.88s/it]
99%|███████████████████████████████████████████████████████████████████████████████▉ | 218/221 [12:53<00:11, 3.87s/it]
99%|████████████████████████████████████████████████████████████████████████████████▎| 219/221 [12:57<00:07, 3.85s/it]
100%|████████████████████████████████████████████████████████████████████████████████▋| 220/221 [13:01<00:04, 4.05s/it]
100%|█████████████████████████████████████████████████████████████████████████████████| 221/221 [13:03<00:00, 3.39s/it]
03/02/2022 03:36:10 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
[INFO|configuration_utils.py:438] 2022-03-02 03:36:10,584 >> Configuration saved in ./checkpoint-500/config.json
[INFO|feature_extraction_utils.py:324] 2022-03-02 03:36:15,658 >> Configuration saved in ./checkpoint-500/preprocessor_config.json
03/02/2022 03:37:44 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb']. This may take a bit of time if the files are large.
84%|████████████████████████████████████████████████████████████████ | 501/594 [1:21:57<7:11:32, 278.41s/it]
85%|████████████████████████████████████████████████████████████████▏ | 502/594 [1:22:08<5:03:37, 198.01s/it]
85%|████████████████████████████████████████████████████████████████▎ | 503/594 [1:22:18<3:34:56, 141.72s/it]
85%|████████████████████████████████████████████████████████████████▍ | 504/594 [1:22:28<2:33:23, 102.26s/it]
{'loss': 4.2746, 'learning_rate': 9.680851063829788e-05, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▌ | 506/594 [1:22:49<1:21:06, 55.30s/it]
{'loss': 4.3589, 'learning_rate': 9.574468085106384e-05, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▋ | 507/594 [1:22:59<1:00:33, 41.77s/it]
{'loss': 4.2035, 'learning_rate': 9.468085106382978e-05, 'epoch': 0.85}
86%|███████████████████████████████████████████████████████████████████▌ | 508/594 [1:23:09<46:13, 32.25s/it]
{'loss': 4.2748, 'learning_rate': 9.361702127659576e-05, 'epoch': 0.85}
86%|███████████████████████████████████████████████████████████████████▋ | 509/594 [1:23:19<36:11, 25.55s/it]
{'loss': 4.1227, 'learning_rate': 9.25531914893617e-05, 'epoch': 0.86}
86%|███████████████████████████████████████████████████████████████████▊ | 510/594 [1:23:29<29:09, 20.83s/it]
86%|███████████████████████████████████████████████████████████████████▊ | 510/594 [1:23:29<29:09, 20.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 510/594 [1:23:29<29:09, 20.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 510/594 [1:23:29<29:09, 20.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 511/594 [1:23:39<24:13, 17.51s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 511/594 [1:23:39<24:13, 17.51s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 511/594 [1:23:39<24:13, 17.51s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▉ | 511/594 [1:23:39<24:13, 17.51s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 512/594 [1:23:48<20:45, 
15.19s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 512/594 [1:23:48<20:45, 15.19s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1754, 'learning_rate': 8.936170212765958e-05, 'epoch': 0.86} 86%|████████████████████████████████████████████████████████████████████ | 512/594 [1:23:48<20:45, 15.19s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 512/594 [1:23:48<20:45, 15.19s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 512/594 [1:23:48<20:45, 15.19s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 513/594 [1:23:58<18:16, 13.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 513/594 [1:23:58<18:16, 13.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 513/594 [1:23:58<18:16, 13.53s/it]onfig.jsonerations will not be 
computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 513/594 [1:23:58<18:16, 13.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████▏ | 513/594 [1:23:58<18:16, 13.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 514/594 [1:24:08<16:29, 12.37s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 514/594 [1:24:08<16:29, 12.37s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 514/594 [1:24:08<16:29, 12.37s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 514/594 [1:24:08<16:29, 12.37s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 514/594 [1:24:08<16:29, 12.37s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 87%|████████████████████████████████████████████████████████████████████▍ | 515/594 [1:24:17<15:10, 11.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▍ | 515/594 [1:24:17<15:10, 11.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▍ | 515/594 [1:24:17<15:10, 11.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▍ | 515/594 [1:24:17<15:10, 11.53s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 516/594 [1:24:26<14:04, 10.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 516/594 [1:24:26<14:04, 10.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2799, 'learning_rate': 8.510638297872341e-05, 'epoch': 0.87} 87%|████████████████████████████████████████████████████████████████████▋ | 516/594 [1:24:26<14:04, 10.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
87%|████████████████████████████████████████████████████████████████████▋ | 516/594 [1:24:26<14:04, 10.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 516/594 [1:24:26<14:04, 10.83s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▊ | 517/594 [1:24:36<13:17, 10.36s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▊ | 517/594 [1:24:36<13:17, 10.36s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▊ | 517/594 [1:24:36<13:17, 10.36s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▊ | 517/594 [1:24:36<13:17, 10.36s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▊ | 517/594 [1:24:36<13:17, 10.36s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 518/594 [1:24:45<12:40, 
10.00s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 518/594 [1:24:45<12:40, 10.00s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 518/594 [1:24:45<12:40, 10.00s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 518/594 [1:24:45<12:40, 10.00s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 518/594 [1:24:45<12:40, 10.00s/it]onfig.jsonerations will not be computed-02 03:22:42,712 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|█████████████████████████████████████████████████████████████████████ | 519/594 [1:24:54<12:11, 9.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|█████████████████████████████████████████████████████████████████████ | 519/594 [1:24:54<12:11, 9.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|█████████████████████████████████████████████████████████████████████ | 519/594 [1:24:54<12:11, 9.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 87%|█████████████████████████████████████████████████████████████████████ | 519/594 [1:24:54<12:11, 9.75s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 520/594 [1:25:03<11:47, 9.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 520/594 [1:25:03<11:47, 9.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 520/594 [1:25:03<11:47, 9.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 520/594 [1:25:03<11:47, 9.56s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▎ | 521/594 [1:25:12<11:28, 9.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▎ | 521/594 [1:25:12<11:28, 9.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2368, 'learning_rate': 
7.978723404255319e-05, 'epoch': 0.88} 88%|█████████████████████████████████████████████████████████████████████▎ | 521/594 [1:25:12<11:28, 9.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▎ | 521/594 [1:25:12<11:28, 9.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▎ | 521/594 [1:25:12<11:28, 9.43s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 522/594 [1:25:21<11:10, 9.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 522/594 [1:25:21<11:10, 9.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 522/594 [1:25:21<11:10, 9.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 522/594 [1:25:21<11:10, 9.31s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
88%|█████████████████████████████████████████████████████████████████████▌ | 523/594 [1:25:30<10:52, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▌ | 523/594 [1:25:30<10:52, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1364, 'learning_rate': 7.76595744680851e-05, 'epoch': 0.88} 88%|█████████████████████████████████████████████████████████████████████▌ | 523/594 [1:25:30<10:52, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▌ | 523/594 [1:25:30<10:52, 9.19s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 524/594 [1:25:39<10:37, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 524/594 [1:25:39<10:37, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9837, 'learning_rate': 7.659574468085106e-05, 'epoch': 0.88} 88%|█████████████████████████████████████████████████████████████████████▋ | 524/594 [1:25:39<10:37, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 524/594 [1:25:39<10:37, 9.11s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▊ | 525/594 [1:25:49<10:33, 9.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▊ | 525/594 [1:25:49<10:33, 9.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0987, 'learning_rate': 7.553191489361703e-05, 'epoch': 0.88} 88%|█████████████████████████████████████████████████████████████████████▊ | 525/594 [1:25:49<10:33, 9.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▊ | 525/594 [1:25:49<10:33, 9.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▊ | 525/594 [1:25:49<10:33, 9.18s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 526/594 [1:25:57<10:15, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 89%|█████████████████████████████████████████████████████████████████████▉ | 526/594 [1:25:57<10:15, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 526/594 [1:25:57<10:15, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 526/594 [1:25:57<10:15, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 526/594 [1:25:57<10:15, 9.05s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████ | 527/594 [1:26:06<09:59, 8.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████ | 527/594 [1:26:06<09:59, 8.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████ | 527/594 [1:26:06<09:59, 8.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████ | 527/594 [1:26:06<09:59, 
8.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████ | 527/594 [1:26:06<09:59, 8.94s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 528/594 [1:26:14<09:41, 8.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 528/594 [1:26:14<09:41, 8.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 528/594 [1:26:14<09:41, 8.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 528/594 [1:26:14<09:41, 8.82s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▎ | 529/594 [1:26:23<09:26, 8.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▎ | 529/594 [1:26:23<09:26, 8.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed {'loss': 4.0937, 'learning_rate': 7.127659574468085e-05, 'epoch': 0.89} 89%|██████████████████████████████████████████████████████████████████████▎ | 529/594 [1:26:23<09:26, 8.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▎ | 529/594 [1:26:23<09:26, 8.72s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 530/594 [1:26:31<09:12, 8.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 530/594 [1:26:31<09:12, 8.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0465, 'learning_rate': 7.021276595744681e-05, 'epoch': 0.89} 89%|██████████████████████████████████████████████████████████████████████▍ | 530/594 [1:26:31<09:12, 8.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 530/594 [1:26:31<09:12, 8.63s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▌ | 531/594 [1:26:40<08:58, 8.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 
03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▌ | 531/594 [1:26:40<08:58, 8.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0592, 'learning_rate': 6.914893617021277e-05, 'epoch': 0.89} 89%|██████████████████████████████████████████████████████████████████████▌ | 531/594 [1:26:40<08:58, 8.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▌ | 531/594 [1:26:40<08:58, 8.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▌ | 531/594 [1:26:40<08:58, 8.55s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 532/594 [1:26:48<08:46, 8.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 532/594 [1:26:48<08:46, 8.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 532/594 [1:26:48<08:46, 8.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 532/594 [1:26:48<08:46, 8.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 532/594 [1:26:48<08:46, 8.49s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▉ | 533/594 [1:26:56<08:32, 8.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▉ | 533/594 [1:26:56<08:32, 8.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▉ | 533/594 [1:26:56<08:32, 8.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▉ | 533/594 [1:26:56<08:32, 8.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▉ | 533/594 [1:26:56<08:32, 8.40s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
90%|███████████████████████████████████████████████████████████████████████ | 534/594 [1:27:04<08:13, 8.23s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:41:22,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
90%|███████████████████████████████████████████████████████████████████████▏ | 535/594 [1:27:12<07:58, 8.11s/it]
90%|███████████████████████████████████████████████████████████████████████▎ | 536/594 [1:27:20<07:42, 7.97s/it]
90%|███████████████████████████████████████████████████████████████████████▍ | 537/594 [1:27:27<07:25, 7.82s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:43:54,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|███████████████████████████████████████████████████████████████████████▌ | 538/594 [1:27:34<07:09, 7.66s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:06,936 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|███████████████████████████████████████████████████████████████████████▊ | 540/594 [1:27:48<06:32, 7.27s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:17,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|███████████████████████████████████████████████████████████████████████▉ | 541/594 [1:27:54<06:11, 7.00s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:23,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|████████████████████████████████████████████████████████████████████████ | 542/594 [1:28:01<05:50, 6.73s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:29,172 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|████████████████████████████████████████████████████████████████████████▏ | 543/594 [1:28:06<05:26, 6.41s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:37,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:39,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:41,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:43,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:45,767 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:47,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:49,466 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:51,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:52,746 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:55,615 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:56,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:58,502 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:45:03,953 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
93%|█████████████████████████████████████████████████████████████████████████▎ | 551/594 [1:28:43<04:03, 5.65s/it]
93%|█████████████████████████████████████████████████████████████████████████▍ | 552/594 [1:28:53<04:52, 6.96s/it]
{'loss': 4.2371, 'learning_rate': 4.574468085106383e-05, 'epoch': 0.93}
93%|█████████████████████████████████████████████████████████████████████████▋ | 554/594 [1:29:13<05:39, 8.48s/it]
{'loss': 4.0849, 'learning_rate': 4.468085106382979e-05, 'epoch': 0.93}
93%|█████████████████████████████████████████████████████████████████████████▊ | 555/594 [1:29:23<05:45, 8.87s/it]
{'loss': 4.1186, 'learning_rate': 4.3617021276595746e-05, 'epoch': 0.93}
94%|█████████████████████████████████████████████████████████████████████████▉ | 556/594 [1:29:33<05:47, 9.15s/it]
{'loss': 4.1655, 'learning_rate': 4.2553191489361704e-05, 'epoch': 0.93}
{'loss': 4.2651, 'learning_rate': 4.148936170212766e-05, 'epoch': 0.94}
94%|██████████████████████████████████████████████████████████████████████████▏ | 558/594 [1:29:52<05:38, 9.40s/it]
94%|██████████████████████████████████████████████████████████████████████████▎ | 559/594 [1:30:02<05:31, 9.47s/it]
{'loss': 4.0671, 'learning_rate': 3.829787234042553e-05, 'epoch': 0.94}
94%|██████████████████████████████████████████████████████████████████████████▌ | 561/594 [1:30:21<05:11, 9.44s/it]
{'loss': 4.1243, 'learning_rate': 3.617021276595745e-05, 'epoch': 0.94}
95%|██████████████████████████████████████████████████████████████████████████▉ | 563/594 [1:30:39<04:51, 9.40s/it]
{'loss': 4.0769, 'learning_rate': 3.5106382978723407e-05, 'epoch': 0.95}
95%|███████████████████████████████████████████████████████████████████████████ | 564/594 [1:30:49<04:39, 9.33s/it]
95%|███████████████████████████████████████████████████████████████████████████▏ | 565/594 [1:30:58<04:29, 9.28s/it]
{'loss': 4.1857, 'learning_rate': 3.2978723404255317e-05, 'epoch': 0.95}
95%|███████████████████████████████████████████████████████████████████████████▎ | 566/594 [1:31:07<04:18, 9.23s/it]
95%|███████████████████████████████████████████████████████████████████████████▍ | 567/594 [1:31:16<04:07, 9.18s/it]
{'loss': 4.2188, 'learning_rate': 3.085106382978723e-05, 'epoch': 0.95}
96%|███████████████████████████████████████████████████████████████████████████▌ | 568/594 [1:31:25<03:58, 9.17s/it]
{'loss': 4.103, 'learning_rate': 2.9787234042553192e-05, 'epoch': 0.96}
96%|███████████████████████████████████████████████████████████████████████████▋ | 569/594 [1:31:34<03:48, 9.12s/it]
[WARNING|modeling_utils.py:388] 2022-03-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed {'loss': 4.1005, 'learning_rate': 2.8723404255319154e-05, 'epoch': 0.96} 96%|███████████████████████████████████████████████████████████████████████████▋ | 569/594 [1:31:34<03:48, 9.12s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▋ | 569/594 [1:31:34<03:48, 9.12s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▊ | 570/594 [1:31:43<03:37, 9.06s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▊ | 570/594 [1:31:43<03:37, 9.06s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1211, 'learning_rate': 2.765957446808511e-05, 'epoch': 0.96} 96%|███████████████████████████████████████████████████████████████████████████▊ | 570/594 [1:31:43<03:37, 9.06s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▊ | 570/594 [1:31:43<03:37, 9.06s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▉ | 571/594 [1:31:52<03:27, 9.02s/it]g-point operations will not be computed-02 03:44:33,312 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▉ | 571/594 [1:31:52<03:27, 9.02s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3098, 'learning_rate': 2.6595744680851064e-05, 'epoch': 0.96} 96%|███████████████████████████████████████████████████████████████████████████▉ | 571/594 [1:31:52<03:27, 9.02s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▉ | 571/594 [1:31:52<03:27, 9.02s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|███████████████████████████████████████████████████████████████████████████▉ | 571/594 [1:31:52<03:27, 9.02s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████ | 572/594 [1:32:01<03:16, 8.93s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████ | 572/594 [1:32:01<03:16, 8.93s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████ | 572/594 [1:32:01<03:16, 8.93s/it]g-point operations will not be computed-02 
03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████ | 572/594 [1:32:01<03:16, 8.93s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████ | 572/594 [1:32:01<03:16, 8.93s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████▏ | 573/594 [1:32:09<03:05, 8.83s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████▏ | 573/594 [1:32:09<03:05, 8.83s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████▏ | 573/594 [1:32:09<03:05, 8.83s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████▏ | 573/594 [1:32:09<03:05, 8.83s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|████████████████████████████████████████████████████████████████████████████▏ | 573/594 [1:32:09<03:05, 8.83s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▎ | 574/594 [1:32:18<02:55, 8.75s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▎ | 574/594 [1:32:18<02:55, 8.75s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▎ | 574/594 [1:32:18<02:55, 8.75s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▎ | 574/594 [1:32:18<02:55, 8.75s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▍ | 575/594 [1:32:27<02:47, 8.84s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▍ | 575/594 [1:32:27<02:47, 8.84s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2215, 'learning_rate': 2.2340425531914894e-05, 'epoch': 0.97} 97%|████████████████████████████████████████████████████████████████████████████▍ | 575/594 [1:32:27<02:47, 8.84s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▍ | 575/594 [1:32:27<02:47, 8.84s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▌ | 576/594 [1:32:35<02:36, 8.71s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▌ | 576/594 [1:32:35<02:36, 8.71s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1775, 'learning_rate': 2.1276595744680852e-05, 'epoch': 0.97} 97%|████████████████████████████████████████████████████████████████████████████▌ | 576/594 [1:32:35<02:36, 8.71s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▌ | 576/594 [1:32:35<02:36, 8.71s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▋ | 577/594 [1:32:44<02:26, 8.61s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▋ | 577/594 [1:32:44<02:26, 8.61s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2561, 'learning_rate': 2.0212765957446807e-05, 'epoch': 0.97} 97%|████████████████████████████████████████████████████████████████████████████▋ | 577/594 [1:32:44<02:26, 8.61s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▋ | 577/594 [1:32:44<02:26, 8.61s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▊ | 578/594 [1:32:52<02:16, 8.55s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▊ | 578/594 [1:32:52<02:16, 8.55s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0031, 'learning_rate': 1.9148936170212766e-05, 'epoch': 0.97} 97%|████████████████████████████████████████████████████████████████████████████▊ | 578/594 [1:32:52<02:16, 8.55s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|████████████████████████████████████████████████████████████████████████████▊ | 578/594 [1:32:52<02:16, 8.55s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|█████████████████████████████████████████████████████████████████████████████ | 579/594 [1:33:00<02:06, 
8.46s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|█████████████████████████████████████████████████████████████████████████████ | 579/594 [1:33:00<02:06, 8.46s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9507, 'learning_rate': 1.8085106382978724e-05, 'epoch': 0.97} 97%|█████████████████████████████████████████████████████████████████████████████ | 579/594 [1:33:00<02:06, 8.46s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 97%|█████████████████████████████████████████████████████████████████████████████ | 579/594 [1:33:00<02:06, 8.46s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▏ | 580/594 [1:33:08<01:56, 8.34s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▏ | 580/594 [1:33:08<01:56, 8.34s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2023, 'learning_rate': 1.7021276595744682e-05, 'epoch': 0.98} 98%|█████████████████████████████████████████████████████████████████████████████▏ | 580/594 [1:33:08<01:56, 8.34s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
98%|█████████████████████████████████████████████████████████████████████████████▏ | 580/594 [1:33:08<01:56, 8.34s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▎ | 581/594 [1:33:16<01:46, 8.17s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▎ | 581/594 [1:33:16<01:46, 8.17s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3914, 'learning_rate': 1.595744680851064e-05, 'epoch': 0.98} 98%|█████████████████████████████████████████████████████████████████████████████▎ | 581/594 [1:33:16<01:46, 8.17s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▎ | 581/594 [1:33:16<01:46, 8.17s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▍ | 582/594 [1:33:24<01:36, 8.01s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▍ | 582/594 [1:33:24<01:36, 8.01s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed {'loss': 4.1952, 'learning_rate': 1.4893617021276596e-05, 'epoch': 0.98} 98%|█████████████████████████████████████████████████████████████████████████████▍ | 582/594 [1:33:24<01:36, 8.01s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▍ | 582/594 [1:33:24<01:36, 8.01s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▍ | 582/594 [1:33:24<01:36, 8.01s/it]g-point operations will not be computed-02 03:44:33,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▌ | 583/594 [1:33:31<01:26, 7.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▌ | 583/594 [1:33:31<01:26, 7.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▌ | 583/594 [1:33:31<01:26, 7.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▌ | 583/594 [1:33:31<01:26, 7.86s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▋ | 584/594 [1:33:38<01:16, 7.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▋ | 584/594 [1:33:38<01:16, 7.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|█████████████████████████████████████████████████████████████████████████████▋ | 584/594 [1:33:38<01:16, 7.68s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:10,999 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:10,999 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1881, 'learning_rate': 1.170212765957447e-05, 'epoch': 0.98} [WARNING|modeling_utils.py:388] 2022-03-02 03:50:10,999 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:49:58,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:10,999 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:49:58,770 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 99%|█████████████████████████████████████████████████████████████████████████████▉ | 586/594 [1:33:52<00:56, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:19,018 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|█████████████████████████████████████████████████████████████████████████████▉ | 586/594 [1:33:52<00:56, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:19,018 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0666, 'learning_rate': 1.0638297872340426e-05, 'epoch': 0.99} 99%|█████████████████████████████████████████████████████████████████████████████▉ | 586/594 [1:33:52<00:56, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:19,018 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|█████████████████████████████████████████████████████████████████████████████▉ | 586/594 [1:33:52<00:56, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:19,018 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|██████████████████████████████████████████████████████████████████████████████ | 587/594 [1:33:58<00:47, 6.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|██████████████████████████████████████████████████████████████████████████████ | 587/594 [1:33:58<00:47, 6.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|██████████████████████████████████████████████████████████████████████████████ | 587/594 [1:33:58<00:47, 6.81s/it][WARNING|modeling_utils.py:388] 2022-03-02 03:50:24,979 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:29,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:29,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:32,993 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:32,993 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|██████████████████████████████████████████████████████████████████████████████▎| 589/594 [1:34:09<00:30, 6.06s/it]g-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:36,613 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:36,613 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:38,860 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:41,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:41,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:42,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:44,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:44,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-02 03:50:46,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-02 03:50:24,979 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:50:48,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-02 03:50:51,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2767, 'learning_rate': 2.1276595744680853e-06, 'epoch': 1.0}
[INFO|configuration_utils.py:438] 2022-03-02 03:50:52,608 >> Configuration saved in ./config.json
[INFO|configuration_utils.py:438] 2022-03-02 03:51:09,060 >> Configuration saved in ./config.json
[INFO|feature_extraction_utils.py:324] 2022-03-02 03:51:25,400 >> Configuration saved in ./preprocessor_config.json
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 0%| | 32.0k/34.3M [00:00
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 73%|████████ | 25.1M/34.3M [00:02<00:00, 13.5MB/s]
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 100%|███████████| 34.3M/34.3M [00:14<00:00, 13.5MB/s]
03/02/2022 03:55:32 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 100%|████████████| 34.3M/34.3M [02:33<00:00, 169kB/s]
[INFO|modelcard.py:460] 2022-03-02 03:55:35,642 >> Dropping the following result as it does not have all the necessary fields:
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 0%| | 32.0k/34.3M [00:00
03/02/2022 03:55:40 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 52%|█████▊ | 17.9M/34.3M [00:01<00:00, 18.7MB/s]
[INFO|trainer.py:2366] 2022-03-02 03:55:43,893 >> Num examples = 2642
... in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message.
***** train metrics *****
  epoch                    =        1.0
  train_loss               =     4.2775
  train_runtime            = 1:34:28.95
  train_samples            =      28538
  train_samples_per_second =      5.034
  train_steps_per_second   =      0.105
0%| | 0/221 [00:00
| 3/221 [00:06<08:49, 2.43s/it]
>> Saving model checkpoint to ./
[INFO|modeling_utils.py:1081] 2022-03-02 04:11:32,644 >> Model weights saved in ./pytorch_model.bin
Upload file wandb/run-20220302_021624-vszekdxg/run-vszekdxg.wandb: 0%| | 32.0k/34.4M [00:00
...ule>
...31, in main
...formers/src/transformers/modelcard.py", line 611, in from_trainer
...f.finetuned_from)
return ModelInfo(**d)