0%| | 0/1784 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8979, 'learning_rate': 0.0, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:11,240 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%| | 1/1784 [00:04<2:09:15, 4.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:13,172 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0308, 'learning_rate': 0.0, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:15,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%| | 2/1784 [00:08<1:59:24, 4.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:16,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:37:18,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.1394, 'learning_rate': 0.0, 'epoch': 0.0} 0%|▏ | 3/1784 [00:11<1:56:53, 3.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:20,727 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.684, 'learning_rate': 2e-08, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:22,506 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▏ | 4/1784 [00:15<1:55:06, 3.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:24,563 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7728, 'learning_rate': 4e-08, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:26,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▏ | 5/1784 
[00:19<1:53:59, 3.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:28,305 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8141, 'learning_rate': 6.000000000000001e-08, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:30,137 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 6/1784 [00:23<1:52:56, 3.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:32,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6878, 'learning_rate': 8e-08, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:33,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 7/1784 [00:27<1:51:50, 3.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:35,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8455, 'learning_rate': 1.0000000000000001e-07, 'epoch': 0.0} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:37,632 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 8/1784 [00:30<1:51:54, 3.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:39,527 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:37:41,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▍ | 9/1784 [00:34<1:50:37, 3.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:43,149 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0892, 'learning_rate': 1.2000000000000002e-07, 'epoch': 0.01} {'loss': 4.7674, 'learning_rate': 1.4e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 
09:37:44,894 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▍ | 10/1784 [00:38<1:49:25, 3.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:46,750 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8139, 'learning_rate': 1.6e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:48,485 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▍ | 11/1784 [00:41<1:48:19, 3.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:50,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7889, 'learning_rate': 1.8e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:52,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 12/1784 [00:45<1:47:15, 3.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:53,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0036, 'learning_rate': 2.0000000000000002e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:55,638 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 13/1784 [00:48<1:46:54, 3.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:37:57,447 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.621, 'learning_rate': 2.2e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:37:59,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 14/1784 [00:52<1:45:34, 3.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:00,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed {'loss': 4.7464, 'learning_rate': 2.4000000000000003e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:02,605 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▋ | 15/1784 [00:55<1:44:43, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:04,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:38:06,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8966, 'learning_rate': 2.6e-07, 'epoch': 0.01} 1%|▋ | 16/1784 [00:59<1:43:53, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:07,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9026, 'learning_rate': 2.8e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:09,497 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▊ | 17/1784 [01:02<1:42:58, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:11,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8133, 'learning_rate': 3.0000000000000004e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:12,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▊ | 18/1784 [01:06<1:41:59, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:14,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:38:16,294 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7758, 'learning_rate': 3.2e-07, 'epoch': 0.01} 1%|▊ | 19/1784 [01:09<1:41:23, 
3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:18,014 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0714, 'learning_rate': 3.4000000000000003e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:19,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 20/1784 [01:12<1:40:42, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:21,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7965, 'learning_rate': 3.6e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:23,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 21/1784 [01:16<1:40:16, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:24,810 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.1218, 'learning_rate': 3.8e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:26,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▉ | 22/1784 [01:19<1:39:59, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:28,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9747, 'learning_rate': 4.0000000000000003e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:29,792 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 23/1784 [01:22<1:39:27, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:31,503 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7182, 'learning_rate': 4.2000000000000006e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:33,071 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 24/1784 [01:26<1:38:27, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:34,794 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:38:36,358 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9117, 'learning_rate': 4.4e-07, 'epoch': 0.01} 1%|█ | 25/1784 [01:29<1:37:46, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:38,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.0119, 'learning_rate': 4.6000000000000004e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:39,622 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█▏ | 26/1784 [01:32<1:37:05, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:41,277 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.849, 'learning_rate': 4.800000000000001e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:42,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▏ | 27/1784 [01:36<1:36:19, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:44,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5669, 'learning_rate': 5.000000000000001e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:46,157 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▏ | 28/1784 [01:39<1:36:24, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:47,811 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed {'loss': 4.9379, 'learning_rate': 5.2e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:49,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 29/1784 [01:42<1:35:37, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:51,032 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7604, 'learning_rate': 5.4e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:52,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 30/1784 [01:45<1:34:46, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:54,176 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6528, 'learning_rate': 5.6e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:38:55,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 31/1784 [01:48<1:33:55, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:38:57,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:38:58,824 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▍ | 32/1784 [01:51<1:33:05, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:00,433 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7666, 'learning_rate': 6.000000000000001e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:01,894 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▍ | 33/1784 [01:55<1:32:00, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28
09:39:03,481 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:04,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▌ | 34/1784 [01:58<1:30:40, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:06,435 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8797, 'learning_rate': 6.4e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:07,858 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▌ | 35/1784 [02:01<1:29:16, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:09,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.1419, 'learning_rate': 6.6e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:10,804 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▌ | 36/1784 [02:03<1:28:12, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:12,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7931, 'learning_rate': 6.800000000000001e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:13,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▋ | 37/1784 [02:06<1:26:35, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:15,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8808, 'learning_rate': 7.000000000000001e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:16,460 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed 2%|█▋ | 38/1784 [02:09<1:25:05, 2.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:17,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:19,167 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▋ | 39/1784 [02:12<1:23:08, 2.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:20,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9645, 'learning_rate': 7.4e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:21,813 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▊ | 40/1784 [02:14<1:21:14, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:23,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8041, 'learning_rate': 7.6e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:24,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▊ | 41/1784 [02:17<1:19:21, 2.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:25,669 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:26,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▊ | 42/1784 [02:19<1:16:46, 2.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:28,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:29,139 >> Could not estimate
the number of tokens of the input, floating-point operations will not be computed 2%|█▉ | 43/1784 [02:22<1:13:43, 2.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:30,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:31,301 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▉ | 44/1784 [02:24<1:10:24, 2.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:32,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:33,281 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|█▉ | 45/1784 [02:26<1:06:29, 2.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:34,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:35,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██ | 46/1784 [02:28<1:01:44, 2.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:35,888 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:36,604 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 47/1784 [02:29<56:50, 1.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:37,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.6315, 'learning_rate': 9.000000000000001e-07, 'epoch': 0.03} {'loss': 5.6405, 'learning_rate':
9.200000000000001e-07, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-02-28 09:39:38,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 48/1784 [02:31<52:09, 1.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:38,722 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:39,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 49/1784 [02:32<47:22, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:39,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:39:41,065 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 50/1784 [02:34<48:33, 1.68s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:43,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 51/1784 [02:38<1:08:22, 2.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:46,961 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 52/1784 [02:41<1:20:31, 2.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:50,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 53/1784 [02:45<1:29:28, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:54,562 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ | 54/1784 [02:49<1:34:58, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:39:58,257 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ |
55/1784 [02:53<1:38:02, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:01,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:40:03,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ | 56/1784 [02:56<1:39:55, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:05,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▌ | 57/1784 [03:00<1:41:15, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:09,137 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▌ | 58/1784 [03:04<1:41:51, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:12,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▌ | 59/1784 [03:07<1:41:44, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:16,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 60/1784 [03:11<1:42:18, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:19,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 61/1784 [03:14<1:41:49, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:23,349 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 62/1784 [03:18<1:41:21, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:26,834 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed 4%|██▊ | 63/1784 [03:21<1:41:08, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:30,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▊ | 64/1784 [03:25<1:41:06, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:33,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▉ | 65/1784 [03:28<1:40:28, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:37,296 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▉ | 66/1784 [03:32<1:39:38, 3.48s/it] 4%|██▉ | 67/1784 [03:35<1:39:24, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:40,752 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:40:44,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███ | 68/1784 [03:38<1:38:26, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:47,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███ | 69/1784 [03:42<1:37:43, 3.42s/it] 4%|███ | 70/1784 [03:45<1:37:24, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:50,902 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:40:54,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 71/1784 [03:49<1:36:34, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:40:57,605 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 72/1784 [03:52<1:36:24, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:00,954 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 73/1784 [03:55<1:36:03, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:04,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 74/1784 [03:59<1:35:25, 3.35s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:41:07,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 75/1784 [04:02<1:34:53, 3.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:10,826 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▎ | 76/1784 [04:05<1:33:47, 3.29s/it] 4%|███▍ | 77/1784 [04:08<1:32:57, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:14,057 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:41:17,223 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed 4%|███▍ | 78/1784 [04:11<1:32:15, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:17,223 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 78/1784 [04:11<1:32:15, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:20,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 78/1784 [04:11<1:32:15, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:20,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 79/1784 [04:15<1:31:18, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:23,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 79/1784 [04:15<1:31:18, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:23,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▌ | 80/1784 [04:18<1:30:28, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:23,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:41:26,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:41:26,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▌ | 81/1784 [04:21<1:29:34, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:29,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 82/1784 [04:24<1:28:22, 3.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:29,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 5%|███▋ | 82/1784 [04:24<1:28:22, 3.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:29,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 83/1784 [04:27<1:27:56, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:32,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 83/1784 [04:27<1:27:56, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:32,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 83/1784 [04:27<1:27:56, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:35,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 84/1784 [04:30<1:26:35, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:35,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 84/1784 [04:30<1:26:35, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:35,736 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 85/1784 [04:33<1:25:06, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:38,653 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 85/1784 [04:33<1:25:06, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:38,653 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 85/1784 [04:33<1:25:06, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:41,534 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 85/1784 [04:33<1:25:06, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:41,534 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 5%|███▊ | 86/1784 [04:36<1:23:51, 2.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:44,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 87/1784 [04:38<1:22:44, 2.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:44,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 87/1784 [04:38<1:22:44, 2.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:44,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 88/1784 [04:41<1:20:48, 2.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:47,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 88/1784 [04:41<1:20:48, 2.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:47,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 88/1784 [04:41<1:20:48, 2.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:49,853 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 88/1784 [04:41<1:20:48, 2.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:49,853 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 89/1784 [04:44<1:18:55, 2.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:52,452 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 90/1784 [04:46<1:16:59, 2.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:54,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 90/1784 [04:46<1:16:59, 2.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:54,916 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 5%|████ | 91/1784 [04:49<1:13:55, 2.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:54,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 91/1784 [04:49<1:13:55, 2.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:54,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 92/1784 [04:51<1:11:07, 2.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:57,253 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 92/1784 [04:51<1:11:07, 2.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:57,253 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 93/1784 [04:53<1:07:44, 2.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:59,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 93/1784 [04:53<1:07:44, 2.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:41:59,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 94/1784 [04:55<1:03:38, 2.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:01,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 94/1784 [04:55<1:03:38, 2.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:01,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 95/1784 [04:57<59:13, 2.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:03,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 95/1784 [04:57<59:13, 2.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 
09:42:03,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 96/1784 [04:58<55:19, 1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:06,566 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 96/1784 [04:58<55:19, 1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:06,566 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5352, 'learning_rate': 1.8600000000000002e-06, 'epoch': 0.05} 5%|████▍ | 98/1784 [05:01<46:31, 1.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:07,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▍ | 98/1784 [05:01<46:31, 1.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:07,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 99/1784 [05:02<43:08, 1.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:10,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 99/1784 [05:02<43:08, 1.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:10,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.1079, 'learning_rate': 1.9200000000000003e-06, 'epoch': 0.06} 6%|████▍ | 100/1784 [05:04<43:51, 1.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:10,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 100/1784 [05:04<43:51, 1.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:13,430 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 100/1784 [05:04<43:51, 1.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:13,430 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 101/1784 [05:08<1:03:05, 2.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:13,430 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 102/1784 [05:12<1:15:29, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:17,200 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 102/1784 [05:12<1:15:29, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:17,200 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▍ | 102/1784 [05:12<1:15:29, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:20,887 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 103/1784 [05:15<1:23:27, 2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:20,887 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 103/1784 [05:15<1:23:27, 2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:20,887 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 103/1784 [05:15<1:23:27, 2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:24,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 104/1784 [05:19<1:29:25, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:24,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 104/1784 [05:19<1:29:25, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:24,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 104/1784 [05:19<1:29:25, 
3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:28,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 104/1784 [05:19<1:29:25, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:28,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 105/1784 [05:23<1:32:58, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:28,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 106/1784 [05:26<1:35:20, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:31,809 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 106/1784 [05:26<1:35:20, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:31,809 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 106/1784 [05:26<1:35:20, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:35,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 107/1784 [05:30<1:36:23, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:35,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 107/1784 [05:30<1:36:23, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:35,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 107/1784 [05:30<1:36:23, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:38,972 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 108/1784 [05:33<1:37:29, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:38,972 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed 6%|████▋ | 108/1784 [05:33<1:37:29, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:38,972 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 108/1784 [05:33<1:37:29, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:42,497 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 109/1784 [05:37<1:37:55, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:46,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 109/1784 [05:37<1:37:55, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:46,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 110/1784 [05:40<1:38:18, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:46,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 110/1784 [05:40<1:38:18, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:46,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 110/1784 [05:40<1:38:18, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:49,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 111/1784 [05:44<1:38:24, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:49,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 111/1784 [05:44<1:38:24, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:49,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 111/1784 [05:44<1:38:24, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:53,099 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 6%|████▉ | 112/1784 [05:47<1:37:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:56,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 112/1784 [05:47<1:37:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:56,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2627, 'learning_rate': 2.16e-06, 'epoch': 0.06} 6%|████▉ | 113/1784 [05:51<1:36:52, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:56,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 113/1784 [05:51<1:36:52, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:59,991 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 113/1784 [05:51<1:36:52, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:59,991 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 114/1784 [05:54<1:37:02, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:42:59,991 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 114/1784 [05:54<1:37:02, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:03,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 114/1784 [05:54<1:37:02, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:03,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 115/1784 [05:58<1:36:34, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:03,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 115/1784 
[05:58<1:36:34, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:03,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 115/1784 [05:58<1:36:34, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:06,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████ | 116/1784 [06:01<1:36:35, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:10,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████ | 116/1784 [06:01<1:36:35, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:10,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████ | 117/1784 [06:05<1:36:06, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:10,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████ | 117/1784 [06:05<1:36:06, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:13,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████ | 117/1784 [06:05<1:36:06, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:13,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 118/1784 [06:08<1:35:01, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:13,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 119/1784 [06:11<1:34:28, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:17,113 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 119/1784 [06:11<1:34:28, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:17,113 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 7%|█████▏ | 119/1784 [06:11<1:34:28, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:20,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 119/1784 [06:11<1:34:28, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:20,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 120/1784 [06:15<1:33:41, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:20,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 120/1784 [06:15<1:33:41, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:23,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▏ | 120/1784 [06:15<1:33:41, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:23,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 121/1784 [06:18<1:32:52, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:27,057 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 121/1784 [06:18<1:32:52, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:27,057 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 122/1784 [06:21<1:32:54, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:27,057 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 122/1784 [06:21<1:32:54, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:30,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▎ | 122/1784 [06:21<1:32:54, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 
09:43:30,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 123/1784 [06:25<1:32:25, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:30,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 124/1784 [06:28<1:31:54, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:33,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 124/1784 [06:28<1:31:54, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:33,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 124/1784 [06:28<1:31:54, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:36,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 124/1784 [06:28<1:31:54, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:36,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 125/1784 [06:31<1:31:17, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:36,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 126/1784 [06:34<1:30:29, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:40,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 126/1784 [06:34<1:30:29, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:40,197 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 126/1784 [06:34<1:30:29, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:43,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 126/1784 
[06:34<1:30:29, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:43,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 127/1784 [06:38<1:29:39, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:43,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 127/1784 [06:38<1:29:39, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:46,559 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 127/1784 [06:38<1:29:39, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:46,559 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 128/1784 [06:41<1:28:52, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:46,559 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▌ | 128/1784 [06:41<1:28:52, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:46,559 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 129/1784 [06:44<1:28:21, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:49,701 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 129/1784 [06:44<1:28:21, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:52,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 129/1784 [06:44<1:28:21, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:52,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 130/1784 [06:47<1:27:28, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:52,862 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 7%|█████▋ | 130/1784 [06:47<1:27:28, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:52,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 131/1784 [06:50<1:26:33, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:55,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 131/1784 [06:50<1:26:33, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:58,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 131/1784 [06:50<1:26:33, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:58,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 132/1784 [06:53<1:25:44, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:58,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 132/1784 [06:53<1:25:44, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:43:58,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 133/1784 [06:56<1:25:00, 3.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:02,048 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 133/1784 [06:56<1:25:00, 3.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:05,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▊ | 133/1784 [06:56<1:25:00, 3.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:05,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▊ | 134/1784 [06:59<1:23:25, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 
09:44:07,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▊ | 134/1784 [06:59<1:23:25, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:07,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 135/1784 [07:02<1:22:28, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:07,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 135/1784 [07:02<1:22:28, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:10,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 135/1784 [07:02<1:22:28, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:10,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 136/1784 [07:05<1:21:27, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:13,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 136/1784 [07:05<1:21:27, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:13,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|█████▉ | 137/1784 [07:08<1:19:54, 2.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:16,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 138/1784 [07:10<1:18:36, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:16,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 138/1784 [07:10<1:18:36, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:16,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 139/1784 
[07:13<1:16:45, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:19,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 139/1784 [07:13<1:16:45, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:19,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 139/1784 [07:13<1:16:45, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:21,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 139/1784 [07:13<1:16:45, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:21,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 140/1784 [07:16<1:15:01, 2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:24,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████ | 140/1784 [07:16<1:15:01, 2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:24,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 141/1784 [07:18<1:12:50, 2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:26,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 141/1784 [07:18<1:12:50, 2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:26,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 142/1784 [07:20<1:10:15, 2.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:26,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 143/1784 [07:23<1:07:32, 2.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:29,026 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 8%|██████▎ | 143/1784 [07:23<1:07:32, 2.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:29,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 144/1784 [07:25<1:04:00, 2.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:31,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 144/1784 [07:25<1:04:00, 2.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:31,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 145/1784 [07:27<1:00:21, 2.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:33,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▎ | 145/1784 [07:27<1:00:21, 2.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:33,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7135, 'learning_rate': 2.82e-06, 'epoch': 0.08} 8%|██████▌ | 146/1784 [07:28<56:32, 2.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:36,612 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 146/1784 [07:28<56:32, 2.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:36,612 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 148/1784 [07:31<48:09, 1.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:38,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 148/1784 [07:31<48:09, 1.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:38,093 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7001, 'learning_rate': 
2.88e-06, 'epoch': 0.08} 8%|██████▋ | 149/1784 [07:33<44:04, 1.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:40,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 149/1784 [07:33<44:04, 1.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:40,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 150/1784 [07:34<44:48, 1.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:40,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 150/1784 [07:34<44:48, 1.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:43,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 150/1784 [07:34<44:48, 1.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:43,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 151/1784 [07:38<1:03:02, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:43,785 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 151/1784 [07:38<1:03:02, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:47,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▌ | 151/1784 [07:38<1:03:02, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:47,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 152/1784 [07:42<1:14:32, 2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:51,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 152/1784 [07:42<1:14:32, 2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:51,236 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 153/1784 [07:46<1:22:36, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:54,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 154/1784 [07:49<1:27:51, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:54,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 154/1784 [07:49<1:27:51, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:54,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▋ | 154/1784 [07:49<1:27:51, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:58,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 155/1784 [07:53<1:31:08, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:58,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 155/1784 [07:53<1:31:08, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:44:58,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 155/1784 [07:53<1:31:08, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:02,263 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 155/1784 [07:53<1:31:08, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:02,263 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 156/1784 [07:57<1:33:18, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:02,263 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 156/1784 [07:57<1:33:18, 
3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:05,858 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 156/1784 [07:57<1:33:18, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:05,858 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▊ | 157/1784 [08:00<1:34:24, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:09,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 158/1784 [08:04<1:34:59, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:09,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 158/1784 [08:04<1:34:59, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:09,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 158/1784 [08:04<1:34:59, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:12,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 159/1784 [08:07<1:35:12, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:12,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 159/1784 [08:07<1:35:12, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:12,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 159/1784 [08:07<1:35:12, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:16,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 159/1784 [08:07<1:35:12, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:16,537 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 9%|██████▉ | 160/1784 [08:11<1:35:20, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:16,537 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 160/1784 [08:11<1:35:20, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:20,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|██████▉ | 160/1784 [08:11<1:35:20, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:20,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 161/1784 [08:14<1:35:19, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 162/1784 [08:18<1:34:47, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████ | 162/1784 [08:18<1:34:47, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4913, 'learning_rate': 3.1600000000000002e-06, 'epoch': 0.09} 9%|███████ | 162/1784 [08:18<1:34:47, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 163/1784 [08:21<1:34:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 163/1784 [08:21<1:34:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 163/1784 
[08:21<1:34:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:23,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 164/1784 [08:25<1:33:44, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:33,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 165/1784 [08:28<1:33:24, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:33,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 165/1784 [08:28<1:33:24, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:33,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3314, 'learning_rate': 3.2200000000000005e-06, 'epoch': 0.09} 9%|███████▎ | 166/1784 [08:32<1:32:39, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:33,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▎ | 166/1784 [08:32<1:32:39, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:45:33,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:45:42,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:45:42,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3964, 'learning_rate': 3.2600000000000006e-06, 'epoch': 0.09} 9%|███████▎ | 168/1784 [08:38<1:31:13, 3.39s/it]
9%|███████▎ | 168/1784 [08:38<1:31:13, 3.39s/it] {'loss': 4.088, 'learning_rate': 3.2800000000000004e-06, 'epoch': 0.09} 9%|███████▎ | 168/1784 [08:38<1:31:13, 3.39s/it] 9%|███████▍ | 169/1784 [08:42<1:30:48, 3.37s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:45:52,334 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:45:52,334 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.344, 'learning_rate': 3.3200000000000004e-06, 'epoch': 0.1} 10%|███████▍ | 171/1784 [08:48<1:30:25, 3.36s/it] 10%|███████▍ | 171/1784 [08:48<1:30:25, 3.36s/it] {'loss': 4.4334, 'learning_rate': 3.3400000000000006e-06, 'epoch': 0.1} 10%|███████▍ | 171/1784 [08:48<1:30:25, 3.36s/it]
10%|███████▌ | 172/1784 [08:52<1:30:10, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:00,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▌ | 173/1784 [08:55<1:29:46, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:00,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▌ | 173/1784 [08:55<1:29:46, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:00,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3677, 'learning_rate': 3.3800000000000007e-06, 'epoch': 0.1} 10%|███████▌ | 174/1784 [08:58<1:28:56, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:00,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▌ | 174/1784 [08:58<1:28:56, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:00,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:08,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:08,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4863, 'learning_rate': 3.4200000000000007e-06, 'epoch': 0.1} 10%|███████▋ | 176/1784 [09:05<1:27:41, 3.27s/it]
10%|███████▋ | 176/1784 [09:05<1:27:41, 3.27s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:46:15,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:15,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2847, 'learning_rate': 3.46e-06, 'epoch': 0.1} 10%|███████▊ | 178/1784 [09:11<1:26:30, 3.23s/it] 10%|███████▊ | 178/1784 [09:11<1:26:30, 3.23s/it] {'loss': 4.2671, 'learning_rate': 3.48e-06, 'epoch': 0.1} 10%|███████▊ | 178/1784 [09:11<1:26:30, 3.23s/it] 10%|███████▊ | 179/1784 [09:14<1:25:35, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:23,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▊ | 180/1784 [09:17<1:24:53, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:23,189 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed 10%|███████▊ | 180/1784 [09:17<1:24:53, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:23,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4798, 'learning_rate': 3.52e-06, 'epoch': 0.1} 10%|███████▊ | 180/1784 [09:17<1:24:53, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:23,189 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 181/1784 [09:20<1:24:30, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:29,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 182/1784 [09:24<1:23:32, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:29,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 182/1784 [09:24<1:23:32, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:29,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2923, 'learning_rate': 3.5600000000000002e-06, 'epoch': 0.1} 10%|███████▉ | 182/1784 [09:24<1:23:32, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:29,414 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 183/1784 [09:27<1:22:22, 3.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:35,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 184/1784 [09:29<1:20:51, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:35,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████ | 184/1784 [09:29<1:20:51, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:35,350 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:39,668 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:39,668 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4057, 'learning_rate': 3.62e-06, 'epoch': 0.1} 10%|████████▏ | 186/1784 [09:35<1:18:43, 2.96s/it] 10%|████████▏ | 186/1784 [09:35<1:18:43, 2.96s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:46:45,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:45,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2751, 'learning_rate': 3.66e-06, 'epoch': 0.1} [WARNING|modeling_utils.py:388] 2022-02-28 09:46:45,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|████████▏ | 188/1784 [09:41<1:17:30, 2.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:49,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▎ | 189/1784 [09:44<1:15:33, 2.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:49,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▎ | 189/1784 [09:44<1:15:33, 2.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:49,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:53,533 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:53,533 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:55,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:46:55,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4653, 'learning_rate': 3.74e-06, 'epoch': 0.11} [WARNING|modeling_utils.py:388] 2022-02-28 09:46:55,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11%|████████▍ | 192/1784 [09:51<1:08:05, 2.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:59,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 192/1784 [09:51<1:08:05, 2.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:46:59,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 193/1784 [09:53<1:04:45, 2.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:01,558 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 193/1784 [09:53<1:04:45, 2.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:01,558 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 194/1784 [09:55<1:01:52, 2.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:03,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▍ | 194/1784 [09:55<1:01:52, 2.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:03,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▋ | 195/1784 [09:57<58:49, 2.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:05,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▋ | 195/1784 [09:57<58:49, 2.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:05,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 197/1784 [10:01<51:17, 1.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:07,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 11%|████████▊ | 197/1784 [10:01<51:17, 1.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:07,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 198/1784 [10:02<47:17, 1.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:10,006 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 198/1784 [10:02<47:17, 1.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:10,006 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4062, 'learning_rate': 3.88e-06, 'epoch': 0.11} 11%|████████▉ | 199/1784 [10:03<43:07, 1.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:11,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 199/1784 [10:03<43:07, 1.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:11,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 200/1784 [10:05<43:34, 1.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:11,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 201/1784 [10:09<1:01:08, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:14,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 201/1784 [10:09<1:01:08, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:14,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 201/1784 [10:09<1:01:08, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 202/1784 [10:13<1:12:30, 
2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▊ | 202/1784 [10:13<1:12:30, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2316, 'learning_rate': 3.96e-06, 'epoch': 0.11} 11%|████████▉ | 203/1784 [10:16<1:19:36, 3.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 203/1784 [10:16<1:19:36, 3.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4342, 'learning_rate': 3.980000000000001e-06, 'epoch': 0.11} 11%|████████▉ | 204/1784 [10:20<1:24:47, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 204/1784 [10:20<1:24:47, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1104, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.11} 11%|████████▉ | 204/1784 [10:20<1:24:47, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 205/1784 [10:23<1:27:32, 3.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:47:18,082 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:47:34,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed [WARNING|modeling_utils.py:388] 2022-02-28 09:47:34,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3397, 'learning_rate': 4.04e-06, 'epoch': 0.12} 12%|█████████ | 207/1784 [10:31<1:30:05, 3.43s/it] 12%|█████████ | 207/1784 [10:31<1:30:05, 3.43s/it] {'loss': 4.4085, 'learning_rate': 4.060000000000001e-06, 'epoch': 0.12} 12%|█████████ | 208/1784 [10:34<1:31:12, 3.47s/it] 12%|█████████ | 208/1784 [10:34<1:31:12, 3.47s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:47:44,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:47:44,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:47:44,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▏ | 210/1784 [10:41<1:31:35, 3.49s/it] 12%|█████████▏ | 210/1784 [10:41<1:31:35, 3.49s/it] {'loss': 4.3254, 'learning_rate': 4.12e-06, 'epoch': 0.12} 12%|█████████▏ | 211/1784 [10:45<1:31:35, 3.49s/it] 12%|█████████▏ | 211/1784 [10:45<1:31:35, 3.49s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:47:55,409 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:47:55,409 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2394, 'learning_rate': 4.16e-06, 'epoch': 0.12} [WARNING|modeling_utils.py:388] 2022-02-28 09:47:55,409 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▎ | 213/1784 [10:51<1:30:31, 3.46s/it] 12%|█████████▎ | 213/1784 [10:51<1:30:31, 3.46s/it] 12%|█████████▎ | 213/1784 [10:51<1:30:31, 3.46s/it] 12%|█████████▎ | 214/1784 [10:55<1:30:18, 3.45s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:48:05,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:05,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4445, 'learning_rate': 4.22e-06, 'epoch': 0.12} [WARNING|modeling_utils.py:388] 2022-02-28 09:48:05,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▍ | 216/1784 [11:02<1:29:30, 3.43s/it]
12%|█████████▍ | 216/1784 [11:02<1:29:30, 3.43s/it] 12%|█████████▍ | 216/1784 [11:02<1:29:30, 3.43s/it] 12%|█████████▍ | 217/1784 [11:05<1:28:57, 3.41s/it] 12%|█████████▍ | 217/1784 [11:05<1:28:57, 3.41s/it] 12%|█████████▍ | 217/1784 [11:05<1:28:57, 3.41s/it] 12%|█████████▌ | 218/1784 [11:08<1:28:31, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 218/1784 [11:08<1:28:31, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 219/1784 [11:12<1:27:36, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 219/1784 [11:12<1:27:36, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 219/1784 [11:12<1:27:36, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28
09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▌ | 220/1784 [11:15<1:26:56, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:25,622 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:25,622 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3815, 'learning_rate': 4.34e-06, 'epoch': 0.12} 12%|█████████▋ | 222/1784 [11:22<1:26:16, 3.31s/it]g-point operations will not be computed-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▋ | 222/1784 [11:22<1:26:16, 3.31s/it]g-point operations will not be computed-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3849, 'learning_rate': 4.360000000000001e-06, 'epoch': 0.12} 12%|█████████▋ | 222/1784 [11:22<1:26:16, 3.31s/it]g-point operations will not be computed-28 09:48:17,470 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▊ | 223/1784 [11:25<1:26:08, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:33,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▊ | 223/1784 [11:25<1:26:08, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:33,916 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 224/1784 [11:28<1:25:50, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:33,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 224/1784 [11:28<1:25:50, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:33,916 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 225/1784 [11:31<1:24:54, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▊ | 225/1784 [11:31<1:24:54, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▉ | 226/1784 [11:35<1:24:31, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▉ | 226/1784 [11:35<1:24:31, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4939, 'learning_rate': 4.440000000000001e-06, 'epoch': 0.13} 13%|█████████▉ | 227/1784 [11:38<1:23:35, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|█████████▉ | 227/1784 [11:38<1:23:35, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:48,195 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:48,195 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0854, 'learning_rate': 4.48e-06, 'epoch': 0.13} 13%|██████████ | 229/1784 [11:44<1:22:20, 3.18s/it]g-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 229/1784 [11:44<1:22:20, 3.18s/it]g-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:54,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:48:54,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2702, 'learning_rate': 4.520000000000001e-06, 'epoch': 0.13} 13%|██████████ | 231/1784 [11:50<1:20:54, 3.13s/it]g-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████ | 231/1784 [11:50<1:20:54, 3.13s/it]g-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:00,501 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:00,501 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:48:40,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2866, 'learning_rate': 4.56e-06, 'epoch': 0.13} 13%|██████████▏ | 233/1784 [11:56<1:19:03, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▏ | 233/1784 [11:56<1:19:03, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.447, 'learning_rate': 4.58e-06, 'epoch': 0.13} 13%|██████████▏ | 234/1784 [11:59<1:17:50, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:09,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:09,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:09,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:04,978 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 236/1784 [12:05<1:15:27, 2.92s/it]g-point operations will not be computed-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▎ | 236/1784 [12:05<1:15:27, 2.92s/it]g-point operations will not be computed-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:14,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:14,833 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:04,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3739, 'learning_rate': 4.66e-06, 'epoch': 0.13} 13%|██████████▍ | 238/1784 [12:10<1:12:27, 2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:18,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 238/1784 [12:10<1:12:27, 2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:18,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3531, 'learning_rate': 4.680000000000001e-06, 'epoch': 0.13} 13%|██████████▍ | 239/1784 [12:13<1:10:54, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 239/1784 [12:13<1:10:54, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:21,398 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 240/1784 [12:15<1:08:41, 2.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 240/1784 [12:15<1:08:41, 2.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:24,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:24,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:26,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:26,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:28,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:28,884 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:30,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:30,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:32,354 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:32,354 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:35,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:35,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4221, 'learning_rate': 4.86e-06, 'epoch': 0.14} [WARNING|modeling_utils.py:388] 2022-02-28 09:49:36,575 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:36,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.179, 'learning_rate': 4.9000000000000005e-06, 'epoch': 0.14} [WARNING|modeling_utils.py:388] 2022-02-28 09:49:39,448 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:39,448 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:43,384 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:43,384 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2367, 'learning_rate': 4.94e-06, 'epoch': 0.14} [WARNING|modeling_utils.py:388] 2022-02-28 09:49:43,384 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:47,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:47,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:49:47,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 253/1784 [12:44<1:16:57, 3.02s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 253/1784 [12:44<1:16:57, 3.02s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 253/1784 [12:44<1:16:57, 3.02s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 254/1784 [12:47<1:22:30, 3.24s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 254/1784 [12:47<1:22:30, 3.24s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████ | 254/1784 [12:47<1:22:30, 3.24s/it]g-point operations will not be 
computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 255/1784 [12:51<1:25:36, 3.36s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:01,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:01,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2106, 'learning_rate': 5.04e-06, 'epoch': 0.14} [WARNING|modeling_utils.py:388] 2022-02-28 09:50:01,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 257/1784 [12:58<1:28:35, 3.48s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 257/1784 [12:58<1:28:35, 3.48s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 257/1784 [12:58<1:28:35, 3.48s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 258/1784 [13:02<1:29:07, 3.50s/it]g-point operations will not be computed-28 09:49:21,398 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 258/1784 [13:02<1:29:07, 3.50s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▎ | 258/1784 [13:02<1:29:07, 3.50s/it]g-point operations will not be computed-28 09:49:21,398 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▎ | 259/1784 [13:05<1:29:22, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:14,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▎ | 259/1784 [13:05<1:29:22, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:14,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▎ | 260/1784 [13:09<1:30:01, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▎ | 260/1784 [13:09<1:30:01, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 261/1784 [13:12<1:29:41, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 261/1784 [13:12<1:29:41, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 261/1784 [13:12<1:29:41, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 15%|███████████▍ | 262/1784 [13:16<1:29:06, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 262/1784 [13:16<1:29:06, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 262/1784 [13:16<1:29:06, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:17,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 263/1784 [13:19<1:28:50, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▍ | 263/1784 [13:19<1:28:50, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 264/1784 [13:23<1:28:50, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 264/1784 [13:23<1:28:50, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 264/1784 [13:23<1:28:50, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 265/1784 [13:26<1:28:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 265/1784 [13:26<1:28:14, 
3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▌ | 265/1784 [13:26<1:28:14, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 266/1784 [13:30<1:27:37, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 266/1784 [13:30<1:27:37, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:40,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:40,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:40,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 268/1784 [13:36<1:26:14, 3.41s/it]g-point operations will not be computed-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 268/1784 [13:36<1:26:14, 3.41s/it]g-point operations will not be computed-28 09:50:28,449 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 15%|███████████▋ | 268/1784 [13:36<1:26:14, 3.41s/it]g-point operations will not be computed-28 09:50:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 269/1784 [13:40<1:25:33, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 269/1784 [13:40<1:25:33, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 270/1784 [13:43<1:25:18, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 270/1784 [13:43<1:25:18, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 270/1784 [13:43<1:25:18, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▊ | 271/1784 [13:46<1:24:50, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:57,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:50:57,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1232, 'learning_rate': 5.36e-06, 'epoch': 0.15} [WARNING|modeling_utils.py:388] 2022-02-28 09:50:57,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 273/1784 [13:53<1:23:53, 3.33s/it]g-point operations will not be computed-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 273/1784 [13:53<1:23:53, 3.33s/it]g-point operations will not be computed-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 273/1784 [13:53<1:23:53, 3.33s/it]g-point operations will not be computed-28 09:50:48,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 274/1784 [13:56<1:23:19, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:51:05,254 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 274/1784 [13:56<1:23:19, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:51:05,254 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 275/1784 [13:59<1:22:29, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:51:05,254 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 275/1784 [13:59<1:22:29, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:51:05,254 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████ | 275/1784 
[13:59<1:22:29, 3.28s/it]
15%|████████████ | 276/1784 [14:03<1:22:08, 3.27s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:13,270 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|████████████▏ | 278/1784 [14:09<1:20:48, 3.22s/it]
[WARNING|modeling_utils.py:388] 2022-02-28
09:51:19,565 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2352, 'learning_rate': 5.500000000000001e-06, 'epoch': 0.16}
16%|████████████▏ | 280/1784 [14:15<1:19:33, 3.17s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:25,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.312, 'learning_rate': 5.540000000000001e-06, 'epoch': 0.16}
16%|████████████▎ | 282/1784 [14:22<1:18:29, 3.14s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:31,903 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3619, 'learning_rate': 5.580000000000001e-06, 'epoch': 0.16}
16%|████████████▍ | 284/1784 [14:28<1:16:32, 3.06s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:37,819 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2813, 'learning_rate': 5.620000000000001e-06, 'epoch': 0.16}
16%|████████████▌ | 286/1784
[14:33<1:14:22, 2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:51:42,145 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|████████████▌ | 287/1784 [14:36<1:13:23, 2.94s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:46,218 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.5288, 'learning_rate': 5.68e-06, 'epoch': 0.16}
16%|████████████▋ | 289/1784 [14:41<1:09:25, 2.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:51:50,163 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
16%|████████████▋ | 290/1784 [14:44<1:07:21,
2.70s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:53,686 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:55,908 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:51:58,033 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:00,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:01,904 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:03,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:06,751 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed
{'loss': 4.5077, 'learning_rate': 5.8800000000000005e-06, 'epoch': 0.17}
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:08,060 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:09,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:13,820 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed
17%|█████████████▏ | 302/1784 [15:10<1:08:45, 2.78s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:52:21,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3298, 'learning_rate': 5.98e-06, 'epoch': 0.17}
17%|█████████████▎ | 304/1784 [15:18<1:20:02, 3.25s/it]
{'loss': 4.1611, 'learning_rate': 6e-06, 'epoch': 0.17}
17%|█████████████▎ | 305/1784 [15:21<1:23:03,
3.37s/it]
{'loss': 4.3209, 'learning_rate': 6.02e-06, 'epoch': 0.17}
17%|█████████████▍ | 306/1784 [15:25<1:24:49, 3.44s/it]
17%|█████████████▍ | 307/1784 [15:28<1:25:38, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:52:37,642 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
17%|█████████████▍ | 308/1784 [15:32<1:26:23, 3.51s/it]
{'loss': 4.4931, 'learning_rate': 6.08e-06,
'epoch': 0.17}
17%|█████████████▌ | 309/1784 [15:36<1:26:51, 3.53s/it]
{'loss': 4.4734, 'learning_rate': 6.1e-06, 'epoch': 0.17}
17%|█████████████▌ | 310/1784 [15:39<1:26:39, 3.53s/it]
17%|█████████████▌ | 311/1784 [15:43<1:26:28, 3.52s/it]
17%|█████████████▋ | 312/1784 [15:46<1:26:24, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:52:51,799 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed
{'loss': 4.2364, 'learning_rate': 6.16e-06, 'epoch': 0.17}
18%|█████████████▋ | 313/1784 [15:50<1:26:00, 3.51s/it]
18%|█████████████▋ | 314/1784 [15:53<1:25:11, 3.48s/it]
18%|█████████████▊ | 315/1784 [15:56<1:24:44, 3.46s/it]
{'loss': 4.4123, 'learning_rate': 6.220000000000001e-06, 'epoch': 0.18}
18%|█████████████▊ | 316/1784 [16:00<1:23:58, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:53:02,172 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:53:10,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2291, 'learning_rate': 6.26e-06, 'epoch': 0.18}
18%|█████████████▉ | 318/1784 [16:07<1:23:42, 3.43s/it]
{'loss': 3.9427, 'learning_rate': 6.280000000000001e-06, 'epoch': 0.18}
18%|█████████████▉ | 319/1784 [16:10<1:23:11, 3.41s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:53:02,172 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed
18%|█████████████▉ | 320/1784 [16:13<1:22:29, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:53:22,349 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
18%|██████████████ | 321/1784 [16:17<1:21:41, 3.35s/it]
{'loss': 4.5014, 'learning_rate': 6.34e-06, 'epoch': 0.18}
18%|██████████████ | 322/1784 [16:20<1:21:29, 3.34s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:53:30,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3505, 'learning_rate': 6.380000000000001e-06, 'epoch': 0.18}
18%|██████████████▏ | 324/1784 [16:26<1:20:02, 3.29s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:53:22,349 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:53:36,969 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2019, 'learning_rate': 6.42e-06, 'epoch': 0.18}
18%|██████████████▎ | 326/1784 [16:33<1:18:56, 3.25s/it]
{'loss': 4.1783, 'learning_rate': 6.440000000000001e-06, 'epoch': 0.18}
18%|██████████████▎ | 327/1784 [16:36<1:18:31, 3.23s/it]
18%|██████████████▎ | 328/1784 [16:39<1:17:42, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:53:45,031 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed
{'loss': 4.2332, 'learning_rate': 6.480000000000001e-06, 'epoch': 0.18}
18%|██████████████▍ | 329/1784 [16:42<1:17:23, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:53:51,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
18%|██████████████▍ | 330/1784 [16:45<1:16:45, 3.17s/it]
{'loss': 4.4648, 'learning_rate': 6.520000000000001e-06, 'epoch': 0.18}
19%|██████████████▍ | 331/1784 [16:49<1:16:00, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:53:57,442 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
19%|██████████████▌ | 332/1784 [16:52<1:15:24,
3.12s/it]
{'loss': 4.3966, 'learning_rate': 6.560000000000001e-06, 'epoch': 0.19}
19%|██████████████▌ | 333/1784 [16:55<1:14:28, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:03,429 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
19%|██████████████▌ | 334/1784 [16:57<1:13:05, 3.02s/it]
{'loss': 4.3986, 'learning_rate': 6.620000000000001e-06, 'epoch': 0.19}
[WARNING|modeling_utils.py:388] 2022-02-28 09:54:07,709 >> Could not estimate the number of tokens of the input, floating-point operations
will not be computed
19%|██████████████▋ | 336/1784 [17:03<1:11:06, 2.95s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:54:13,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2845, 'learning_rate': 6.660000000000001e-06, 'epoch': 0.19}
19%|██████████████▊ | 338/1784 [17:09<1:09:11, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:17,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
19%|██████████████▊ | 339/1784 [17:12<1:07:48, 2.82s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:54:21,474 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:54:21,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:54:17,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:54:24,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:54:17,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:54:24,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:54:17,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2032, 'learning_rate': 6.740000000000001e-06, 'epoch': 0.19} [WARNING|modeling_utils.py:388] 2022-02-28 09:54:24,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:54:17,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 342/1784 [17:19<1:02:49, 2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:27,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 343/1784 [17:21<1:00:23, 2.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:29,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▉ | 343/1784 [17:21<1:00:23, 2.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:29,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▍ | 344/1784 [17:24<57:48, 
2.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:31,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▍ | 344/1784 [17:24<57:48, 2.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:31,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0314, 'learning_rate': 6.800000000000001e-06, 'epoch': 0.19} 19%|███████████████▍ | 345/1784 [17:26<54:31, 2.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:33,848 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▍ | 345/1784 [17:26<54:31, 2.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:33,848 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▌ | 346/1784 [17:27<51:18, 2.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:35,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▌ | 346/1784 [17:27<51:18, 2.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:35,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 348/1784 [17:30<43:56, 1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:37,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 348/1784 [17:30<43:56, 1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:37,160 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5665, 'learning_rate': 6.88e-06, 'epoch': 0.2} 20%|███████████████▋ | 349/1784 [17:32<39:56, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:39,749 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 20%|███████████████▋ | 350/1784 [17:34<40:22, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:39,749 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 350/1784 [17:34<40:22, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:39,749 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 350/1784 [17:34<40:22, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 350/1784 [17:34<40:22, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 351/1784 [17:37<56:25, 2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:54:48,500 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:54:48,500 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.412, 'learning_rate': 6.96e-06, 'epoch': 0.2} 20%|███████████████▍ | 353/1784 [17:45<1:12:27, 3.04s/it]g-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▍ | 353/1784 [17:45<1:12:27, 3.04s/it]g-point 
operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0509, 'learning_rate': 6.98e-06, 'epoch': 0.2} 20%|███████████████▍ | 354/1784 [17:48<1:16:56, 3.23s/it]g-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▍ | 354/1784 [17:48<1:16:56, 3.23s/it]g-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.222, 'learning_rate': 7e-06, 'epoch': 0.2} 20%|███████████████▌ | 355/1784 [17:52<1:19:57, 3.36s/it]g-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 355/1784 [17:52<1:19:57, 3.36s/it]g-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8822, 'learning_rate': 7.0200000000000006e-06, 'epoch': 0.2} 20%|███████████████▌ | 355/1784 [17:52<1:19:57, 3.36s/it]g-point operations will not be computed-28 09:54:42,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 356/1784 [17:56<1:21:37, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 357/1784 [17:59<1:22:12, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▌ | 357/1784 [17:59<1:22:12, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed {'loss': 4.3387, 'learning_rate': 7.06e-06, 'epoch': 0.2} 20%|███████████████▋ | 358/1784 [18:03<1:22:54, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 358/1784 [18:03<1:22:54, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3317, 'learning_rate': 7.08e-06, 'epoch': 0.2} 20%|███████████████▋ | 359/1784 [18:06<1:23:22, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▋ | 359/1784 [18:06<1:23:22, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:17,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:17,324 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2113, 'learning_rate': 7.1200000000000004e-06, 'epoch': 0.2} 20%|███████████████▊ | 361/1784 [18:14<1:23:53, 3.54s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 361/1784 [18:14<1:23:53, 3.54s/it]g-point operations will not be 
computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2384, 'learning_rate': 7.14e-06, 'epoch': 0.2} 20%|███████████████▊ | 362/1784 [18:17<1:23:14, 3.51s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 362/1784 [18:17<1:23:14, 3.51s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0246, 'learning_rate': 7.16e-06, 'epoch': 0.2} 20%|███████████████▊ | 362/1784 [18:17<1:23:14, 3.51s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▊ | 363/1784 [18:20<1:22:39, 3.49s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:31,173 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:31,173 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.319, 'learning_rate': 7.2000000000000005e-06, 'epoch': 0.2} 20%|███████████████▉ | 365/1784 [18:27<1:21:48, 3.46s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|███████████████▉ | 365/1784 
[18:27<1:21:48, 3.46s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2642, 'learning_rate': 7.22e-06, 'epoch': 0.2} 21%|████████████████ | 366/1784 [18:31<1:20:56, 3.43s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████ | 366/1784 [18:31<1:20:56, 3.43s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:41,360 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:41,360 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2273, 'learning_rate': 7.260000000000001e-06, 'epoch': 0.21} 21%|████████████████ | 368/1784 [18:37<1:19:56, 3.39s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████ | 368/1784 [18:37<1:19:56, 3.39s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1541, 'learning_rate': 7.280000000000001e-06, 'epoch': 0.21} 21%|████████████████ | 368/1784 [18:37<1:19:56, 3.39s/it]g-point operations will not be computed-28 09:55:04,882 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 21%|████████████████▏ | 369/1784 [18:41<1:19:44, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 370/1784 [18:44<1:19:48, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 370/1784 [18:44<1:19:48, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1406, 'learning_rate': 7.32e-06, 'epoch': 0.21} 21%|████████████████▏ | 371/1784 [18:47<1:19:15, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▏ | 371/1784 [18:47<1:19:15, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:58,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:55:58,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5768, 'learning_rate': 7.360000000000001e-06, 'epoch': 0.21} 21%|████████████████▎ | 373/1784 [18:54<1:18:18, 3.33s/it]g-point operations will not be computed-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 21%|████████████████▎ | 373/1784 [18:54<1:18:18, 3.33s/it]g-point operations will not be computed-28 09:55:49,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2027, 'learning_rate': 7.3800000000000005e-06, 'epoch': 0.21} 21%|████████████████▎ | 374/1784 [18:57<1:17:31, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▎ | 374/1784 [18:57<1:17:31, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 375/1784 [19:00<1:16:36, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 375/1784 [19:00<1:16:36, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0183, 'learning_rate': 7.420000000000001e-06, 'epoch': 0.21} 21%|████████████████▍ | 376/1784 [19:04<1:15:50, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▍ | 376/1784 [19:04<1:15:50, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:14,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-02-28 09:56:14,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1234, 'learning_rate': 7.4600000000000006e-06, 'epoch': 0.21} 21%|████████████████▌ | 378/1784 [19:10<1:14:58, 3.20s/it]g-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▌ | 378/1784 [19:10<1:14:58, 3.20s/it]g-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:20,408 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:20,408 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.183, 'learning_rate': 7.500000000000001e-06, 'epoch': 0.21} 21%|████████████████▌ | 380/1784 [19:16<1:13:45, 3.15s/it]g-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▌ | 380/1784 [19:16<1:13:45, 3.15s/it]g-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:26,593 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:26,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9576, 'learning_rate': 7.540000000000001e-06, 'epoch': 0.21} 21%|████████████████▋ | 382/1784 [19:22<1:13:45, 3.16s/it]g-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▋ | 382/1784 [19:22<1:13:45, 3.16s/it]g-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:32,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:32,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:06,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2602, 'learning_rate': 7.58e-06, 'epoch': 0.21} 22%|████████████████▊ | 384/1784 [19:28<1:11:25, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▊ | 384/1784 [19:28<1:11:25, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
22%|████████████████▊ | 385/1784 [19:31<1:09:56, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▊ | 385/1784 [19:31<1:09:56, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:41,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:41,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3552, 'learning_rate': 7.640000000000001e-06, 'epoch': 0.22} [WARNING|modeling_utils.py:388] 2022-02-28 09:56:41,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▉ | 387/1784 [19:37<1:07:54, 2.92s/it]g-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|████████████████▉ | 387/1784 [19:37<1:07:54, 2.92s/it]g-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:46,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:49,603 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:49,603 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:37,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.385, 'learning_rate': 7.7e-06, 'epoch': 0.22} 22%|█████████████████ | 390/1784 [19:45<1:02:50, 2.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████ | 390/1784 [19:45<1:02:50, 2.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████ | 391/1784 [19:47<1:00:32, 2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████ | 391/1784 [19:47<1:00:32, 2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:56,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:56,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 
09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:58,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:56:58,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:00,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:00,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:02,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:02,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:05,664 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:05,664 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3002, 'learning_rate': 7.840000000000001e-06, 'epoch': 0.22} [WARNING|modeling_utils.py:388] 2022-02-28 09:57:08,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:08,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7235, 'learning_rate': 7.88e-06, 'epoch': 0.22} [WARNING|modeling_utils.py:388] 2022-02-28 09:57:09,779 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:09,779 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9442, 'learning_rate': 7.92e-06, 'epoch': 0.22} [WARNING|modeling_utils.py:388] 2022-02-28 09:57:13,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 
09:57:13,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3973, 'learning_rate': 7.94e-06, 'epoch': 0.22} [WARNING|modeling_utils.py:388] 2022-02-28 09:57:17,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:17,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0756, 'learning_rate': 7.960000000000002e-06, 'epoch': 0.23} 23%|█████████████████▌ | 403/1784 [20:14<1:09:08, 3.00s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▌ | 403/1784 [20:14<1:09:08, 3.00s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1005, 'learning_rate': 7.980000000000002e-06, 'epoch': 0.23} 23%|█████████████████▋ | 404/1784 [20:17<1:13:21, 3.19s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▋ | 404/1784 [20:17<1:13:21, 3.19s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:28,379 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:57:28,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0213, 'learning_rate': 8.020000000000001e-06, 'epoch': 0.23} 23%|█████████████████▊ | 406/1784 [20:25<1:18:37, 3.42s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▊ | 406/1784 [20:25<1:18:37, 3.42s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2101, 'learning_rate': 8.040000000000001e-06, 'epoch': 0.23} 23%|█████████████████▊ | 407/1784 [20:28<1:19:24, 3.46s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▊ | 407/1784 [20:28<1:19:24, 3.46s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1213, 'learning_rate': 8.06e-06, 'epoch': 0.23} 23%|█████████████████▊ | 408/1784 [20:32<1:19:51, 3.48s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|█████████████████▊ | 408/1784 [20:32<1:19:51, 3.48s/it]g-point operations will not be computed-28 09:56:53,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-02-28 09:57:42,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3338, 'learning_rate': 8.1e-06, 'epoch': 0.23} 23%|█████████████████▉ | 410/1784 [20:39<1:20:10, 3.50s/it] {'loss': 4.2651, 'learning_rate': 8.120000000000002e-06, 'epoch': 0.23} 23%|█████████████████▉ | 411/1784 [20:42<1:19:56, 3.49s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:57:53,128 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2575, 'learning_rate': 8.16e-06, 'epoch': 0.23} 23%|██████████████████ | 413/1784 [20:49<1:19:51, 3.49s/it] {'loss': 4.2992, 'learning_rate': 8.18e-06, 'epoch': 0.23} 23%|██████████████████ | 414/1784 [20:53<1:19:18, 3.47s/it] {'loss': 4.258, 'learning_rate': 8.2e-06, 'epoch': 0.23} 23%|██████████████████▏ | 415/1784 [20:56<1:19:15, 3.47s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:58:06,922 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed {'loss': 4.1584, 'learning_rate': 8.24e-06, 'epoch': 0.23} 23%|██████████████████▏ | 417/1784 [21:03<1:18:33, 3.45s/it] {'loss': 4.2493, 'learning_rate': 8.26e-06, 'epoch': 0.23} 23%|██████████████████▎ | 418/1784 [21:06<1:18:34, 3.45s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:58:17,163 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2966, 'learning_rate': 8.3e-06, 'epoch': 0.23} 24%|██████████████████▎ | 420/1784 [21:13<1:17:33, 3.41s/it]
{'loss': 4.2481, 'learning_rate': 8.32e-06, 'epoch': 0.24} 24%|██████████████████▍ | 421/1784 [21:17<1:17:15, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:25,619 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▍ | 422/1784 [21:20<1:16:19, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:25,619 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4645, 'learning_rate': 8.36e-06, 'epoch': 0.24} 24%|██████████████████▍ | 423/1784 [21:23<1:15:34, 3.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:25,619 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0371, 'learning_rate': 8.380000000000001e-06, 'epoch': 0.24} 24%|██████████████████▌ | 424/1784
[21:26<1:14:47, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:35,348 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▌ | 425/1784 [21:30<1:14:04, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:35,348 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:58:40,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.275, 'learning_rate': 8.44e-06, 'epoch': 0.24} 24%|██████████████████▋ | 427/1784 [21:36<1:13:05, 3.23s/it] {'loss': 4.06, 'learning_rate': 8.46e-06, 'epoch': 0.24} 24%|██████████████████▋ | 428/1784
[21:39<1:12:41, 3.22s/it] [WARNING|modeling_utils.py:388] 2022-02-28 09:58:49,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2012, 'learning_rate': 8.5e-06, 'epoch': 0.24} 24%|██████████████████▊ | 430/1784 [21:45<1:11:08, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:54,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▊ | 431/1784 [21:48<1:10:43, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:58:54,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1564, 'learning_rate': 8.540000000000001e-06, 'epoch': 0.24} 24%|██████████████████▉
| 432/1784 [21:51<1:09:42, 3.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:00,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|██████████████████▉ | 433/1784 [21:54<1:08:23, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:00,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2468, 'learning_rate': 8.580000000000001e-06, 'epoch': 0.24} [WARNING|modeling_utils.py:388] 2022-02-28 09:59:04,560 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████ | 435/1784 [22:00<1:06:32, 2.96s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 09:59:10,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████ | 437/1784 [22:06<1:04:03, 2.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:14,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▏ | 438/1784 [22:08<1:03:25, 2.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:14,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 09:59:18,353 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 09:59:20,923 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 441/1784 [22:16<58:51, 2.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:24,531 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 442/1784 [22:18<55:55, 2.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:26,652 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 25%|███████████████████▊ | 443/1784 [22:20<52:40, 2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:28,582 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▉ | 444/1784 [22:22<49:29, 2.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:30,441 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████ | 446/1784 [22:26<43:38, 1.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:32,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9054, 'learning_rate': 8.84e-06, 'epoch': 0.25} 25%|████████████████████ | 447/1784 [22:27<40:16, 1.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:35,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▏ | 449/1784 [22:30<34:17, 1.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:36,376 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed {'loss': 4.4318, 'learning_rate': 8.900000000000001e-06, 'epoch': 0.25} 25%|████████████████████▏ | 450/1784 [22:31<35:35, 1.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:37,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▏ | 450/1784 [22:31<35:35, 1.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:40,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▏ | 451/1784 [22:35<51:14, 2.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:40,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▏ | 451/1784 [22:35<51:14, 2.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:44,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 452/1784 [22:39<1:01:16, 2.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:44,620 >>
Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 453/1784 [22:43<1:07:37, 3.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:44,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|███████████████████▊ | 454/1784 [22:47<1:12:17, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:44,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|███████████████████▉ | 455/1784 [22:50<1:14:58, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:59,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|███████████████████▉ | 456/1784 [22:54<1:16:48, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:59,515 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.164, 'learning_rate': 9.040000000000002e-06, 'epoch': 0.26} 26%|███████████████████▉ | 457/1784 [22:58<1:17:54, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:59,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████ | 458/1784 [23:01<1:18:35, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 09:59:59,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████ | 459/1784 [23:05<1:18:39,
3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:13,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████ | 460/1784 [23:08<1:18:55, 3.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:13,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.161, 'learning_rate': 9.12e-06, 'epoch': 0.26} 26%|████████████████████▏ | 461/1784 [23:12<1:18:48, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:13,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▏ | 462/1784 [23:16<1:18:23, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:13,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:00:26,367 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1192, 'learning_rate': 9.180000000000002e-06, 'epoch': 0.26} 26%|████████████████████▎ | 464/1784 [23:23<1:17:38, 3.53s/it] 26%|████████████████████▎ | 465/1784 [23:26<1:16:54, 3.50s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:00:36,689 >> Could not estimate the number of tokens of the input, floating-point operations
will not be computed {'loss': 4.2841, 'learning_rate': 9.240000000000001e-06, 'epoch': 0.26} 26%|████████████████████▍ | 467/1784 [23:33<1:15:59, 3.46s/it] {'loss': 4.1138, 'learning_rate': 9.260000000000001e-06, 'epoch': 0.26} 26%|████████████████████▍ | 468/1784 [23:36<1:15:26, 3.44s/it] 26%|████████████████████▌ | 469/1784 [23:40<1:15:12, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:48,661 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed 26%|████████████████████▌ | 470/1784 [23:43<1:14:36, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:48,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|████████████████████▌ | 471/1784 [23:46<1:14:29, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:00:48,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:00:57,058 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1338, 'learning_rate': 9.360000000000002e-06, 'epoch': 0.26} 27%|████████████████████▋ | 473/1784 [23:53<1:13:29, 3.36s/it]
27%|████████████████████▋ | 474/1784 [23:56<1:13:16, 3.36s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:01:06,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2336, 'learning_rate': 9.42e-06, 'epoch': 0.27} 27%|████████████████████▊ | 476/1784 [24:03<1:12:08, 3.31s/it]g-point operations will not be computed-28 10:00:48,661 >>
Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▊ | 476/1784 [24:03<1:12:08, 3.31s/it]g-point operations will not be computed-28 10:00:48,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▊ | 477/1784 [24:06<1:11:13, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:15,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 478/1784 [24:09<1:10:17, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:15,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 478/1784 [24:09<1:10:17, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:15,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2287, 'learning_rate': 9.48e-06, 'epoch': 0.27} 27%|████████████████████▉ | 478/1784 [24:09<1:10:17, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:15,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 479/1784 [24:12<1:09:47, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:21,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 480/1784 [24:15<1:09:12, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:21,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|████████████████████▉ | 480/1784 [24:15<1:09:12, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:21,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1659, 'learning_rate': 9.52e-06, 'epoch': 0.27} 
27%|████████████████████▉ | 480/1784 [24:15<1:09:12, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:21,300 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████ | 481/1784 [24:19<1:08:31, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:27,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████ | 482/1784 [24:22<1:08:22, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:27,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████ | 482/1784 [24:22<1:08:22, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:27,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1384, 'learning_rate': 9.56e-06, 'epoch': 0.27} 27%|█████████████████████ | 482/1784 [24:22<1:08:22, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:27,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████ | 483/1784 [24:25<1:07:43, 3.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▏ | 484/1784 [24:28<1:06:50, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▏ | 484/1784 [24:28<1:06:50, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0477, 'learning_rate': 9.600000000000001e-06, 'epoch': 0.27} 27%|█████████████████████▏ | 484/1784 [24:28<1:06:50, 
3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▏ | 485/1784 [24:31<1:05:54, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▏ | 485/1784 [24:31<1:05:54, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:01:40,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:01:40,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:01:40,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:33,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 487/1784 [24:36<1:04:05, 2.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 487/1784 [24:36<1:04:05, 2.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 488/1784 [24:39<1:02:44, 2.90s/it][WARNING|modeling_utils.py:388] 
2022-02-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▎ | 488/1784 [24:39<1:02:44, 2.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:01:49,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:01:51,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:01:51,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2926, 'learning_rate': 9.72e-06, 'epoch': 0.27} [WARNING|modeling_utils.py:388] 2022-02-28 10:01:51,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████ | 491/1784 [24:47<58:42, 2.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:55,786 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████ | 491/1784 [24:47<58:42, 2.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:55,786 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████ 
| 492/1784 [24:50<56:26, 2.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████ | 492/1784 [24:50<56:26, 2.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████ | 493/1784 [24:52<54:05, 2.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████ | 493/1784 [24:52<54:05, 2.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:01,291 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:01,291 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:03,287 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:03,287 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-02-28 10:02:05,095 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:05,095 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:08,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:08,102 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6219, 'learning_rate': 9.88e-06, 'epoch': 0.28} [WARNING|modeling_utils.py:388] 2022-02-28 10:02:09,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:02:09,377 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|trainer.py:2369] 2022-02-28 10:02:11,254 >> Batch size = 8aluation *****e number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 0%| | 0/331 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 2/331 [00:02<06:45, 1.23s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 2/331 [00:02<06:45, 1.23s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▊ | 3/331 [00:04<08:58, 1.64s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 4/331 [00:06<10:13, 1.88s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 5/331 [00:09<11:45, 2.16s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▌ | 6/331 [00:12<12:45, 2.35s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▊ | 7/331 [00:14<13:02, 2.41s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|██ | 8/331 [00:17<13:25, 2.49s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▎ | 9/331 [00:20<14:00, 2.61s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ | 10/331 [00:23<15:00, 2.80s/it]g-point operations will not be computed-28 
10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 11/331 [00:26<14:39, 2.75s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▉ | 12/331 [00:29<14:33, 2.74s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 13/331 [00:31<14:17, 2.70s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▍ | 14/331 [00:34<14:07, 2.67s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▋ | 15/331 [00:37<15:15, 2.90s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▉ | 16/331 [00:41<16:05, 3.06s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▏ | 17/331 [00:44<16:11, 3.10s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▍ | 18/331 [00:46<14:57, 2.87s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▋ | 19/331 [00:49<14:38, 2.82s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▉ | 20/331 [00:51<13:42, 2.64s/it]g-point operations will not be 
computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████▏ | 21/331 [00:54<14:18, 2.77s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 22/331 [00:58<15:23, 2.99s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 23/331 [01:02<16:47, 3.27s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▉ | 24/331 [01:06<17:46, 3.47s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▏ | 25/331 [01:09<17:06, 3.36s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▍ | 26/331 [01:11<15:57, 3.14s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▋ | 27/331 [01:14<15:58, 3.15s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 8%|██████▉ | 28/331 [01:17<15:19, 3.03s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▏ | 29/331 [01:20<14:57, 2.97s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▍ | 30/331 [01:23<14:19, 
2.85s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 9%|███████▋ | 31/331 [01:25<13:39, 2.73s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|███████▉ | 32/331 [01:28<13:24, 2.69s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▏ | 33/331 [01:30<13:23, 2.69s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 10%|████████▍ | 34/331 [01:33<13:23, 2.71s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▋ | 35/331 [01:36<13:36, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|████████▉ | 36/331 [01:39<14:06, 2.87s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████▏ | 37/331 [01:42<14:49, 3.02s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 11%|█████████▍ | 38/331 [01:46<14:57, 3.06s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|█████████▋ | 39/331 [01:49<15:01, 3.09s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 12%|█████████▉ | 40/331 [01:51<13:56, 2.87s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 12%|██████████▏ | 41/331 [01:54<13:24, 2.77s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 42/331 [01:57<14:18, 2.97s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 42/331 [01:57<14:18, 2.97s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▍ | 42/331 [01:57<14:18, 2.97s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 13%|██████████▉ | 44/331 [02:04<15:16, 3.19s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▏ | 45/331 [02:07<14:22, 3.01s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▍ | 46/331 [02:09<13:21, 2.81s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▋ | 47/331 [02:11<12:34, 2.66s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|███████████▉ | 48/331 [02:14<12:55, 2.74s/it]g-point operations will not be computed-28 
10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▏ | 49/331 [02:17<13:37, 2.90s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▍ | 50/331 [02:20<13:26, 2.87s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▋ | 51/331 [02:23<13:49, 2.96s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 52/331 [02:26<13:12, 2.84s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▏ | 53/331 [02:29<13:13, 2.85s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▍ | 54/331 [02:31<12:36, 2.73s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▋ | 55/331 [02:35<13:38, 2.96s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 56/331 [02:38<13:25, 2.93s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 57/331 [02:40<12:58, 2.84s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 18%|██████████████▎ | 58/331 [02:43<13:25, 2.95s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 59/331 [02:46<12:40, 2.80s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▊ | 60/331 [02:49<12:27, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|███████████████ | 61/331 [02:52<12:45, 2.84s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▎ | 62/331 [02:54<12:38, 2.82s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▌ | 63/331 [02:58<13:53, 3.11s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████▊ | 64/331 [03:01<13:33, 3.05s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████ | 65/331 [03:04<13:16, 2.99s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████▎ | 66/331 [03:08<14:23, 3.26s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 20%|████████████████▌ | 67/331 [03:12<15:05, 3.43s/it]g-point operations 
will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|████████████████▊ | 68/331 [03:15<15:11, 3.46s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|█████████████████ | 69/331 [03:18<14:49, 3.40s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|█████████████████▎ | 70/331 [03:22<14:30, 3.34s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 21%|█████████████████▌ | 71/331 [03:25<14:34, 3.36s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|█████████████████▊ | 72/331 [03:28<14:29, 3.36s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|██████████████████ | 73/331 [03:31<14:00, 3.26s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 22%|██████████████████▎ | 74/331 [03:34<13:43, 3.20s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▌ | 75/331 [03:38<13:55, 3.26s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 23%|██████████████████▊ | 76/331 [03:41<13:10, 3.10s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 23%|███████████████████ | 77/331 [03:43<12:47, 3.02s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▎ | 78/331 [03:46<12:11, 2.89s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▌ | 79/331 [03:49<11:48, 2.81s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|███████████████████▊ | 80/331 [03:51<11:40, 2.79s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 24%|████████████████████ | 81/331 [03:55<12:05, 2.90s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▎ | 82/331 [03:57<11:55, 2.87s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▌ | 83/331 [04:01<12:19, 2.98s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 25%|████████████████████▊ | 84/331 [04:04<13:10, 3.20s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|█████████████████████ | 85/331 [04:07<12:19, 3.01s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 26%|█████████████████████▎ | 86/331 [04:10<12:57, 3.17s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 26%|█████████████████████▌ | 87/331 [04:13<12:34, 3.09s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|█████████████████████▊ | 88/331 [04:16<12:12, 3.02s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|██████████████████████ | 89/331 [04:18<11:21, 2.82s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|██████████████████████▎ | 90/331 [04:21<10:45, 2.68s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 27%|██████████████████████▌ | 91/331 [04:24<11:12, 2.80s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|██████████████████████▊ | 92/331 [04:26<10:28, 2.63s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|███████████████████████ | 93/331 [04:29<10:35, 2.67s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|███████████████████████▎ | 94/331 [04:32<10:53, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 29%|███████████████████████▌ | 95/331 [04:35<11:02, 2.81s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|███████████████████████▊ | 96/331 [04:38<11:10, 2.85s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|████████████████████████ | 97/331 [04:40<10:46, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▎ | 98/331 [04:43<11:10, 2.88s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▌ | 99/331 [04:46<11:07, 2.88s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 30%|████████████████████████▍ | 100/331 [04:49<10:41, 2.78s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▋ | 101/331 [04:52<10:36, 2.77s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|████████████████████████▉ | 102/331 [04:55<11:24, 2.99s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 31%|█████████████████████████▏ | 103/331 [04:58<10:52, 2.86s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 31%|█████████████████████████▍ | 104/331 [05:00<10:45, 2.85s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▋ | 105/331 [05:03<10:44, 2.85s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▉ | 106/331 [05:06<10:43, 2.86s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|██████████████████████████▏ | 107/331 [05:09<10:02, 2.69s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▍ | 108/331 [05:11<09:46, 2.63s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▋ | 109/331 [05:14<09:40, 2.61s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▉ | 110/331 [05:17<10:05, 2.74s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▏ | 111/331 [05:19<10:05, 2.75s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▍ | 112/331 [05:22<10:01, 2.75s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 34%|███████████████████████████▋ | 113/331 [05:24<09:31, 2.62s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▉ | 114/331 [05:27<09:37, 2.66s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▏ | 115/331 [05:30<09:40, 2.69s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▍ | 116/331 [05:33<09:55, 2.77s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 117/331 [05:36<09:57, 2.79s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 118/331 [05:38<09:44, 2.74s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████ | 119/331 [05:41<09:46, 2.77s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|█████████████████████████████▎ | 120/331 [05:44<09:48, 2.79s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 121/331 [05:47<10:18, 2.95s/it]g-point operations will not be computed-28 10:01:58,121 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▊ | 122/331 [05:50<10:03, 2.89s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████ | 123/331 [05:54<10:39, 3.07s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|██████████████████████████████▎ | 124/331 [05:57<10:31, 3.05s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▌ | 125/331 [06:00<11:05, 3.23s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 126/331 [06:04<11:09, 3.27s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|███████████████████████████████ | 127/331 [06:07<11:30, 3.38s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▎ | 128/331 [06:11<11:29, 3.39s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▌ | 129/331 [06:14<11:14, 3.34s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▊ | 130/331 [06:17<11:16, 
3.37s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████ | 131/331 [06:21<11:28, 3.44s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▎ | 132/331 [06:24<10:56, 3.30s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▌ | 133/331 [06:27<10:15, 3.11s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|████████████████████████████████▊ | 134/331 [06:29<09:53, 3.01s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████ | 135/331 [06:33<10:01, 3.07s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▎ | 136/331 [06:36<10:14, 3.15s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 137/331 [06:40<10:34, 3.27s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▊ | 138/331 [06:43<10:45, 3.35s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 42%|██████████████████████████████████ | 139/331 [06:45<09:34, 2.99s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 140/331 [06:49<10:17, 3.23s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 141/331 [06:52<09:48, 3.10s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 142/331 [06:55<09:35, 3.05s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▉ | 143/331 [06:58<09:54, 3.16s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▏ | 144/331 [07:01<09:28, 3.04s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 145/331 [07:04<09:17, 3.00s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 146/331 [07:07<09:43, 3.15s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 147/331 [07:10<09:23, 3.06s/it]g-point operations will not be 
computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▏ | 148/331 [07:13<08:43, 2.86s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 149/331 [07:15<08:16, 2.73s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▋ | 150/331 [07:18<08:38, 2.86s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 151/331 [07:21<08:34, 2.86s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 152/331 [07:23<08:11, 2.75s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 153/331 [07:26<08:01, 2.70s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▋ | 154/331 [07:29<08:21, 2.83s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▉ | 155/331 [07:32<08:42, 2.97s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed 47%|██████████████████████████████████████▏ | 156/331 [07:36<08:54, 3.05s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▍ | 157/331 [07:39<09:13, 3.18s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▋ | 158/331 [07:43<09:19, 3.23s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▉ | 159/331 [07:46<09:18, 3.25s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████▏ | 160/331 [07:49<08:45, 3.07s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▍ | 161/331 [07:51<08:31, 3.01s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▋ | 162/331 [07:55<08:58, 3.18s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▉ | 163/331 [07:58<09:10, 3.28s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▏ | 164/331 [08:01<08:41, 
3.12s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▍ | 165/331 [08:04<08:28, 3.06s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▌ | 166/331 [08:07<08:10, 2.98s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▊ | 167/331 [08:10<08:24, 3.08s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████ | 168/331 [08:13<08:02, 2.96s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████▎ | 169/331 [08:16<08:08, 3.01s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████▌ | 170/331 [08:19<07:42, 2.87s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|█████████████████████████████████████████▊ | 171/331 [08:21<07:40, 2.88s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████ | 172/331 [08:24<07:16, 2.74s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████▎ | 173/331 [08:27<07:28, 2.84s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▌ | 174/331 [08:30<07:12, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▊ | 175/331 [08:33<07:19, 2.82s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████ | 176/331 [08:35<07:06, 2.75s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 177/331 [08:38<07:27, 2.91s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 178/331 [08:42<07:52, 3.09s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 179/331 [08:45<08:13, 3.24s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 180/331 [08:49<08:07, 3.23s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 55%|████████████████████████████████████████████▎ | 181/331 [08:52<07:59, 3.20s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 182/331 [08:54<07:24, 2.98s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 183/331 [08:57<06:51, 2.78s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 184/331 [08:59<06:25, 2.63s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 185/331 [09:01<05:59, 2.47s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 186/331 [09:04<06:07, 2.53s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▊ | 187/331 [09:07<06:37, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████ | 188/331 [09:10<06:34, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
57%|██████████████████████████████████████████████▎ | 189/331 [09:12<06:14, 2.63s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▍ | 190/331 [09:14<06:02, 2.57s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 191/331 [09:17<05:56, 2.55s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▉ | 192/331 [09:19<05:48, 2.50s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|███████████████████████████████████████████████▏ | 193/331 [09:23<06:16, 2.73s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▍ | 194/331 [09:25<05:56, 2.60s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 195/331 [09:27<05:48, 2.56s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 196/331 [09:30<05:56, 2.64s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
60%|████████████████████████████████████████████████▏ | 197/331 [09:33<06:09, 2.76s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▍ | 198/331 [09:36<05:53, 2.66s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▋ | 199/331 [09:39<05:59, 2.73s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▉ | 200/331 [09:41<05:43, 2.62s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▏ | 201/331 [09:43<05:35, 2.58s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▍ | 202/331 [09:46<05:42, 2.65s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▋ | 203/331 [09:49<05:41, 2.67s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|█████████████████████████████████████████████████▉ | 204/331 [09:52<05:59, 2.83s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
62%|██████████████████████████████████████████████████▏ | 205/331 [09:55<06:03, 2.89s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 206/331 [09:58<06:00, 2.88s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▋ | 207/331 [10:01<06:12, 3.00s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▉ | 208/331 [10:05<06:18, 3.08s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▏ | 209/331 [10:07<05:45, 2.84s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▍ | 210/331 [10:09<05:22, 2.67s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▋ | 211/331 [10:12<05:25, 2.71s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▉ | 212/331 [10:14<05:10, 2.61s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
64%|████████████████████████████████████████████████████ | 213/331 [10:17<05:11, 2.64s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 214/331 [10:19<04:56, 2.54s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 215/331 [10:22<04:42, 2.44s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 216/331 [10:25<05:12, 2.71s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████ | 217/331 [10:28<05:11, 2.74s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 218/331 [10:31<05:25, 2.88s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▌ | 219/331 [10:34<05:17, 2.84s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 220/331 [10:36<05:02, 2.73s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed
67%|██████████████████████████████████████████████████████ | 221/331 [10:39<05:04, 2.77s/it]
67%|██████████████████████████████████████████████████████▎ | 222/331 [10:41<04:52, 2.69s/it]
67%|██████████████████████████████████████████████████████▌ | 223/331 [10:44<04:55, 2.74s/it]
68%|██████████████████████████████████████████████████████▊ | 224/331 [10:47<04:55, 2.76s/it]
68%|███████████████████████████████████████████████████████ | 225/331 [10:50<04:53, 2.77s/it]
68%|███████████████████████████████████████████████████████▎ | 226/331 [10:53<05:04, 2.90s/it]
69%|███████████████████████████████████████████████████████▌ | 227/331 [10:56<04:57, 2.86s/it]
69%|███████████████████████████████████████████████████████▊ | 228/331 [10:59<04:48, 2.80s/it]
69%|████████████████████████████████████████████████████████ | 229/331 [11:01<04:43, 2.78s/it]
69%|████████████████████████████████████████████████████████▎ | 230/331 [11:04<04:36, 2.74s/it]
70%|████████████████████████████████████████████████████████▌ | 231/331 [11:07<04:44, 2.85s/it]
70%|████████████████████████████████████████████████████████▊ | 232/331 [11:10<04:37, 2.80s/it]
70%|█████████████████████████████████████████████████████████ | 233/331 [11:13<04:44, 2.90s/it]
71%|█████████████████████████████████████████████████████████▎ | 234/331 [11:15<04:27, 2.76s/it]
71%|█████████████████████████████████████████████████████████▌ | 235/331 [11:18<04:17, 2.68s/it]
71%|█████████████████████████████████████████████████████████▊ | 236/331 [11:21<04:43, 2.98s/it]
72%|█████████████████████████████████████████████████████████▉ | 237/331 [11:25<04:55, 3.14s/it]
72%|██████████████████████████████████████████████████████████▏ | 238/331 [11:28<04:50, 3.12s/it]
72%|██████████████████████████████████████████████████████████▍ | 239/331 [11:31<04:50, 3.15s/it]
73%|██████████████████████████████████████████████████████████▋ | 240/331 [11:35<04:51, 3.21s/it]
73%|██████████████████████████████████████████████████████████▉ | 241/331 [11:38<04:56, 3.29s/it]
73%|███████████████████████████████████████████████████████████▏ | 242/331 [11:41<04:54, 3.31s/it]
73%|███████████████████████████████████████████████████████████▍ | 243/331 [11:45<04:53, 3.33s/it]
74%|███████████████████████████████████████████████████████████▋ | 244/331 [11:48<04:57, 3.43s/it]
74%|███████████████████████████████████████████████████████████▉ | 245/331 [11:52<04:45, 3.32s/it]
74%|████████████████████████████████████████████████████████████▏ | 246/331 [11:55<04:56, 3.49s/it]
75%|████████████████████████████████████████████████████████████▍ | 247/331 [11:59<04:44, 3.39s/it]
75%|████████████████████████████████████████████████████████████▋ | 248/331 [12:01<04:23, 3.18s/it]
75%|████████████████████████████████████████████████████████████▉ | 249/331 [12:04<04:01, 2.94s/it]
76%|█████████████████████████████████████████████████████████████▏ | 250/331 [12:06<03:46, 2.79s/it]
76%|█████████████████████████████████████████████████████████████▍ | 251/331 [12:09<03:46, 2.84s/it]
76%|█████████████████████████████████████████████████████████████▋ | 252/331 [12:11<03:34, 2.71s/it]
76%|█████████████████████████████████████████████████████████████▉ | 253/331 [12:15<03:44, 2.87s/it]
77%|██████████████████████████████████████████████████████████████▏ | 254/331 [12:17<03:37, 2.83s/it]
77%|██████████████████████████████████████████████████████████████▍ | 255/331 [12:21<03:44, 2.95s/it]
77%|██████████████████████████████████████████████████████████████▋ | 256/331 [12:23<03:35, 2.88s/it]
78%|██████████████████████████████████████████████████████████████▉ | 257/331 [12:27<03:41, 2.99s/it]
78%|███████████████████████████████████████████████████████████████▏ | 258/331 [12:29<03:25, 2.82s/it]
78%|███████████████████████████████████████████████████████████████▍ | 259/331 [12:32<03:20, 2.78s/it]
79%|███████████████████████████████████████████████████████████████▋ | 260/331 [12:35<03:23, 2.87s/it]
79%|███████████████████████████████████████████████████████████████▊ | 261/331 [12:37<03:08, 2.69s/it]
79%|████████████████████████████████████████████████████████████████ | 262/331 [12:40<03:07, 2.72s/it]
79%|████████████████████████████████████████████████████████████████▎ | 263/331 [12:43<03:16, 2.88s/it]
80%|████████████████████████████████████████████████████████████████▌ | 264/331 [12:46<03:06, 2.79s/it]
80%|████████████████████████████████████████████████████████████████▊ | 265/331 [12:48<03:01, 2.75s/it]
80%|█████████████████████████████████████████████████████████████████ | 266/331 [12:51<02:54, 2.69s/it]
81%|█████████████████████████████████████████████████████████████████▎ | 267/331 [12:54<03:04, 2.89s/it]
81%|█████████████████████████████████████████████████████████████████▌ | 268/331 [12:57<03:02, 2.90s/it]
81%|█████████████████████████████████████████████████████████████████▊ | 269/331 [13:01<03:10, 3.07s/it]
82%|██████████████████████████████████████████████████████████████████ | 270/331 [13:04<03:05, 3.04s/it]
82%|██████████████████████████████████████████████████████████████████▎ | 271/331 [13:07<03:07, 3.13s/it]
82%|██████████████████████████████████████████████████████████████████▌ | 272/331 [13:10<02:58, 3.03s/it]
82%|██████████████████████████████████████████████████████████████████▊ | 273/331 [13:13<02:57, 3.05s/it]
83%|███████████████████████████████████████████████████████████████████ | 274/331 [13:16<03:01, 3.19s/it]
83%|███████████████████████████████████████████████████████████████████▎ | 275/331 [13:20<03:02, 3.26s/it]
83%|███████████████████████████████████████████████████████████████████▌ | 276/331 [13:22<02:48, 3.06s/it]
84%|███████████████████████████████████████████████████████████████████▊ | 277/331 [13:25<02:41, 3.00s/it]
84%|████████████████████████████████████████████████████████████████████ | 278/331 [13:28<02:37, 2.96s/it]
84%|████████████████████████████████████████████████████████████████████▎ | 279/331 [13:32<02:45, 3.18s/it]
85%|████████████████████████████████████████████████████████████████████▌ | 280/331 [13:35<02:38, 3.11s/it]
85%|████████████████████████████████████████████████████████████████████▊ | 281/331 [13:38<02:41, 3.22s/it]
85%|█████████████████████████████████████████████████████████████████████ | 282/331 [13:41<02:38, 3.23s/it]
85%|█████████████████████████████████████████████████████████████████████▎ | 283/331 [13:45<02:39, 3.32s/it]
86%|█████████████████████████████████████████████████████████████████████▍ | 284/331 [13:49<02:40, 3.42s/it]
86%|█████████████████████████████████████████████████████████████████████▋ | 285/331 [13:52<02:39, 3.48s/it]
86%|█████████████████████████████████████████████████████████████████████▉ | 286/331 [13:56<02:36, 3.47s/it]
87%|██████████████████████████████████████████████████████████████████████▏ | 287/331 [14:00<02:37, 3.58s/it]
87%|██████████████████████████████████████████████████████████████████████▍ | 288/331 [14:03<02:33, 3.56s/it]
87%|██████████████████████████████████████████████████████████████████████▋ | 289/331 [14:06<02:21, 3.36s/it]
88%|██████████████████████████████████████████████████████████████████████▉ | 290/331 [14:09<02:10, 3.19s/it]
88%|███████████████████████████████████████████████████████████████████████▏ | 291/331 [14:11<02:00, 3.02s/it]
88%|███████████████████████████████████████████████████████████████████████▍ | 292/331 [14:14<01:54, 2.94s/it]
89%|███████████████████████████████████████████████████████████████████████▋ | 293/331 [14:17<01:51, 2.93s/it]
89%|███████████████████████████████████████████████████████████████████████▉ | 294/331 [14:20<01:43, 2.80s/it]
89%|████████████████████████████████████████████████████████████████████████▏ | 295/331 [14:22<01:38, 2.73s/it]
89%|████████████████████████████████████████████████████████████████████████▍ | 296/331 [14:25<01:32, 2.64s/it]
90%|████████████████████████████████████████████████████████████████████████▋ | 297/331 [14:28<01:38, 2.91s/it]
90%|████████████████████████████████████████████████████████████████████████▉ | 298/331 [14:32<01:44, 3.16s/it]
90%|█████████████████████████████████████████████████████████████████████████▏ | 299/331 [14:35<01:37, 3.04s/it]
91%|█████████████████████████████████████████████████████████████████████████▍ | 300/331 [14:38<01:33, 3.01s/it]
91%|█████████████████████████████████████████████████████████████████████████▋ | 301/331 [14:40<01:29, 2.99s/it]
91%|█████████████████████████████████████████████████████████████████████████▉ | 302/331 [14:43<01:25, 2.94s/it]
92%|██████████████████████████████████████████████████████████████████████████▏ | 303/331 [14:46<01:19, 2.83s/it]
92%|██████████████████████████████████████████████████████████████████████████▍ | 304/331 [14:49<01:19, 2.93s/it]
92%|██████████████████████████████████████████████████████████████████████████▋ | 305/331 [14:52<01:18, 3.03s/it]
92%|██████████████████████████████████████████████████████████████████████████▉ | 306/331 [14:56<01:20, 3.22s/it]
93%|███████████████████████████████████████████████████████████████████████████▏ | 307/331 [15:00<01:20, 3.35s/it]
93%|███████████████████████████████████████████████████████████████████████████▎ | 308/331 [15:04<01:21, 3.55s/it]
93%|███████████████████████████████████████████████████████████████████████████▌ | 309/331 [15:07<01:18, 3.56s/it]
94%|███████████████████████████████████████████████████████████████████████████▊ | 310/331 [15:10<01:09, 3.33s/it]
94%|████████████████████████████████████████████████████████████████████████████ | 311/331 [15:13<01:06, 3.34s/it]
94%|████████████████████████████████████████████████████████████████████████████▎ | 312/331 [15:16<00:59, 3.13s/it]
95%|████████████████████████████████████████████████████████████████████████████▌ | 313/331 [15:19<00:55, 3.06s/it]
95%|████████████████████████████████████████████████████████████████████████████▊ | 314/331 [15:22<00:52, 3.09s/it]
95%|█████████████████████████████████████████████████████████████████████████████ | 315/331 [15:25<00:50, 3.15s/it]
95%|█████████████████████████████████████████████████████████████████████████████▎ | 316/331 [15:29<00:47, 3.17s/it]
96%|█████████████████████████████████████████████████████████████████████████████▌ | 317/331 [15:32<00:46, 3.32s/it]
96%|█████████████████████████████████████████████████████████████████████████████▊ | 318/331 [15:35<00:41, 3.16s/it]
96%|██████████████████████████████████████████████████████████████████████████████ | 319/331 [15:38<00:36, 3.03s/it]
97%|██████████████████████████████████████████████████████████████████████████████▎ | 320/331 [15:41<00:33, 3.08s/it]
97%|██████████████████████████████████████████████████████████████████████████████▌ | 321/331 [15:44<00:30, 3.00s/it]
97%|██████████████████████████████████████████████████████████████████████████████▊ | 322/331 [15:47<00:28, 3.15s/it]
98%|███████████████████████████████████████████████████████████████████████████████ | 323/331 [15:50<00:24, 3.07s/it]
98%|███████████████████████████████████████████████████████████████████████████████▎ | 324/331 [15:54<00:22, 3.17s/it]
98%|███████████████████████████████████████████████████████████████████████████████▌ | 325/331 [15:57<00:19, 3.18s/it]
98%|███████████████████████████████████████████████████████████████████████████████▊ | 326/331 [16:00<00:16, 3.23s/it]
99%|████████████████████████████████████████████████████████████████████████████████ | 327/331 [16:03<00:12, 3.24s/it]
99%|████████████████████████████████████████████████████████████████████████████████▎| 328/331 [16:07<00:09, 3.27s/it]
99%|████████████████████████████████████████████████████████████████████████████████▌| 329/331 [16:10<00:06, 3.21s/it]
100%|████████████████████████████████████████████████████████████████████████████████▊| 330/331 [16:14<00:03, 3.38s/it]
100%|█████████████████████████████████████████████████████████████████████████████████| 331/331 [16:15<00:00, 2.95s/it]
02/28/2022 10:18:30 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
[INFO|configuration_utils.py:438] 2022-02-28 10:18:30,523 >> Configuration saved in ./checkpoint-500/config.json
[INFO|feature_extraction_utils.py:324] 2022-02-28 10:18:35,798 >> Configuration saved in ./checkpoint-500/preprocessor_config.json
02/28/2022 10:18:56 - WARNING - huggingface_hub.repository - Adding files tracked by Git LFS: ['wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb']. This may take a bit of time if the files are large.
28%|█████████████████████ | 501/1784 [42:20<111:13:48, 312.10s/it]
28%|█████████████████████▍ | 502/1784 [42:24<78:13:09, 219.65s/it]
28%|█████████████████████▍ | 503/1784 [42:28<55:06:36, 154.88s/it]
28%|█████████████████████▍ | 504/1784 [42:32<38:56:47, 109.54s/it]
28%|█████████████████████▊ | 505/1784 [42:35<27:38:35, 77.81s/it]
28%|█████████████████████▊ | 506/1784 [42:39<19:43:29, 55.56s/it]onfig.jsonerations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|█████████████████████▊ | 506/1784 [42:39<19:43:29, 55.56s/it]onfig.jsonerations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:19:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:19:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:19:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|█████████████████████▉ | 508/1784 [42:46<10:19:16, 29.12s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|█████████████████████▉ | 508/1784 [42:46<10:19:16, 29.12s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 28%|█████████████████████▉ | 508/1784 [42:46<10:19:16, 29.12s/it]g-point operations will not be computed-28 10:01:58,121 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 29%|██████████████████████▎ | 
509/1784 [42:50<7:36:17, 21.47s/it]
29%|██████████████████████▎ | 510/1784 [42:54<5:42:21, 16.12s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:20:04,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▍ | 512/1784 [43:01<3:25:47,
9.71s/it]
29%|██████████████████████▍ | 513/1784 [43:04<2:46:23, 7.86s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:20:15,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0243, 'learning_rate': 9.922118380062306e-06, 'epoch': 0.29}
29%|██████████████████████▌ | 515/1784 [43:11<1:58:57, 5.62s/it]
29%|██████████████████████▌ | 516/1784 [43:15<1:45:43, 5.00s/it]
29%|██████████████████████▌ | 517/1784 [43:18<1:36:13, 4.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:27,580 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▋ | 518/1784 [43:22<1:30:03, 4.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:31,124 >> Could not estimate the number of tokens of the input, floating-point operations will
not be computed
29%|██████████████████████▋ | 519/1784 [43:26<1:25:12, 4.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:31,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▋ | 520/1784 [43:29<1:21:57, 3.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:31,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▊ | 521/1784 [43:33<1:19:39, 3.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:41,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▊ | 522/1784 [43:36<1:17:20, 3.68s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:41,671 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed
29%|██████████████████████▊ | 523/1784 [43:39<1:15:52, 3.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:41,671 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▉ | 524/1784 [43:43<1:14:48, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:51,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|██████████████████████▉ | 525/1784 [43:46<1:13:32, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:20:51,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3807, 'learning_rate': 9.836448598130843e-06, 'epoch': 0.29}
29%|██████████████████████▉ | 526/1784 [43:50<1:12:22, 3.45s/it][WARNING|modeling_utils.py:388]
2022-02-28 10:20:51,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:00,257 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3896, 'learning_rate': 9.820872274143302e-06, 'epoch': 0.3}
30%|███████████████████████ | 528/1784 [43:56<1:10:42, 3.38s/it]
30%|███████████████████████▏ | 529/1784 [44:00<1:10:01,
3.35s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:10,071 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|███████████████████████▏ | 531/1784 [44:06<1:08:29, 3.28s/it]
30%|███████████████████████▎ | 532/1784 [44:09<1:07:59,
3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:21:18,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9538, 'learning_rate': 9.781931464174455e-06, 'epoch': 0.3}
30%|███████████████████████▎ | 533/1784 [44:12<1:07:12, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:21:18,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|███████████████████████▎ | 534/1784 [44:15<1:06:13, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:21:24,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|███████████████████████▍ | 535/1784 [44:18<1:04:47, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:21:24,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:28,633 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|███████████████████████▍ | 537/1784 [44:24<1:03:00, 3.03s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:34,390 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|████████████████████████▏ | 539/1784 [44:30<59:56,
2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|████████████████████████▏ | 540/1784 [44:33<59:05, 2.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:42,408 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:44,809 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3842, 'learning_rate': 9.70404984423676e-06, 'epoch': 0.3}
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:47,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:49,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:51,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:53,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:54,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6969, 'learning_rate': 9.665109034267914e-06, 'epoch': 0.31}
{'loss': 4.4121, 'learning_rate': 9.657320872274144e-06, 'epoch': 0.31}
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:57,298 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:21:59,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:22:03,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
31%|████████████████████████▊ | 552/1784 [45:00<58:16, 2.84s/it]
31%|████████████████████████▏ | 553/1784 [45:03<1:04:08, 3.13s/it]
31%|████████████████████████▏ | 554/1784 [45:07<1:07:43, 3.30s/it]
31%|████████████████████████▎ | 555/1784 [45:11<1:12:06, 3.52s/it]
31%|████████████████████████▎ | 556/1784 [45:15<1:13:59, 3.62s/it]
31%|████████████████████████▎ | 557/1784 [45:19<1:14:22, 3.64s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:22:29,715 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
31%|████████████████████████▍ | 559/1784 [45:26<1:13:51, 3.62s/it]
31%|████████████████████████▍ | 560/1784 [45:30<1:13:33, 3.61s/it]
31%|████████████████████████▌ | 561/1784 [45:33<1:13:05, 3.59s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:22:43,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
32%|████████████████████████▌ | 563/1784 [45:40<1:12:02, 3.54s/it]
32%|████████████████████████▋ | 564/1784 [45:44<1:11:52, 3.53s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:22:54,371 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3194, 'learning_rate': 9.524922118380064e-06, 'epoch': 0.32}
32%|████████████████████████▋ | 566/1784 [45:51<1:11:09, 3.51s/it]
32%|████████████████████████▊ | 567/1784 [45:54<1:11:09, 3.51s/it]
32%|████████████████████████▊ | 568/1784 [45:58<1:10:54, 3.50s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:23:08,276 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:23:08,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 570/1784 [46:04<1:09:58, 3.46s/it]g-point operations will not be computed-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 570/1784 [46:04<1:09:58, 3.46s/it]g-point operations will not be computed-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 570/1784 [46:04<1:09:58, 3.46s/it]g-point operations will not be computed-28 10:21:38,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 571/1784 [46:08<1:09:08, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|████████████████████████▉ | 571/1784 [46:08<1:09:08, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████ | 572/1784 [46:11<1:08:45, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████ | 572/1784 [46:11<1:08:45, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 32%|█████████████████████████ | 572/1784 [46:11<1:08:45, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████ | 573/1784 [46:14<1:08:04, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████ | 573/1784 [46:14<1:08:04, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:23:24,988 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:23:24,988 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:23:24,988 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 575/1784 [46:21<1:07:01, 3.33s/it]g-point operations will not be computed-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 575/1784 [46:21<1:07:01, 3.33s/it]g-point operations will not be computed-28 10:23:16,761 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 32%|█████████████████████████▏ | 575/1784 [46:21<1:07:01, 3.33s/it]g-point operations will not be computed-28 10:23:16,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 576/1784 [46:24<1:06:12, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:33,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 576/1784 [46:24<1:06:12, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:33,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 577/1784 [46:27<1:05:22, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:33,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 577/1784 [46:27<1:05:22, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:33,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▏ | 577/1784 [46:27<1:05:22, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:33,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 578/1784 [46:30<1:04:56, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:39,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 578/1784 [46:30<1:04:56, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:39,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 579/1784 [46:34<1:04:22, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:39,424 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 579/1784 [46:34<1:04:22, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:39,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 32%|█████████████████████████▎ | 579/1784 [46:34<1:04:22, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:39,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▎ | 580/1784 [46:37<1:03:56, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:45,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▎ | 580/1784 [46:37<1:03:56, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:45,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▍ | 581/1784 [46:40<1:03:23, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:45,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▍ | 581/1784 [46:40<1:03:23, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:45,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▍ | 581/1784 [46:40<1:03:23, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:45,719 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▍ | 582/1784 [46:43<1:02:53, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▍ | 582/1784 [46:43<1:02:53, 
3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|█████████████████████████▍ | 583/1784 [46:46<1:01:37, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:23:56,108 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:23:56,108 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4636, 'learning_rate': 9.376947040498443e-06, 'epoch': 0.33} [WARNING|modeling_utils.py:388] 2022-02-28 10:23:56,108 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▏ | 585/1784 [46:52<59:41, 2.99s/it]g-point operations will not be computed-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:01,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:01,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:51,816 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1328, 'learning_rate': 9.361370716510904e-06, 'epoch': 0.33} [WARNING|modeling_utils.py:388] 2022-02-28 10:24:01,840 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:23:51,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▎ | 587/1784 [46:57<57:43, 2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▎ | 587/1784 [46:57<57:43, 2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▎ | 588/1784 [47:00<56:15, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▎ | 588/1784 [47:00<56:15, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:09,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:09,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:09,862 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed-28 10:24:06,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▍ | 590/1784 [47:05<53:22, 2.68s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:13,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▍ | 590/1784 [47:05<53:22, 2.68s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:13,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▌ | 591/1784 [47:07<51:30, 2.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▌ | 591/1784 [47:07<51:30, 2.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:15,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▌ | 592/1784 [47:10<49:11, 2.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:18,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▌ | 592/1784 [47:10<49:11, 2.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:18,109 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▌ | 593/1784 [47:12<47:05, 2.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:20,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▌ | 593/1784 [47:12<47:05, 2.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:20,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
33%|██████████████████████████▋ | 594/1784 [47:14<44:38, 2.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:22,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▋ | 594/1784 [47:14<44:38, 2.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:22,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▋ | 595/1784 [47:16<41:55, 2.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:23,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▋ | 595/1784 [47:16<41:55, 2.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:23,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▊ | 597/1784 [47:19<35:45, 1.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:25,273 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 33%|██████████████████████████▊ | 597/1784 [47:19<35:45, 1.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:25,273 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.488, 'learning_rate': 9.27570093457944e-06, 'epoch': 0.33} 34%|██████████████████████████▊ | 598/1784 [47:20<32:59, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:27,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 598/1784 [47:20<32:59, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:27,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 600/1784 [47:23<31:40, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:29,186 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 600/1784 [47:23<31:40, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:29,186 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 600/1784 [47:23<31:40, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:32,363 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 601/1784 [47:27<45:45, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:32,363 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 601/1784 [47:27<45:45, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:32,363 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 601/1784 [47:27<45:45, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 601/1784 [47:27<45:45, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 602/1784 [47:31<54:12, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 602/1784 [47:31<54:12, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 602/1784 [47:31<54:12, 
2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████ | 603/1784 [47:34<59:51, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████ | 603/1784 [47:34<59:51, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████ | 603/1784 [47:34<59:51, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▍ | 604/1784 [47:38<1:03:46, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▍ | 604/1784 [47:38<1:03:46, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▍ | 604/1784 [47:38<1:03:46, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▍ | 605/1784 [47:42<1:06:15, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:52,779 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:24:52,779 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0453, 'learning_rate': 9.205607476635515e-06, 'epoch': 0.34} [WARNING|modeling_utils.py:388] 2022-02-28 10:24:52,779 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▌ | 607/1784 [47:49<1:08:51, 3.51s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▌ | 607/1784 [47:49<1:08:51, 3.51s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▌ | 607/1784 [47:49<1:08:51, 3.51s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▌ | 608/1784 [47:53<1:09:19, 3.54s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▌ | 608/1784 [47:53<1:09:19, 3.54s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▌ | 608/1784 [47:53<1:09:19, 3.54s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 34%|██████████████████████████▋ | 609/1784 [47:56<1:09:34, 3.55s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:25:07,127 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:25:07,127 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0926, 'learning_rate': 9.174454828660438e-06, 'epoch': 0.34} [WARNING|modeling_utils.py:388] 2022-02-28 10:25:07,127 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▋ | 611/1784 [48:03<1:09:13, 3.54s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▋ | 611/1784 [48:03<1:09:13, 3.54s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▋ | 611/1784 [48:03<1:09:13, 3.54s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 612/1784 [48:07<1:08:53, 3.53s/it]g-point operations will not be computed-28 10:24:36,212 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 612/1784 [48:07<1:08:53, 3.53s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 612/1784 [48:07<1:08:53, 3.53s/it]g-point operations will not be computed-28 10:24:36,212 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 613/1784 [48:10<1:08:21, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 613/1784 [48:10<1:08:21, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 614/1784 [48:14<1:07:51, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 614/1784 [48:14<1:07:51, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▊ | 614/1784 [48:14<1:07:51, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 615/1784 [48:17<1:07:25, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 615/1784 [48:17<1:07:25, 
3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|██████████████████████████▉ | 615/1784 [48:17<1:07:25, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:19,395 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|██████████████████████████▉ | 616/1784 [48:21<1:07:03, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|██████████████████████████▉ | 616/1784 [48:21<1:07:03, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|██████████████████████████▉ | 617/1784 [48:24<1:07:05, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|██████████████████████████▉ | 617/1784 [48:24<1:07:05, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|██████████████████████████▉ | 617/1784 [48:24<1:07:05, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 618/1784 [48:27<1:06:28, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 618/1784 [48:27<1:06:28, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
35%|███████████████████████████ | 618/1784 [48:27<1:06:28, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:29,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 619/1784 [48:31<1:06:00, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 619/1784 [48:31<1:06:00, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 620/1784 [48:34<1:05:15, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 620/1784 [48:34<1:05:15, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████ | 620/1784 [48:34<1:05:15, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▏ | 621/1784 [48:37<1:04:54, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:25:47,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:25:47,960 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1339, 'learning_rate': 9.080996884735204e-06, 'epoch': 0.35} [WARNING|modeling_utils.py:388] 2022-02-28 10:25:47,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▏ | 623/1784 [48:44<1:04:25, 3.33s/it]g-point operations will not be computed-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▏ | 623/1784 [48:44<1:04:25, 3.33s/it]g-point operations will not be computed-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▏ | 623/1784 [48:44<1:04:25, 3.33s/it]g-point operations will not be computed-28 10:25:39,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 624/1784 [48:47<1:03:45, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:56,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 624/1784 [48:47<1:03:45, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:56,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 625/1784 [48:50<1:03:24, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:56,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 625/1784 [48:50<1:03:24, 3.28s/it][WARNING|modeling_utils.py:388] 
2022-02-28 10:25:56,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 625/1784 [48:50<1:03:24, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:25:56,129 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 626/1784 [48:54<1:02:59, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▎ | 626/1784 [48:54<1:02:59, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▍ | 627/1784 [48:57<1:02:38, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▍ | 627/1784 [48:57<1:02:38, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▍ | 627/1784 [48:57<1:02:38, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▍ | 628/1784 [49:00<1:02:18, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:10,527 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:10,527 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.474, 'learning_rate': 9.026479750778817e-06, 'epoch': 0.35} [WARNING|modeling_utils.py:388] 2022-02-28 10:26:10,527 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▌ | 630/1784 [49:06<1:01:24, 3.19s/it]g-point operations will not be computed-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▌ | 630/1784 [49:06<1:01:24, 3.19s/it]g-point operations will not be computed-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▌ | 630/1784 [49:06<1:01:24, 3.19s/it]g-point operations will not be computed-28 10:26:02,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▌ | 631/1784 [49:09<1:00:41, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:18,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|███████████████████████████▌ | 631/1784 [49:09<1:00:41, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:18,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 632/1784 [49:12<59:53, 3.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:18,288 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:22,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:18,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:22,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:18,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.126, 'learning_rate': 8.995327102803739e-06, 'epoch': 0.35} [WARNING|modeling_utils.py:388] 2022-02-28 10:26:22,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:18,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 634/1784 [49:18<58:22, 3.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:27,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 635/1784 [49:21<57:35, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:27,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 635/1784 [49:21<57:35, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:27,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2611, 'learning_rate': 8.9797507788162e-06, 'epoch': 0.36} 36%|████████████████████████████▍ | 635/1784 [49:21<57:35, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:27,260 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▌ | 636/1784 [49:24<56:46, 
2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:32,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▌ | 636/1784 [49:24<56:46, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:32,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▌ | 637/1784 [49:27<55:36, 2.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:32,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:37,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:32,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:37,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:32,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.949, 'learning_rate': 8.956386292834892e-06, 'epoch': 0.36} [WARNING|modeling_utils.py:388] 2022-02-28 10:26:37,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:32,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 639/1784 [49:32<53:02, 2.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:40,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 639/1784 [49:32<53:02, 2.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:40,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
36%|████████████████████████████▋ | 640/1784 [49:35<51:15, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 640/1784 [49:35<51:15, 2.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 641/1784 [49:37<49:49, 2.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 641/1784 [49:37<49:49, 2.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:46,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:46,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:48,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:48,982 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:51,002 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:51,002 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:52,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:52,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:54,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:54,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:57,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-02-28 10:26:57,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:58,923 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:26:58,923 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7584, 'learning_rate': 8.870716510903428e-06, 'epoch': 0.36} [WARNING|modeling_utils.py:388] 2022-02-28 10:27:00,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:00,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:00,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:04,705 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:04,705 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:04,705 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 652/1784 [50:01<52:52, 2.80s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 652/1784 [50:01<52:52, 2.80s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 652/1784 [50:01<52:52, 2.80s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 653/1784 [50:05<58:16, 3.09s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 653/1784 [50:05<58:16, 3.09s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 653/1784 [50:05<58:16, 3.09s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▌ 
| 654/1784 [50:09<1:01:44, 3.28s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▌ | 654/1784 [50:09<1:01:44, 3.28s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▌ | 654/1784 [50:09<1:01:44, 3.28s/it]g-point operations will not be computed-28 10:26:43,389 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 655/1784 [50:12<1:03:42, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 655/1784 [50:12<1:03:42, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 656/1784 [50:16<1:05:04, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 656/1784 [50:16<1:05:04, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 656/1784 [50:16<1:05:04, 3.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 657/1784 [50:20<1:05:45, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 37%|████████████████████████████▋ | 657/1784 [50:20<1:05:45, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▋ | 657/1784 [50:20<1:05:45, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 658/1784 [50:23<1:06:21, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 658/1784 [50:23<1:06:21, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 658/1784 [50:23<1:06:21, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:21,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 659/1784 [50:27<1:06:49, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 659/1784 [50:27<1:06:49, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 660/1784 [50:30<1:06:42, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 660/1784 [50:30<1:06:42, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 
10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▊ | 660/1784 [50:30<1:06:42, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▉ | 661/1784 [50:34<1:06:30, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▉ | 661/1784 [50:34<1:06:30, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▉ | 661/1784 [50:34<1:06:30, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|████████████████████████████▉ | 662/1784 [50:37<1:06:13, 3.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:48,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:27:48,177 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2834, 'learning_rate': 8.761682242990654e-06, 'epoch': 0.37} [WARNING|modeling_utils.py:388] 2022-02-28 10:27:48,177 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 664/1784 [50:44<1:05:35, 3.51s/it]g-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 664/1784 [50:44<1:05:35, 3.51s/it]g-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 664/1784 [50:44<1:05:35, 3.51s/it]g-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 665/1784 [50:48<1:05:26, 3.51s/it]g-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 665/1784 [50:48<1:05:26, 3.51s/it]g-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 665/1784 [50:48<1:05:26, 3.51s/it]g-point operations will not be computed-28 10:27:35,943 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████ | 666/1784 [50:51<1:05:22, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 667/1784 [50:55<1:05:04, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 37%|█████████████████████████████▏ | 667/1784 [50:55<1:05:04, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1687, 'learning_rate': 8.730529595015576e-06, 'epoch': 0.37} 37%|█████████████████████████████▏ | 667/1784 [50:55<1:05:04, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▏ | 668/1784 [50:58<1:04:40, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:08,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:08,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1119, 'learning_rate': 8.714953271028038e-06, 'epoch': 0.38} [WARNING|modeling_utils.py:388] 2022-02-28 10:28:08,946 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▎ | 670/1784 [51:05<1:03:30, 3.42s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▎ | 670/1784 
[51:05<1:03:30, 3.42s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▎ | 670/1784 [51:05<1:03:30, 3.42s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▎ | 671/1784 [51:08<1:02:57, 3.39s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:19,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:19,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3189, 'learning_rate': 8.69158878504673e-06, 'epoch': 0.38} [WARNING|modeling_utils.py:388] 2022-02-28 10:28:19,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▍ | 673/1784 [51:15<1:02:38, 3.38s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▍ | 673/1784 [51:15<1:02:38, 3.38s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 38%|█████████████████████████████▍ | 673/1784 [51:15<1:02:38, 3.38s/it]g-point operations will not be computed-28 10:28:00,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▍ | 674/1784 [51:18<1:02:21, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▌ | 675/1784 [51:22<1:02:02, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▌ | 675/1784 [51:22<1:02:02, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9901, 'learning_rate': 8.668224299065421e-06, 'epoch': 0.38} 38%|█████████████████████████████▌ | 675/1784 [51:22<1:02:02, 3.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▌ | 676/1784 [51:25<1:01:43, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:35,601 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:35,601 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:27,457 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed {'loss': 4.1611, 'learning_rate': 8.652647975077882e-06, 'epoch': 0.38} [WARNING|modeling_utils.py:388] 2022-02-28 10:28:35,601 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▋ | 678/1784 [51:31<1:00:32, 3.28s/it]g-point operations will not be computed-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▋ | 678/1784 [51:31<1:00:32, 3.28s/it]g-point operations will not be computed-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|█████████████████████████████▋ | 678/1784 [51:31<1:00:32, 3.28s/it]g-point operations will not be computed-28 10:28:27,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▍ | 679/1784 [51:35<59:58, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:43,624 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▍ | 679/1784 [51:35<59:58, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:43,624 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▍ | 680/1784 [51:38<59:27, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:43,624 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▍ | 680/1784 [51:38<59:27, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:43,624 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed 38%|██████████████████████████████▌ | 681/1784 [51:41<58:55, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:28:43,624 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:28:51,437 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▋ | 683/1784 [51:47<57:40, 3.14s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:28:57,517 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▋ | 685/1784 [51:53<56:06, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:02,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 38%|██████████████████████████████▊ | 686/1784 [51:56<55:16, 3.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:02,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
39%|██████████████████████████████▊ | 687/1784 [51:59<54:16, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:07,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|██████████████████████████████▊ | 688/1784 [52:02<53:23, 2.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:07,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:29:11,766 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:29:14,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2714, 'learning_rate': 8.55140186915888e-06, 'epoch': 0.39}
39%|██████████████████████████████▉ | 691/1784 [52:10<49:14, 2.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:18,192 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 692/1784 [52:12<47:30, 2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:20,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 693/1784 [52:14<45:12, 2.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:22,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████ | 694/1784 [52:16<42:38, 2.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:24,535 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed 39%|███████████████████████████████▏ | 695/1784 [52:18<40:00, 2.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:26,291 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▎ | 697/1784 [52:21<33:48, 1.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:27,846 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9704, 'learning_rate': 8.496884735202494e-06, 'epoch': 0.39} 39%|███████████████████████████████▎ | 698/1784 [52:22<30:50, 1.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:30,482 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 700/1784 [52:25<29:05, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:31,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 700/1784 [52:26<29:05,
1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:34,870 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 701/1784 [52:29<41:47, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:34,870 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 701/1784 [52:29<41:47, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:38,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▍ | 702/1784 [52:33<49:36, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:38,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1611, 'learning_rate': 8.457943925233646e-06, 'epoch': 0.39} 39%|███████████████████████████████▌ | 703/1784 [52:37<54:49, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:38,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1531, 'learning_rate': 8.450155763239875e-06, 'epoch': 0.39} 39%|███████████████████████████████▌ | 703/1784 [52:37<54:49,
3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:38,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 39%|███████████████████████████████▌ | 704/1784 [52:41<58:04, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:38,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|██████████████████████████████▊ | 705/1784 [52:44<1:01:20, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:29:38,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:29:55,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0972, 'learning_rate': 8.426791277258569e-06, 'epoch': 0.4} 40%|██████████████████████████████▉ |
707/1784 [52:52<1:03:23, 3.53s/it] {'loss': 3.9853, 'learning_rate': 8.419003115264797e-06, 'epoch': 0.4} 40%|██████████████████████████████▉ | 708/1784 [52:55<1:03:46, 3.56s/it] 40%|██████████████████████████████▉ | 709/1784 [52:59<1:03:46, 3.56s/it] 40%|███████████████████████████████ | 710/1784 [53:02<1:03:14, 3.53s/it][WARNING|modeling_utils.py:388]
2022-02-28 10:30:11,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████ | 711/1784 [53:06<1:02:53, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:11,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2635, 'learning_rate': 8.38785046728972e-06, 'epoch': 0.4} 40%|███████████████████████████████▏ | 712/1784 [53:09<1:02:25, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:11,457 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▏ | 713/1784 [53:13<1:02:32, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:21,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 40%|███████████████████████████████▏ | 714/1784 [53:16<1:02:13, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:21,906 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed {'loss': 4.5115, 'learning_rate': 8.364485981308411e-06, 'epoch': 0.4} 40%|███████████████████████████████▎ | 715/1784 [53:20<1:02:01, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:21,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1638, 'learning_rate': 8.356697819314642e-06, 'epoch': 0.4} 40%|███████████████████████████████▎ | 716/1784 [53:23<1:01:45, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:21,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:30:33,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1803, 'learning_rate': 8.341121495327103e-06, 'epoch': 0.4}
40%|███████████████████████████████▍ | 718/1784 [53:30<1:01:20, 3.45s/it] {'loss': 4.3886, 'learning_rate': 8.333333333333334e-06, 'epoch': 0.4} 40%|███████████████████████████████▍ | 719/1784 [53:33<1:01:17, 3.45s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:30:44,153 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2577, 'learning_rate': 8.317757009345795e-06, 'epoch': 0.4} 40%|███████████████████████████████▌ | 721/1784 [53:40<1:00:15, 3.40s/it]
{'loss': 4.3291, 'learning_rate': 8.309968847352025e-06, 'epoch': 0.4} 40%|████████████████████████████████▍ | 722/1784 [53:43<59:49, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:52,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▍ | 723/1784 [53:47<59:19, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:52,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1644, 'learning_rate': 8.294392523364487e-06, 'epoch': 0.41} 41%|████████████████████████████████▍ | 724/1784 [53:50<58:41, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:30:52,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:31:00,538 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed {'loss': 4.2612, 'learning_rate': 8.278816199376948e-06, 'epoch': 0.41} 41%|████████████████████████████████▌ | 726/1784 [53:56<57:21, 3.25s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:31:06,907 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8837, 'learning_rate': 8.263239875389409e-06, 'epoch': 0.41} 41%|████████████████████████████████▋ | 728/1784 [54:03<56:31,
3.21s/it] 41%|████████████████████████████████▋ | 729/1784 [54:06<56:33, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:14,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▋ | 730/1784 [54:09<56:19, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:14,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1302, 'learning_rate': 8.2398753894081e-06, 'epoch': 0.41} 41%|████████████████████████████████▊ | 731/1784 [54:12<55:46, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:14,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:31:22,694 >> Could not estimate the number
of tokens of the input, floating-point operations will not be computed {'loss': 4.2175, 'learning_rate': 8.224299065420562e-06, 'epoch': 0.41} 41%|████████████████████████████████▊ | 733/1784 [54:18<54:29, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:27,256 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|████████████████████████████████▉ | 734/1784 [54:21<53:41, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:27,256 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2237, 'learning_rate': 8.208722741433023e-06, 'epoch': 0.41} 41%|████████████████████████████████▉ | 735/1784 [54:24<52:56, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:33,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████ | 736/1784 [54:27<52:06, 2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:33,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed [WARNING|modeling_utils.py:388] 2022-02-28 10:31:37,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.141, 'learning_rate': 8.185358255451715e-06, 'epoch': 0.41} 41%|█████████████████████████████████ | 738/1784 [54:33<49:50, 2.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:41,368 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▏ | 739/1784 [54:35<48:48, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:41,368 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1081, 'learning_rate': 8.169781931464174e-06, 'epoch': 0.41} 41%|█████████████████████████████████▏ | 739/1784 [54:35<48:48,
2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:41,368 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▏ | 740/1784 [54:38<47:48, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:46,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▏ | 741/1784 [54:40<46:28, 2.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:49,057 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▎ | 742/1784 [54:43<45:05, 2.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▎ | 743/1784 [54:45<43:15, 2.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:31:51,378 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:31:54,637 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:31:56,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:31:58,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:01,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6986, 'learning_rate': 8.099688473520249e-06, 'epoch': 0.42} [WARNING|modeling_utils.py:388] 2022-02-28 10:32:02,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:04,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:08,280 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
42%|█████████████████████████████████▋ | 752/1784 [55:05<47:35, 2.77s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:32:15,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2551, 'learning_rate': 8.060747663551402e-06, 'epoch': 0.42} 42%|█████████████████████████████████▊ | 754/1784 [55:12<56:14, 3.28s/it] {'loss': 3.9347, 'learning_rate': 8.052959501557634e-06, 
'epoch': 0.42} 42%|█████████████████████████████████▊ | 755/1784 [55:16<58:19, 3.40s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▊ | 755/1784 [55:16<58:19, 3.40s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0872, 'learning_rate': 8.045171339563863e-06, 'epoch': 0.42} 42%|█████████████████████████████████▉ | 756/1784 [55:20<59:16, 3.46s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 756/1784 [55:20<59:16, 3.46s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2794, 'learning_rate': 8.037383177570094e-06, 'epoch': 0.42} 42%|█████████████████████████████████▉ | 756/1784 [55:20<59:16, 3.46s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▉ | 757/1784 [55:23<59:58, 3.50s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:34,019 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:34,019 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 
10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1948, 'learning_rate': 8.021806853582555e-06, 'epoch': 0.42} 43%|█████████████████████████████████▏ | 759/1784 [55:30<1:00:24, 3.54s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|█████████████████████████████████▏ | 759/1784 [55:30<1:00:24, 3.54s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.248, 'learning_rate': 8.014018691588785e-06, 'epoch': 0.43} 43%|█████████████████████████████████▏ | 759/1784 [55:30<1:00:24, 3.54s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|█████████████████████████████████▏ | 760/1784 [55:34<1:00:30, 3.55s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:44,686 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:44,686 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4344, 'learning_rate': 7.998442367601246e-06, 'epoch': 0.43} 43%|█████████████████████████████████▎ | 762/1784 [55:41<1:00:02, 3.52s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 43%|█████████████████████████████████▎ | 762/1784 [55:41<1:00:02, 3.52s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9615, 'learning_rate': 7.990654205607477e-06, 'epoch': 0.43} 43%|██████████████████████████████████▏ | 763/1784 [55:44<59:48, 3.51s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▏ | 763/1784 [55:44<59:48, 3.51s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2202, 'learning_rate': 7.982866043613708e-06, 'epoch': 0.43} 43%|██████████████████████████████████▏ | 763/1784 [55:44<59:48, 3.51s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|█████████████████████████████████▍ | 764/1784 [55:48<1:00:04, 3.53s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:58,695 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:32:58,695 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.93, 'learning_rate': 7.967289719626169e-06, 
'epoch': 0.43} 43%|██████████████████████████████████▎ | 766/1784 [55:55<59:05, 3.48s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▎ | 766/1784 [55:55<59:05, 3.48s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0406, 'learning_rate': 7.9595015576324e-06, 'epoch': 0.43} 43%|██████████████████████████████████▎ | 766/1784 [55:55<59:05, 3.48s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▍ | 767/1784 [55:58<58:31, 3.45s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:08,918 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:08,918 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9428, 'learning_rate': 7.94392523364486e-06, 'epoch': 0.43} 43%|██████████████████████████████████▍ | 769/1784 [56:05<58:00, 3.43s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▍ | 769/1784 [56:05<58:00, 3.43s/it]g-point operations will not be 
computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4859, 'learning_rate': 7.936137071651091e-06, 'epoch': 0.43} 43%|██████████████████████████████████▍ | 769/1784 [56:05<58:00, 3.43s/it]g-point operations will not be computed-28 10:31:51,378 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 770/1784 [56:08<57:32, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 771/1784 [56:12<57:13, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 771/1784 [56:12<57:13, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0879, 'learning_rate': 7.920560747663552e-06, 'epoch': 0.43} 43%|██████████████████████████████████▌ | 772/1784 [56:15<57:03, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 772/1784 [56:15<57:03, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:25,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 
10:33:25,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0768, 'learning_rate': 7.904984423676013e-06, 'epoch': 0.43} 43%|██████████████████████████████████▋ | 774/1784 [56:22<56:06, 3.33s/it]g-point operations will not be computed-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 774/1784 [56:22<56:06, 3.33s/it]g-point operations will not be computed-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2007, 'learning_rate': 7.897196261682244e-06, 'epoch': 0.43} 43%|██████████████████████████████████▋ | 774/1784 [56:22<56:06, 3.33s/it]g-point operations will not be computed-28 10:33:17,416 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 775/1784 [56:25<55:31, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 776/1784 [56:28<55:02, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▊ | 776/1784 [56:28<55:02, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1272, 'learning_rate': 7.881619937694705e-06, 'epoch': 0.43} 43%|██████████████████████████████████▊ | 776/1784 [56:28<55:02, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:33,872 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 44%|██████████████████████████████████▊ | 777/1784 [56:31<54:40, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:41,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:41,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3929, 'learning_rate': 7.866043613707166e-06, 'epoch': 0.44} [WARNING|modeling_utils.py:388] 2022-02-28 10:33:41,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|██████████████████████████████████▉ | 779/1784 [56:38<53:43, 3.21s/it]g-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:48,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:48,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed {'loss': 4.3425, 'learning_rate': 7.850467289719627e-06, 'epoch': 0.44} 44%|███████████████████████████████████ | 781/1784 [56:44<52:50, 3.16s/it]g-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████ | 781/1784 [56:44<52:50, 3.16s/it]g-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:54,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:33:54,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3224, 'learning_rate': 7.834890965732088e-06, 'epoch': 0.44} 44%|███████████████████████████████████ | 783/1784 [56:50<51:40, 3.10s/it]g-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████ | 783/1784 [56:50<51:40, 3.10s/it]g-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:00,180 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 
10:34:00,180 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1546, 'learning_rate': 7.81931464174455e-06, 'epoch': 0.44} [WARNING|modeling_utils.py:388] 2022-02-28 10:34:00,180 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:33:33,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▏ | 785/1784 [56:56<49:53, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:04,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▏ | 786/1784 [56:59<49:05, 2.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:04,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▏ | 786/1784 [56:59<49:05, 2.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:04,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:08,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:04,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:08,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:04,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8694, 'learning_rate': 7.79595015576324e-06, 'epoch': 0.44} [WARNING|modeling_utils.py:388] 2022-02-28 10:34:08,667 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:04,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▎ | 788/1784 [57:04<47:07, 2.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 789/1784 [57:07<45:57, 2.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 789/1784 [57:07<45:57, 2.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:16,570 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:16,570 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:19,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:19,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:12,755 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed {'loss': 4.4553, 'learning_rate': 7.764797507788162e-06, 'epoch': 0.44} [WARNING|modeling_utils.py:388] 2022-02-28 10:34:19,024 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:12,755 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 792/1784 [57:14<41:55, 2.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:22,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 792/1784 [57:14<41:55, 2.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:22,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 793/1784 [57:16<40:13, 2.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:24,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▌ | 793/1784 [57:16<40:13, 2.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:24,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▌ | 794/1784 [57:18<38:22, 2.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:26,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▌ | 794/1784 [57:18<38:22, 2.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:26,635 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▋ | 795/1784 [57:20<36:15, 2.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:28,449 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 45%|███████████████████████████████████▋ | 795/1784 [57:20<36:15, 2.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:28,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▋ | 797/1784 [57:23<31:30, 1.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:30,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▋ | 797/1784 [57:23<31:30, 1.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:30,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▊ | 798/1784 [57:25<28:55, 1.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:32,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▊ | 798/1784 [57:25<28:55, 1.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:32,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6659, 'learning_rate': 7.710280373831777e-06, 'epoch': 0.45} 45%|███████████████████████████████████▊ | 799/1784 [57:26<26:19, 1.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:34,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▊ | 799/1784 [57:26<26:19, 1.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:34,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▊ | 800/1784 [57:28<26:58, 1.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:34,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
45%|███████████████████████████████████▉ | 801/1784 [57:32<38:35, 2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:37,304 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▉ | 801/1784 [57:32<38:35, 2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:37,304 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▉ | 801/1784 [57:32<38:35, 2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▉ | 802/1784 [57:36<45:39, 2.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|███████████████████████████████████▉ | 802/1784 [57:36<45:39, 2.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1487, 'learning_rate': 7.679127725856698e-06, 'epoch': 0.45} 45%|████████████████████████████████████ | 803/1784 [57:39<50:23, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 803/1784 [57:39<50:23, 3.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1774, 'learning_rate': 7.671339563862929e-06, 'epoch': 0.45} 45%|████████████████████████████████████ | 804/1784 [57:43<53:18, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 45%|████████████████████████████████████ | 804/1784 [57:43<53:18, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1412, 'learning_rate': 7.663551401869159e-06, 'epoch': 0.45} 45%|████████████████████████████████████ | 805/1784 [57:47<55:19, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████ | 805/1784 [57:47<55:19, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:57,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:34:57,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2893, 'learning_rate': 7.64797507788162e-06, 'epoch': 0.45} 45%|████████████████████████████████████▏ | 807/1784 [57:54<57:05, 3.51s/it]g-point operations will not be computed-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▏ | 807/1784 [57:54<57:05, 3.51s/it]g-point operations will not be computed-28 10:34:41,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1144, 'learning_rate': 7.64018691588785e-06, 'epoch': 0.45} 
45%|████████████████████████████████████▏ | 808/1784 [57:58<57:25, 3.53s/it]
{'loss': 3.8777, 'learning_rate': 7.632398753894081e-06, 'epoch': 0.45}
45%|████████████████████████████████████▎ | 809/1784 [58:01<57:25, 3.53s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:35:12,110 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2677, 'learning_rate': 7.616822429906543e-06, 'epoch': 0.45}
45%|████████████████████████████████████▎ | 811/1784 [58:08<57:13, 3.53s/it]
{'loss': 4.2522, 'learning_rate': 7.609034267912772e-06, 'epoch': 0.45}
46%|████████████████████████████████████▍ | 812/1784 [58:12<56:45, 3.50s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:35:22,507 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1521, 'learning_rate': 7.593457943925234e-06, 'epoch': 0.46}
46%|████████████████████████████████████▌ | 814/1784 [58:19<56:33, 3.50s/it]
{'loss': 4.016, 'learning_rate': 7.585669781931465e-06, 'epoch': 0.46}
46%|████████████████████████████████████▌ | 815/1784 [58:22<56:13, 3.48s/it]
{'loss': 4.1516, 'learning_rate': 7.5778816199376945e-06, 'epoch': 0.46}
46%|████████████████████████████████████▌ | 816/1784 [58:26<56:07, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:35:34,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
46%|████████████████████████████████████▋ | 817/1784 [58:29<56:01, 3.48s/it]
{'loss': 4.0589, 'learning_rate': 7.5623052959501565e-06, 'epoch': 0.46}
46%|████████████████████████████████████▋ | 818/1784 [58:33<55:47, 3.46s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:35:43,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1802, 'learning_rate': 7.546728971962617e-06, 'epoch': 0.46}
46%|████████████████████████████████████▊ | 820/1784 [58:39<54:58, 3.42s/it]
46%|████████████████████████████████████▊ | 821/1784 [58:43<54:29, 3.40s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:35:53,273 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
46%|████████████████████████████████████▉ | 823/1784 [58:49<53:38, 3.35s/it]
46%|████████████████████████████████████▉ | 824/1784 [58:53<53:16, 3.33s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:03,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1595, 'learning_rate': 7.500000000000001e-06, 'epoch': 0.46}
46%|█████████████████████████████████████ | 826/1784 [58:59<52:37, 3.30s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:09,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3047, 'learning_rate': 7.484423676012462e-06, 'epoch': 0.46}
46%|█████████████████████████████████████▏ | 828/1784 [59:05<51:54, 3.26s/it]
46%|█████████████████████████████████████▏ | 829/1784 [59:09<51:35, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:36:17,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|█████████████████████████████████████▏ | 830/1784 [59:12<51:07, 3.22s/it]
47%|█████████████████████████████████████▎ | 831/1784 [59:15<50:48, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:36:23,913 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|█████████████████████████████████████▎ | 832/1784 [59:18<50:04, 3.16s/it]
47%|█████████████████████████████████████▎ | 833/1784 [59:21<49:38, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:36:30,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|█████████████████████████████████████▍ | 834/1784 [59:24<49:10, 3.11s/it]
47%|█████████████████████████████████████▍ | 835/1784 [59:27<48:42, 3.08s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:37,420 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1249, 'learning_rate': 7.4143302180685364e-06, 'epoch': 0.47}
47%|█████████████████████████████████████▌ | 837/1784 [59:33<46:56, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:36:41,737 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|█████████████████████████████████████▌ | 838/1784 [59:36<46:08, 2.93s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:45,834 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0376, 'learning_rate': 7.390965732087229e-06, 'epoch': 0.47}
47%|█████████████████████████████████████▋ | 840/1784 [59:41<44:20, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:36:49,876 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
47%|█████████████████████████████████████▋ | 841/1784 [59:44<43:11, 2.75s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:53,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:55,914 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1591, 'learning_rate': 7.35981308411215e-06, 'epoch': 0.47}
[WARNING|modeling_utils.py:388] 2022-02-28 10:36:58,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:00,113 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:01,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:03,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:06,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:08,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6893, 'learning_rate': 7.313084112149533e-06, 'epoch': 0.48}
{'loss': 4.7372, 'learning_rate': 7.305295950155764e-06, 'epoch': 0.48}
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:12,051 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
48%|█████████████████████████████████████▎ | 852/1784 [1:00:09<43:50, 2.82s/it]
48%|█████████████████████████████████████▎ | 853/1784 [1:00:12<48:08, 3.10s/it]
48%|█████████████████████████████████████▎ | 854/1784 [1:00:16<50:49, 3.28s/it]
48%|█████████████████████████████████████▍ | 855/1784 [1:00:20<52:34, 3.40s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:30,726 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0841, 'learning_rate': 7.258566978193147e-06, 'epoch': 0.48}
48%|█████████████████████████████████████▍ | 857/1784 [1:00:27<54:48, 3.55s/it]
{'loss': 4.1443, 'learning_rate': 7.2507788161993775e-06, 'epoch': 0.48}
48%|█████████████████████████████████████▌ | 858/1784 [1:00:31<55:04, 3.57s/it]
{'loss': 4.46, 'learning_rate': 7.242990654205608e-06, 'epoch': 0.48}
48%|█████████████████████████████████████▌ | 859/1784 [1:00:34<55:10, 3.58s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:45,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:48,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.8453, 'learning_rate': 7.2196261682243e-06, 'epoch': 0.48}
48%|█████████████████████████████████████▋ | 862/1784 [1:00:45<54:48, 3.57s/it]
{'loss': 4.0966, 'learning_rate': 7.21183800623053e-06, 'epoch': 0.48}
48%|█████████████████████████████████████▋ | 863/1784 [1:00:49<54:36, 3.56s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:37:59,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0659, 'learning_rate': 7.196261682242991e-06, 'epoch': 0.48}
48%|█████████████████████████████████████▊ | 865/1784 [1:00:56<53:54, 3.52s/it]
{'loss': 4.0819, 'learning_rate': 7.188473520249222e-06, 'epoch': 0.48}
49%|█████████████████████████████████████▊ | 866/1784 [1:00:59<53:40, 3.51s/it]
{'loss': 4.3268, 'learning_rate': 7.180685358255453e-06, 'epoch': 0.49}
49%|█████████████████████████████████████▉ | 867/1784 [1:01:02<53:24, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:11,556 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|█████████████████████████████████████▉ | 868/1784 [1:01:06<53:03, 3.48s/it]
{'loss': 4.2028, 'learning_rate': 7.165109034267913e-06, 'epoch': 0.49}
49%|█████████████████████████████████████▉ | 869/1784 [1:01:09<52:36, 3.45s/it]
{'loss': 4.1953, 'learning_rate': 7.1573208722741435e-06, 'epoch': 0.49}
49%|██████████████████████████████████████ | 870/1784 [1:01:13<52:22, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:21,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|██████████████████████████████████████ | 871/1784 [1:01:16<51:58, 3.42s/it]
{'loss': 4.1843, 'learning_rate': 7.1417445482866054e-06, 'epoch': 0.49}
49%|██████████████████████████████████████▏ | 872/1784 [1:01:19<51:30, 3.39s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:38:30,012 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1832, 'learning_rate': 7.126168224299066e-06, 'epoch': 0.49}
49%|██████████████████████████████████████▏ | 874/1784 [1:01:26<50:42, 3.34s/it]
{'loss': 4.1114, 'learning_rate': 7.118380062305297e-06, 'epoch': 0.49}
49%|██████████████████████████████████████▎ | 875/1784 [1:01:29<50:09, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:38,210 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
49%|██████████████████████████████████████▎ | 876/1784 [1:01:32<49:42, 3.28s/it]
{'loss': 4.1978, 'learning_rate': 7.1028037383177574e-06, 'epoch': 0.49}
49%|██████████████████████████████████████▎ | 877/1784 [1:01:36<49:26, 3.27s/it]
{'loss': 4.5599, 'learning_rate': 7.095015576323988e-06, 'epoch': 0.49}
49%|██████████████████████████████████████▍ | 878/1784 [1:01:39<49:06, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:47,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▍ | 879/1784 [1:01:42<48:43, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:47,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▍ | 879/1784 [1:01:42<48:43, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:47,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1294, 'learning_rate': 7.07943925233645e-06, 'epoch': 0.49} 49%|██████████████████████████████████████▍ | 879/1784 [1:01:42<48:43, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:47,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▍ | 880/1784 [1:01:45<48:26, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:54,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▌ | 881/1784 [1:01:48<47:58, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:54,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▌ | 881/1784 [1:01:48<47:58, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:54,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3559, 'learning_rate': 7.06386292834891e-06, 'epoch': 0.49} 49%|██████████████████████████████████████▌ | 881/1784 [1:01:48<47:58, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:38:54,187 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▌ | 882/1784 [1:01:51<47:29, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▌ | 883/1784 [1:01:55<47:13, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|██████████████████████████████████████▌ | 883/1784 [1:01:55<47:13, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:04,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:04,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3173, 'learning_rate': 7.040498442367601e-06, 'epoch': 0.5} 50%|██████████████████████████████████████▋ | 885/1784 [1:02:00<45:30, 3.04s/it]g-point operations will not be computed-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▋ | 885/1784 [1:02:00<45:30, 3.04s/it]g-point operations will not be computed-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:10,611 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:10,611 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:00,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4091, 'learning_rate': 7.024922118380063e-06, 'epoch': 0.5} 50%|██████████████████████████████████████▊ | 887/1784 [1:02:06<43:48, 2.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:14,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▊ | 887/1784 [1:02:06<43:48, 2.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:14,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▊ | 888/1784 [1:02:09<42:58, 2.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:14,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▊ | 888/1784 [1:02:09<42:58, 2.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:14,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:18,834 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:14,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:18,834 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:14,859 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed {'loss': 3.9532, 'learning_rate': 7.001557632398755e-06, 'epoch': 0.5} 50%|██████████████████████████████████████▉ | 890/1784 [1:02:14<40:51, 2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:22,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▉ | 890/1784 [1:02:14<40:51, 2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:22,723 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▉ | 891/1784 [1:02:17<39:30, 2.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:25,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|██████████████████████████████████████▉ | 891/1784 [1:02:17<39:30, 2.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:25,070 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████ | 892/1784 [1:02:19<37:48, 2.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:27,307 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████ | 892/1784 [1:02:19<37:48, 2.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:27,307 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████ | 893/1784 [1:02:21<36:11, 2.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:29,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████ | 893/1784 [1:02:21<36:11, 2.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:29,423 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 50%|███████████████████████████████████████ | 894/1784 [1:02:23<34:27, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:31,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████ | 894/1784 [1:02:23<34:27, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:31,405 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▏ | 895/1784 [1:02:25<32:29, 2.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:33,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▏ | 895/1784 [1:02:25<32:29, 2.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:33,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2889, 'learning_rate': 6.954828660436138e-06, 'epoch': 0.5} {'loss': 4.082, 'learning_rate': 6.947040498442368e-06, 'epoch': 0.5} 50%|███████████████████████████████████████▏ | 897/1784 [1:02:28<27:58, 1.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:34,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▏ | 897/1784 [1:02:28<27:58, 1.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:34,849 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 898/1784 [1:02:30<26:02, 1.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:36,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 898/1784 [1:02:30<26:02, 1.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 
10:39:36,373 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 899/1784 [1:02:31<23:43, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:38,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 899/1784 [1:02:31<23:43, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:38,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 900/1784 [1:02:33<24:13, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:38,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 900/1784 [1:02:33<24:13, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|███████████████████████████████████████▎ | 900/1784 [1:02:33<24:13, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▍ | 901/1784 [1:02:37<34:23, 2.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:47,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:39:47,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0334, 'learning_rate': 6.900311526479751e-06, 'epoch': 0.51} [WARNING|modeling_utils.py:388] 2022-02-28 10:39:47,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▍ | 903/1784 [1:02:44<44:56, 3.06s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▍ | 903/1784 [1:02:44<44:56, 3.06s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▍ | 903/1784 [1:02:44<44:56, 3.06s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 904/1784 [1:02:48<47:47, 3.26s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 904/1784 [1:02:48<47:47, 3.26s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 904/1784 [1:02:48<47:47, 3.26s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 905/1784 [1:02:51<49:16, 
3.36s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 905/1784 [1:02:51<49:16, 3.36s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 905/1784 [1:02:51<49:16, 3.36s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▌ | 906/1784 [1:02:55<50:15, 3.43s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:40:06,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:40:06,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2303, 'learning_rate': 6.861370716510904e-06, 'epoch': 0.51} [WARNING|modeling_utils.py:388] 2022-02-28 10:40:06,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▋ | 908/1784 [1:03:02<51:19, 3.52s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 51%|███████████████████████████████████████▋ | 908/1784 [1:03:02<51:19, 3.52s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▋ | 908/1784 [1:03:02<51:19, 3.52s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▋ | 909/1784 [1:03:06<51:31, 3.53s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▋ | 909/1784 [1:03:06<51:31, 3.53s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▋ | 909/1784 [1:03:06<51:31, 3.53s/it]g-point operations will not be computed-28 10:39:42,061 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▊ | 910/1784 [1:03:09<51:20, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▊ | 911/1784 [1:03:13<51:09, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▊ | 911/1784 [1:03:13<51:09, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2121, 
'learning_rate': 6.8302180685358264e-06, 'epoch': 0.51} 51%|███████████████████████████████████████▊ | 911/1784 [1:03:13<51:09, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▊ | 912/1784 [1:03:16<50:50, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▊ | 912/1784 [1:03:16<50:50, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▊ | 912/1784 [1:03:16<50:50, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:18,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▉ | 913/1784 [1:03:20<50:45, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▉ | 914/1784 [1:03:23<50:19, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|███████████████████████████████████████▉ | 914/1784 [1:03:23<50:19, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1481, 'learning_rate': 6.806853582554518e-06, 'epoch': 0.51} 51%|███████████████████████████████████████▉ | 914/1784 [1:03:23<50:19, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:28,845 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 51%|████████████████████████████████████████ | 915/1784 [1:03:27<49:50, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:40:37,294 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:40:37,294 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.151, 'learning_rate': 6.791277258566978e-06, 'epoch': 0.51} 51%|████████████████████████████████████████ | 917/1784 [1:03:33<49:30, 3.43s/it]g-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|████████████████████████████████████████ | 917/1784 [1:03:33<49:30, 3.43s/it]g-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0026, 'learning_rate': 6.783489096573209e-06, 'epoch': 0.51} 51%|████████████████████████████████████████ | 917/1784 [1:03:33<49:30, 3.43s/it]g-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|████████████████████████████████████████▏ | 918/1784 [1:03:37<48:59, 3.39s/it]g-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
51%|████████████████████████████████████████▏ | 918/1784 [1:03:37<48:59, 3.39s/it]g-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|████████████████████████████████████████▏ | 918/1784 [1:03:37<48:59, 3.39s/it]g-point operations will not be computed-28 10:40:28,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▏ | 919/1784 [1:03:40<48:50, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▏ | 919/1784 [1:03:40<48:50, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▏ | 920/1784 [1:03:43<48:47, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▏ | 920/1784 [1:03:43<48:47, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▏ | 920/1784 [1:03:43<48:47, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▎ | 921/1784 [1:03:47<48:32, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:40:57,491 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:40:57,491 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9485, 'learning_rate': 6.744548286604362e-06, 'epoch': 0.52} [WARNING|modeling_utils.py:388] 2022-02-28 10:40:57,491 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▎ | 923/1784 [1:03:53<47:57, 3.34s/it]g-point operations will not be computed-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▎ | 923/1784 [1:03:53<47:57, 3.34s/it]g-point operations will not be computed-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▎ | 923/1784 [1:03:53<47:57, 3.34s/it]g-point operations will not be computed-28 10:40:49,134 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▍ | 924/1784 [1:03:57<47:42, 3.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▍ | 925/1784 [1:04:00<46:54, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:05,690 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▍ | 925/1784 [1:04:00<46:54, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3378, 'learning_rate': 6.7211838006230535e-06, 'epoch': 0.52} 52%|████████████████████████████████████████▍ | 925/1784 [1:04:00<46:54, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▍ | 926/1784 [1:04:03<46:36, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:41:13,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:41:13,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1078, 'learning_rate': 6.705607476635515e-06, 'epoch': 0.52} [WARNING|modeling_utils.py:388] 2022-02-28 10:41:13,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▌ | 928/1784 [1:04:09<46:04, 3.23s/it]g-point operations will not be computed-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 52%|████████████████████████████████████████▌ | 928/1784 [1:04:09<46:04, 3.23s/it]g-point operations will not be computed-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▌ | 928/1784 [1:04:09<46:04, 3.23s/it]g-point operations will not be computed-28 10:41:05,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▌ | 929/1784 [1:04:13<45:51, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:21,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▌ | 929/1784 [1:04:13<45:51, 3.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:21,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▋ | 930/1784 [1:04:16<45:28, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:21,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▋ | 930/1784 [1:04:16<45:28, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:21,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▋ | 930/1784 [1:04:16<45:28, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:21,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▋ | 931/1784 [1:04:19<45:05, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:27,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
52%|████████████████████████████████████████▋ | 932/1784 [1:04:22<44:45, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:27,891 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▊ | 933/1784 [1:04:25<44:13, 3.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:33,953 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|████████████████████████████████████████▊ | 934/1784 [1:04:28<43:24, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:33,953 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:41:38,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2833, 'learning_rate': 6.643302180685359e-06, 'epoch': 0.52} 52%|████████████████████████████████████████▉ | 936/1784 [1:04:34<42:21, 3.00s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:41:44,100 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
53%|█████████████████████████████████████████ | 938/1784 [1:04:40<41:06, 2.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:48,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████ | 939/1784 [1:04:42<40:10, 2.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:48,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:41:52,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1937, 'learning_rate': 6.596573208722742e-06, 'epoch': 0.53} [WARNING|modeling_utils.py:388] 2022-02-28 10:41:54,780 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed 53%|█████████████████████████████████████████▏ | 942/1784 [1:04:50<36:34, 2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:41:58,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▏ | 943/1784 [1:04:52<34:43, 2.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:00,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▎ | 944/1784 [1:04:54<32:39, 2.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:02,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▎ | 946/1784 [1:04:58<28:31, 2.04s/it][WARNING|modeling_utils.py:388] 2022-02-28
10:42:04,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4111, 'learning_rate': 6.557632398753895e-06, 'epoch': 0.53} 53%|█████████████████████████████████████████▍ | 947/1784 [1:04:59<26:22, 1.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:05,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▍ | 949/1784 [1:05:02<22:00, 1.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:08,436 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▌ | 950/1784 [1:05:03<22:28, 1.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:09,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▌ | 950/1784 [1:05:03<22:28, 1.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:12,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▌ | 951/1784 [1:05:07<32:12,
2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:12,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▌ | 951/1784 [1:05:07<32:12, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:16,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▌ | 952/1784 [1:05:11<38:21, 2.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:16,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▋ | 953/1784 [1:05:15<42:11, 3.05s/it][WARNING|modeling_utils.py:388] 2022-02-28
10:42:16,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|█████████████████████████████████████████▋ | 954/1784 [1:05:19<44:49, 3.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:16,609 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|█████████████████████████████████████████▊ | 955/1784 [1:05:22<46:35, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:31,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0881, 'learning_rate': 6.479750778816199e-06, 'epoch': 0.54} 54%|█████████████████████████████████████████▊ | 956/1784 [1:05:26<47:37,
3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:31,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|█████████████████████████████████████████▊ | 957/1784 [1:05:30<48:33, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:31,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|█████████████████████████████████████████▉ | 958/1784 [1:05:33<49:02, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:31,419 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|█████████████████████████████████████████▉ | 959/1784 [1:05:37<49:19, 3.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:46,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|█████████████████████████████████████████▉ | 960/1784 [1:05:40<49:24, 3.60s/it][WARNING|modeling_utils.py:388] 2022-02-28
10:42:46,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1104, 'learning_rate': 6.448598130841122e-06, 'epoch': 0.54} 54%|██████████████████████████████████████████ | 961/1784 [1:05:44<49:10, 3.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:46,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|██████████████████████████████████████████ | 962/1784 [1:05:48<48:51,
3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:46,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|██████████████████████████████████████████ | 963/1784 [1:05:51<48:36, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:42:46,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:43:01,839 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1059, 'learning_rate': 6.417445482866044e-06, 'epoch': 0.54} 54%|██████████████████████████████████████████▏ | 965/1784 [1:05:58<47:47, 3.50s/it]g-point operations will not be computed-28 10:42:46,031 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed 54%|██████████████████████████████████████████▏ | 966/1784 [1:06:01<47:22, 3.48s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:43:12,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1785, 'learning_rate': 6.3940809968847365e-06, 'epoch': 0.54} 54%|██████████████████████████████████████████▎ | 968/1784 [1:06:08<46:59, 3.46s/it] 54%|██████████████████████████████████████████▎ |
969/1784 [1:06:12<46:45, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:43:20,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|██████████████████████████████████████████▍ | 970/1784 [1:06:15<46:41, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:43:20,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0963, 'learning_rate': 6.370716510903427e-06, 'epoch': 0.54} 54%|██████████████████████████████████████████▍ | 971/1784 [1:06:19<46:28, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:43:20,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:43:29,168 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0685, 'learning_rate': 6.355140186915888e-06, 'epoch': 0.54} 55%|██████████████████████████████████████████▌ | 973/1784 [1:06:25<45:37, 3.37s/it]g-point operations will not be
computed-28 10:43:20,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2269, 'learning_rate': 6.347352024922119e-06, 'epoch': 0.55} 55%|██████████████████████████████████████████▌ | 974/1784 [1:06:28<45:18, 3.36s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:43:39,081 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2635, 'learning_rate': 6.33177570093458e-06, 'epoch': 0.55} 55%|██████████████████████████████████████████▋ | 976/1784 [1:06:35<44:36, 3.31s/it]g-point operations will not be computed-28 10:43:20,772 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed 55%|██████████████████████████████████████████▋ | 977/1784 [1:06:38<44:15, 3.29s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:43:48,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1115, 'learning_rate': 6.308411214953272e-06, 'epoch': 0.55} 55%|██████████████████████████████████████████▊ | 979/1784 [1:06:45<43:37, 3.25s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 10:43:55,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0471, 'learning_rate': 6.292834890965732e-06, 'epoch': 0.55} 55%|██████████████████████████████████████████▉ | 981/1784 [1:06:51<42:39, 3.19s/it] {'loss': 3.907, 'learning_rate': 6.277258566978194e-06, 'epoch': 0.55} [WARNING|modeling_utils.py:388] 2022-02-28 10:44:01,361 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed 55%|██████████████████████████████████████████▉ | 983/1784 [1:06:57<41:27, 3.11s/it] [WARNING|modeling_utils.py:388] 2022-02-28 10:44:07,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2423, 'learning_rate': 6.2616822429906544e-06, 'epoch': 0.55} 55%|███████████████████████████████████████████ | 985/1784 [1:07:03<40:03, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:44:11,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|███████████████████████████████████████████ | 986/1784 [1:07:06<39:23, 2.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:44:11,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28
10:44:15,795 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.203, 'learning_rate': 6.238317757009347e-06, 'epoch': 0.55} 55%|███████████████████████████████████████████▏ | 988/1784 [1:07:11<37:33, 2.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|███████████████████████████████████████████▏ | 989/1784 [1:07:14<36:24, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:23,437 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:25,730 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:27,878 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:29,860 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:31,731 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:33,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1912, 'learning_rate': 6.176012461059191e-06, 'epoch': 0.56} [WARNING|modeling_utils.py:388] 2022-02-28 10:44:35,012 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 10:44:37,754 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed {'loss': 4.877, 'learning_rate': 6.152647975077882e-06, 'epoch': 0.56} [WARNING|modeling_utils.py:388] 2022-02-28 10:44:38,961 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|trainer.py:2369] 2022-02-28 10:44:40,947 >> Batch size = 8 0%| | 0/331 [00:00<?, ?it/s] 1%|▌ | 2/331 [00:02<06:44, 1.23s/it] 1%|▊ | 3/331 [00:04<09:14, 1.69s/it] 1%|█ | 4/331 [00:07<10:36, 1.95s/it] 2%|█▎ | 5/331 [00:09<12:13, 2.25s/it] 2%|█▌ | 6/331
[00:12<13:16, 2.45s/it] 2%|█▊ | 7/331 [00:15<13:24, 2.48s/it] 2%|██ | 8/331 [00:18<13:44, 2.55s/it] 3%|██▎ | 9/331 [00:21<14:20, 2.67s/it] 3%|██▍ | 10/331 [00:24<15:15, 2.85s/it] 3%|██▋ | 11/331 [00:26<14:40, 2.75s/it] 4%|██▉ | 12/331 [00:29<14:32, 2.74s/it] 4%|███▏ | 13/331 [00:32<14:16, 2.69s/it] 4%|███▍ | 14/331 [00:34<13:57, 2.64s/it] 5%|███▋ | 15/331 [00:38<15:15, 2.90s/it] 5%|███▉ | 16/331
[00:41<16:12, 3.09s/it] 5%|████▏ | 17/331 [00:44<16:15, 3.11s/it] 5%|████▍ | 18/331 [00:47<14:49, 2.84s/it] 6%|████▋ | 19/331 [00:49<14:36, 2.81s/it] 6%|████▉ | 20/331 [00:52<13:40, 2.64s/it] 6%|█████▏ | 21/331 [00:55<14:17, 2.77s/it] 7%|█████▍ | 22/331 [00:58<15:17, 2.97s/it] 7%|█████▋ | 23/331 [01:02<16:32, 3.22s/it] 7%|█████▉ | 24/331 [01:06<17:34, 3.43s/it] 8%|██████▏ | 25/331 [01:09<16:52, 3.31s/it]
8%|██████▍ | 26/331 [01:11<15:38, 3.08s/it] 8%|██████▋ | 27/331 [01:15<15:51, 3.13s/it] 8%|██████▉ | 28/331 [01:17<15:26, 3.06s/it] 9%|███████▏ | 29/331 [01:20<15:06, 3.00s/it] 9%|███████▍ | 30/331 [01:23<14:20, 2.86s/it] 9%|███████▋ | 31/331 [01:25<13:43, 2.74s/it] 10%|███████▉ | 32/331 [01:28<13:20, 2.68s/it] 10%|████████▏ | 33/331 [01:31<13:28, 2.71s/it] 10%|████████▍ | 34/331 [01:33<13:28, 2.72s/it] 11%|████████▋ | 35/331 [01:36<13:43, 2.78s/it]
11%|████████▉ | 36/331 [01:40<14:21, 2.92s/it] 11%|█████████▏ | 37/331 [01:43<15:07, 3.09s/it] 11%|█████████▍ | 38/331 [01:46<15:18, 3.14s/it] 12%|█████████▋ | 39/331 [01:49<15:22, 3.16s/it] 12%|█████████▉ | 40/331 [01:52<14:06, 2.91s/it] 12%|██████████▏ | 41/331 [01:54<13:32, 2.80s/it] 13%|██████████▍ | 42/331 [01:58<14:21, 2.98s/it] 13%|██████████▋ | 43/331 [02:01<15:02, 3.13s/it] 13%|██████████▉ | 44/331 [02:05<15:22, 3.22s/it] 14%|███████████▏ | 45/331 [02:07<14:34, 3.06s/it]
14%|███████████▍ | 46/331 [02:10<13:34, 2.86s/it] 14%|███████████▋ | 47/331 [02:12<12:47, 2.70s/it] 15%|███████████▉ | 48/331 [02:15<13:02, 2.76s/it] 15%|████████████▏ | 49/331 [02:18<13:34, 2.89s/it] 15%|████████████▍ | 50/331 [02:21<13:33, 2.89s/it] 15%|████████████▋ | 51/331 [02:24<13:45, 2.95s/it] 16%|████████████▉ | 52/331 [02:27<13:17, 2.86s/it] 16%|█████████████▏ | 53/331 [02:30<13:18, 2.87s/it] 16%|█████████████▍ | 54/331 [02:32<12:37, 2.74s/it]
17%|█████████████▋ | 55/331 [02:36<13:37, 2.96s/it] 17%|█████████████▊ | 56/331 [02:38<13:24, 2.92s/it] 17%|██████████████ | 57/331 [02:41<12:56, 2.83s/it] 18%|██████████████▎ | 58/331 [02:44<13:22, 2.94s/it] 18%|██████████████▌ | 59/331 [02:47<12:40, 2.79s/it] 18%|██████████████▊ | 60/331 [02:49<12:26, 2.76s/it] 18%|███████████████ | 61/331 [02:53<12:54, 2.87s/it] 19%|███████████████▎ | 62/331 [02:55<12:51, 2.87s/it] 19%|███████████████▌ | 63/331 [02:59<14:01, 3.14s/it] 19%|███████████████▊ | 64/331 [03:02<13:33, 3.05s/it]
20%|████████████████ | 65/331 [03:05<13:21, 3.01s/it] 20%|████████████████▎ | 66/331 [03:09<14:28, 3.28s/it] 20%|████████████████▌ | 67/331 [03:13<15:00, 3.41s/it] 21%|████████████████▊ | 68/331 [03:16<15:08, 3.45s/it] 21%|█████████████████ | 69/331 [03:19<14:43, 3.37s/it] 21%|█████████████████▎ | 70/331 [03:22<14:25, 3.31s/it] 21%|█████████████████▌ | 71/331 [03:26<14:34, 3.36s/it] 22%|█████████████████▊ | 72/331 [03:29<14:27, 3.35s/it] 22%|██████████████████ | 73/331 [03:32<13:57, 3.24s/it]
22%|██████████████████▎ | 74/331 [03:35<13:42, 3.20s/it] 23%|██████████████████▌ | 75/331 [03:39<13:49, 3.24s/it] 23%|██████████████████▊ | 76/331 [03:41<13:11, 3.10s/it] 23%|███████████████████ | 77/331 [03:44<12:53, 3.05s/it] 24%|███████████████████▎ | 78/331 [03:47<12:22, 2.93s/it] 24%|███████████████████▌ | 79/331 [03:50<11:52, 2.83s/it] 24%|███████████████████▊ | 80/331 [03:52<11:42, 2.80s/it] 24%|████████████████████ | 81/331 [03:56<12:09, 2.92s/it] 25%|████████████████████▎ | 82/331 [03:58<11:52, 2.86s/it]
25%|████████████████████▌ | 83/331 [04:02<12:19, 2.98s/it] 25%|████████████████████▊ | 84/331 [04:05<13:07, 3.19s/it] 26%|█████████████████████ | 85/331 [04:08<12:15, 2.99s/it] 26%|█████████████████████▎ | 86/331 [04:11<12:53, 3.16s/it] 26%|█████████████████████▌ | 87/331 [04:14<12:25, 3.05s/it] 27%|█████████████████████▊ | 88/331 [04:17<12:06, 2.99s/it] 27%|██████████████████████ | 89/331 [04:19<11:20, 2.81s/it] 27%|██████████████████████▎ | 90/331 [04:22<10:51, 2.70s/it] 27%|██████████████████████▌ | 91/331 [04:25<11:17, 2.82s/it]
28%|██████████████████████▊ | 92/331 [04:27<10:28, 2.63s/it] 28%|███████████████████████ | 93/331 [04:30<10:34, 2.67s/it] 28%|███████████████████████▎ | 94/331 [04:33<10:50, 2.74s/it] 29%|███████████████████████▌ | 95/331 [04:36<11:00, 2.80s/it] 29%|███████████████████████▊ | 96/331 [04:39<11:03, 2.82s/it] 29%|████████████████████████ | 97/331 [04:41<10:40, 2.74s/it] 30%|████████████████████████▎ | 98/331 [04:44<11:02, 2.84s/it] 30%|████████████████████████▌ | 99/331 [04:47<10:52, 2.81s/it] 30%|████████████████████████▍ | 100/331 [04:49<10:28, 2.72s/it]
31%|████████████████████████▋ | 101/331 [04:52<10:23, 2.71s/it] 31%|████████████████████████▉ | 102/331 [04:56<11:13, 2.94s/it] 31%|█████████████████████████▏ | 103/331 [04:58<10:43, 2.82s/it] 31%|█████████████████████████▍ | 104/331 [05:01<10:41, 2.82s/it] 32%|█████████████████████████▋ | 105/331 [05:04<10:40, 2.83s/it] 32%|█████████████████████████▉ | 106/331 [05:07<10:38, 2.84s/it] 32%|██████████████████████████▏ | 107/331 [05:09<10:00, 2.68s/it] 33%|██████████████████████████▍ | 108/331 [05:11<09:45, 2.62s/it] 33%|██████████████████████████▋ | 109/331 [05:14<09:39, 2.61s/it]
33%|██████████████████████████▉ | 110/331 [05:17<10:02, 2.73s/it] 34%|███████████████████████████▏ | 111/331 [05:20<10:08, 2.77s/it] 34%|███████████████████████████▍ | 112/331 [05:23<10:08, 2.78s/it] 34%|███████████████████████████▋ | 113/331 [05:25<09:40, 2.66s/it] 34%|███████████████████████████▉ | 114/331 [05:28<09:49, 2.72s/it] 35%|████████████████████████████▏ | 115/331 [05:31<09:51, 2.74s/it] 35%|████████████████████████████▍ | 116/331 [05:34<10:12, 2.85s/it] 35%|████████████████████████████▋ | 117/331 [05:37<10:05, 2.83s/it] 36%|████████████████████████████▉ | 118/331 [05:39<09:45, 2.75s/it]
36%|█████████████████████████████ | 119/331 [05:42<09:43, 2.75s/it] 36%|█████████████████████████████▎ | 120/331 [05:45<09:41, 2.75s/it] 37%|█████████████████████████████▌ | 121/331 [05:48<10:13, 2.92s/it] 37%|█████████████████████████████▊ | 122/331 [05:51<10:02, 2.88s/it] 37%|██████████████████████████████ | 123/331 [05:54<10:41, 3.08s/it] 37%|██████████████████████████████▎ | 124/331 [05:57<10:35, 3.07s/it] 38%|██████████████████████████████▌ | 125/331 [06:01<11:05, 3.23s/it] 38%|██████████████████████████████▊ | 126/331 [06:04<11:06, 3.25s/it] 38%|███████████████████████████████ | 127/331 [06:08<11:26, 3.37s/it]
39%|███████████████████████████████▎ | 128/331 [06:11<11:23, 3.37s/it] 39%|███████████████████████████████▌ | 129/331 [06:15<11:09, 3.32s/it] 39%|███████████████████████████████▊ | 130/331 [06:18<11:13, 3.35s/it] 40%|████████████████████████████████ | 131/331 [06:22<11:23, 3.42s/it] 40%|████████████████████████████████▎ | 132/331 [06:24<10:46, 3.25s/it] 40%|████████████████████████████████▌ | 133/331 [06:27<10:08, 3.07s/it] 40%|████████████████████████████████▊ | 134/331 [06:30<09:52, 3.01s/it] 41%|█████████████████████████████████ | 135/331 [06:33<10:00, 3.07s/it] 41%|█████████████████████████████████▎
| 136/331 [06:37<10:18, 3.17s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 41%|█████████████████████████████████▌ | 137/331 [06:40<10:35, 3.28s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|█████████████████████████████████▊ | 138/331 [06:44<10:44, 3.34s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████ | 139/331 [06:46<09:35, 3.00s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 42%|██████████████████████████████████▎ | 140/331 [06:50<10:17, 3.23s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▌ | 141/331 [06:52<09:48, 3.10s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▋ | 142/331 [06:55<09:26, 3.00s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 43%|██████████████████████████████████▉ | 143/331 [06:58<09:47, 3.12s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▏ | 144/331 [07:01<09:23, 3.01s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 44%|███████████████████████████████████▍ | 145/331 [07:04<09:16, 2.99s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▋ | 146/331 [07:08<09:44, 3.16s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|███████████████████████████████████▉ | 147/331 [07:11<09:28, 3.09s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▏ | 148/331 [07:13<08:51, 2.90s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▍ | 149/331 [07:16<08:25, 2.78s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▋ | 150/331 [07:19<08:50, 2.93s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|████████████████████████████████████▉ | 151/331 [07:22<08:38, 2.88s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▏ | 152/331 [07:24<08:10, 2.74s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▍ | 153/331 
[07:27<08:07, 2.74s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▋ | 154/331 [07:30<08:31, 2.89s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▉ | 155/331 [07:33<08:55, 3.04s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 156/331 [07:37<09:08, 3.14s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▍ | 157/331 [07:40<09:30, 3.28s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▋ | 158/331 [07:44<09:32, 3.31s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▉ | 159/331 [07:47<09:34, 3.34s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████▏ | 160/331 [07:50<09:04, 3.18s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▍ | 161/331 [07:53<08:51, 3.12s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▋ | 162/331 [07:57<09:11, 3.26s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▉ | 163/331 [08:00<09:18, 3.33s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▏ | 164/331 [08:03<08:46, 3.15s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▍ | 165/331 [08:06<08:34, 3.10s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▌ | 166/331 [08:09<08:20, 3.03s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▊ | 167/331 [08:12<08:33, 3.13s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████ | 168/331 [08:15<08:03, 2.97s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████▎ | 169/331 [08:18<08:07, 3.01s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
51%|█████████████████████████████████████████▌ | 170/331 [08:20<07:44, 2.88s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|█████████████████████████████████████████▊ | 171/331 [08:23<07:40, 2.88s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████ | 172/331 [08:26<07:19, 2.76s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████▎ | 173/331 [08:29<07:29, 2.85s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▌ | 174/331 [08:31<07:09, 2.73s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▊ | 175/331 [08:34<07:17, 2.80s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████ | 176/331 [08:37<07:04, 2.74s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 177/331 [08:40<07:26, 2.90s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 178/331 [08:44<07:51, 
3.08s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▊ | 179/331 [08:47<08:10, 3.23s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 180/331 [08:50<08:03, 3.20s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▎ | 181/331 [08:53<08:00, 3.20s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 182/331 [08:56<07:21, 2.96s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 183/331 [08:58<06:51, 2.78s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 184/331 [09:01<06:28, 2.64s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 185/331 [09:03<06:04, 2.49s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 186/331 [09:05<06:13, 2.58s/it]g-point operations will not be computed-28 
10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▊ | 187/331 [09:09<06:40, 2.78s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████ | 188/331 [09:11<06:35, 2.77s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▎ | 189/331 [09:14<06:14, 2.64s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▍ | 190/331 [09:16<06:02, 2.57s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 191/331 [09:19<05:56, 2.55s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▉ | 192/331 [09:21<05:49, 2.52s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|███████████████████████████████████████████████▏ | 193/331 [09:24<06:20, 2.76s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▍ | 194/331 [09:27<06:02, 2.65s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 195/331 [09:29<05:53, 2.60s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 196/331 [09:32<05:57, 2.65s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▏ | 197/331 [09:35<06:10, 2.77s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▍ | 198/331 [09:37<05:51, 2.64s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▋ | 199/331 [09:40<05:53, 2.68s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▉ | 200/331 [09:42<05:33, 2.55s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▏ | 201/331 [09:45<05:27, 2.52s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▍ | 202/331 [09:48<05:37, 2.62s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▋ | 203/331 [09:50<05:39, 2.65s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|█████████████████████████████████████████████████▉ | 204/331 [09:54<06:03, 2.86s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▏ | 205/331 [09:57<06:03, 2.89s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 206/331 [10:00<05:56, 2.85s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▋ | 207/331 [10:03<06:11, 2.99s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▉ | 208/331 [10:06<06:13, 3.04s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▏ | 209/331 [10:08<05:43, 2.81s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▍ | 210/331 [10:11<05:21, 2.66s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 64%|███████████████████████████████████████████████████▋ | 211/331 [10:13<05:23, 2.70s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▉ | 212/331 [10:16<05:11, 2.62s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 213/331 [10:18<05:09, 2.63s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 214/331 [10:21<04:53, 2.51s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 215/331 [10:23<04:42, 2.44s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 216/331 [10:26<05:13, 2.72s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████ | 217/331 [10:29<05:11, 2.74s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 218/331 [10:32<05:26, 2.89s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▌ | 219/331 [10:35<05:18, 2.84s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 220/331 [10:38<05:07, 2.77s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 221/331 [10:41<05:10, 2.82s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▎ | 222/331 [10:43<04:56, 2.72s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 223/331 [10:46<04:59, 2.77s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 224/331 [10:49<04:58, 2.79s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████ | 225/331 [10:52<04:57, 2.80s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▎ | 226/331 [10:55<05:09, 2.95s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ | 227/331 [10:58<05:00, 2.89s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▊ | 228/331 [11:00<04:49, 2.81s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████ | 229/331 [11:03<04:43, 2.78s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████▎ | 230/331 [11:06<04:34, 2.72s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▌ | 231/331 [11:09<04:43, 2.83s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▊ | 232/331 [11:12<04:39, 2.82s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|█████████████████████████████████████████████████████████ | 233/331 [11:15<04:46, 2.93s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▎ | 234/331 [11:17<04:30, 2.79s/it]g-point operations will not be computed-28 
10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▌ | 235/331 [11:20<04:20, 2.71s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 236/331 [11:23<04:44, 3.00s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 237/331 [11:27<04:57, 3.17s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 238/331 [11:30<04:55, 3.17s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 239/331 [11:33<04:54, 3.21s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▋ | 240/331 [11:37<04:57, 3.27s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▉ | 241/331 [11:40<05:03, 3.37s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 242/331 
[11:44<04:59, 3.36s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▍ | 243/331 [11:47<04:54, 3.35s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 244/331 [11:51<04:57, 3.42s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▉ | 245/331 [11:54<04:43, 3.30s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 246/331 [11:58<04:54, 3.47s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▍ | 247/331 [12:01<04:45, 3.40s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 248/331 [12:03<04:23, 3.17s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 249/331 [12:06<04:02, 2.96s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
76%|█████████████████████████████████████████████████████████████▏ | 250/331 [12:09<03:51, 2.85s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▍ | 251/331 [12:12<03:51, 2.89s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▋ | 252/331 [12:14<03:37, 2.76s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▉ | 253/331 [12:17<03:45, 2.89s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|██████████████████████████████████████████████████████████████▏ | 254/331 [12:20<03:39, 2.85s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|██████████████████████████████████████████████████████████████▍ | 255/331 [12:23<03:45, 2.96s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|██████████████████████████████████████████████████████████████▋ | 256/331 [12:26<03:37, 2.90s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|██████████████████████████████████████████████████████████████▉ | 257/331 [12:29<03:43, 3.02s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed
78%|███████████████████████████████████████████████████████████████▏ | 258/331 [12:32<03:27, 2.84s/it]
78%|███████████████████████████████████████████████████████████████▍ | 259/331 [12:34<03:22, 2.81s/it]
79%|███████████████████████████████████████████████████████████████▋ | 260/331 [12:37<03:23, 2.86s/it]
79%|███████████████████████████████████████████████████████████████▊ | 261/331 [12:40<03:09, 2.70s/it]
79%|████████████████████████████████████████████████████████████████ | 262/331 [12:42<03:07, 2.72s/it]
79%|████████████████████████████████████████████████████████████████▎ | 263/331 [12:46<03:16, 2.89s/it]
80%|████████████████████████████████████████████████████████████████▌ | 264/331 [12:48<03:08, 2.81s/it]
80%|████████████████████████████████████████████████████████████████▊ | 265/331 [12:51<03:00, 2.74s/it]
80%|█████████████████████████████████████████████████████████████████ | 266/331 [12:54<02:54, 2.69s/it]
81%|█████████████████████████████████████████████████████████████████▎ | 267/331 [12:57<03:05, 2.89s/it]
81%|█████████████████████████████████████████████████████████████████▌ | 268/331 [13:00<03:02, 2.90s/it]
81%|█████████████████████████████████████████████████████████████████▊ | 269/331 [13:03<03:11, 3.08s/it]
82%|██████████████████████████████████████████████████████████████████ | 270/331 [13:06<03:07, 3.07s/it]
82%|██████████████████████████████████████████████████████████████████▎ | 271/331 [13:10<03:09, 3.16s/it]
82%|██████████████████████████████████████████████████████████████████▌ | 272/331 [13:13<03:00, 3.06s/it]
82%|██████████████████████████████████████████████████████████████████▊ | 273/331 [13:16<02:57, 3.07s/it]
83%|███████████████████████████████████████████████████████████████████ | 274/331 [13:19<03:03, 3.22s/it]
83%|███████████████████████████████████████████████████████████████████▎ | 275/331 [13:23<03:03, 3.27s/it]
83%|███████████████████████████████████████████████████████████████████▌ | 276/331 [13:25<02:49, 3.08s/it]
84%|███████████████████████████████████████████████████████████████████▊ | 277/331 [13:28<02:41, 2.99s/it]
84%|████████████████████████████████████████████████████████████████████ | 278/331 [13:31<02:36, 2.95s/it]
84%|████████████████████████████████████████████████████████████████████▎ | 279/331 [13:35<02:45, 3.19s/it]
85%|████████████████████████████████████████████████████████████████████▌ | 280/331 [13:38<02:39, 3.13s/it]
85%|████████████████████████████████████████████████████████████████████▊ | 281/331 [13:41<02:41, 3.22s/it]
85%|█████████████████████████████████████████████████████████████████████ | 282/331 [13:44<02:38, 3.23s/it]
85%|█████████████████████████████████████████████████████████████████████▎ | 283/331 [13:48<02:38, 3.31s/it]
86%|█████████████████████████████████████████████████████████████████████▍ | 284/331 [13:51<02:39, 3.39s/it]
86%|█████████████████████████████████████████████████████████████████████▋ | 285/331 [13:55<02:38, 3.44s/it]
86%|█████████████████████████████████████████████████████████████████████▉ | 286/331 [13:58<02:36, 3.47s/it]
87%|██████████████████████████████████████████████████████████████████████▏ | 287/331 [14:02<02:37, 3.59s/it]
87%|██████████████████████████████████████████████████████████████████████▍ | 288/331 [14:06<02:32, 3.54s/it]
87%|██████████████████████████████████████████████████████████████████████▋ | 289/331 [14:09<02:19, 3.31s/it]
88%|██████████████████████████████████████████████████████████████████████▉ | 290/331 [14:11<02:07, 3.11s/it]
88%|███████████████████████████████████████████████████████████████████████▏ | 291/331 [14:14<01:58, 2.96s/it]
88%|███████████████████████████████████████████████████████████████████████▍ | 292/331 [14:16<01:52, 2.89s/it]
89%|███████████████████████████████████████████████████████████████████████▋ | 293/331 [14:19<01:49, 2.88s/it]
89%|███████████████████████████████████████████████████████████████████████▉ | 294/331 [14:22<01:42, 2.78s/it]
89%|████████████████████████████████████████████████████████████████████████▏ | 295/331 [14:24<01:37, 2.71s/it]
89%|████████████████████████████████████████████████████████████████████████▍ | 296/331 [14:27<01:32, 2.64s/it]
90%|████████████████████████████████████████████████████████████████████████▋ | 297/331 [14:31<01:40, 2.95s/it]
90%|████████████████████████████████████████████████████████████████████████▉ | 298/331 [14:34<01:45, 3.19s/it]
90%|█████████████████████████████████████████████████████████████████████████▏ | 299/331 [14:37<01:38, 3.07s/it]
91%|█████████████████████████████████████████████████████████████████████████▍ | 300/331 [14:40<01:33, 3.02s/it]
91%|█████████████████████████████████████████████████████████████████████████▋ | 301/331 [14:43<01:28, 2.94s/it]
91%|█████████████████████████████████████████████████████████████████████████▉ | 302/331 [14:46<01:23, 2.89s/it]
92%|██████████████████████████████████████████████████████████████████████████▏ | 303/331 [14:48<01:18, 2.80s/it]
92%|██████████████████████████████████████████████████████████████████████████▍ | 304/331 [14:51<01:18, 2.91s/it]
92%|██████████████████████████████████████████████████████████████████████████▋ | 305/331 [14:55<01:18, 3.01s/it]
92%|██████████████████████████████████████████████████████████████████████████▉ | 306/331 [14:58<01:20, 3.20s/it]
93%|███████████████████████████████████████████████████████████████████████████▏ | 307/331 [15:02<01:20, 3.36s/it]
93%|███████████████████████████████████████████████████████████████████████████▎ | 308/331 [15:06<01:21, 3.53s/it]
93%|███████████████████████████████████████████████████████████████████████████▌ | 309/331 [15:09<01:18, 3.55s/it]
94%|███████████████████████████████████████████████████████████████████████████▊ | 310/331 [15:12<01:09, 3.30s/it]
94%|████████████████████████████████████████████████████████████████████████████ | 311/331 [15:15<01:05, 3.29s/it]
94%|████████████████████████████████████████████████████████████████████████████▎ | 312/331 [15:18<00:58, 3.10s/it]
95%|████████████████████████████████████████████████████████████████████████████▌ | 313/331 [15:21<00:54, 3.03s/it]
95%|████████████████████████████████████████████████████████████████████████████▊ | 314/331 [15:24<00:52, 3.06s/it]
95%|█████████████████████████████████████████████████████████████████████████████ | 315/331 [15:27<00:50, 3.15s/it]
95%|█████████████████████████████████████████████████████████████████████████████▎ | 316/331 [15:31<00:47, 3.16s/it]
96%|█████████████████████████████████████████████████████████████████████████████▌ | 317/331 [15:34<00:46, 3.31s/it]
96%|█████████████████████████████████████████████████████████████████████████████▊ | 318/331 [15:37<00:40, 3.14s/it]
96%|██████████████████████████████████████████████████████████████████████████████ | 319/331 [15:40<00:36, 3.03s/it]
97%|██████████████████████████████████████████████████████████████████████████████▎ | 320/331 [15:43<00:33, 3.04s/it]
97%|██████████████████████████████████████████████████████████████████████████████▌ | 321/331 [15:46<00:30, 3.01s/it]
97%|██████████████████████████████████████████████████████████████████████████████▊ | 322/331 [15:49<00:28, 3.14s/it]
98%|███████████████████████████████████████████████████████████████████████████████ | 323/331 [15:52<00:24, 3.05s/it]
98%|███████████████████████████████████████████████████████████████████████████████▎ | 324/331 [15:56<00:22, 3.16s/it]
98%|███████████████████████████████████████████████████████████████████████████████▌ | 325/331 [15:59<00:19, 3.18s/it]
98%|███████████████████████████████████████████████████████████████████████████████▊ | 326/331 [16:02<00:16, 3.23s/it]
99%|████████████████████████████████████████████████████████████████████████████████ | 327/331 [16:05<00:12, 3.24s/it]
99%|████████████████████████████████████████████████████████████████████████████████▎| 328/331 [16:09<00:09, 3.28s/it]
99%|████████████████████████████████████████████████████████████████████████████████▌| 329/331 [16:12<00:06, 3.21s/it]
100%|████████████████████████████████████████████████████████████████████████████████▊| 330/331 [16:15<00:03, 3.36s/it]
100%|████████████████████████████████████████████████████████████████████████████████▊| 330/331 [16:15<00:03, 3.36s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 02/28/2022 11:01:01 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow [INFO|configuration_utils.py:438] 2022-02-28 11:01:01,704 >> Configuration saved in ./checkpoint-1000/config.json g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|feature_extraction_utils.py:324] 2022-02-28 11:01:06,806 >> Configuration saved in ./checkpoint-1000/preprocessor_config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|feature_extraction_utils.py:324] 2022-02-28 11:01:06,806 >> Configuration saved in ./checkpoint-1000/preprocessor_config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|feature_extraction_utils.py:324] 2022-02-28 11:01:06,806 >> Configuration saved in ./checkpoint-1000/preprocessor_config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [INFO|feature_extraction_utils.py:324] 2022-02-28 11:01:06,806 >> Configuration saved in ./checkpoint-1000/preprocessor_config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|████████████████████████████████████████▉ | 1001/1784 [1:26:07<72:52:35, 335.06s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 56%|████████████████████████████████████████▉ | 1001/1784 [1:26:07<72:52:35, 335.06s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3725, 'learning_rate': 6.129283489096573e-06, 'epoch': 0.56} 56%|████████████████████████████████████████▉ | 1001/1784 [1:26:07<72:52:35, 335.06s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1002/1784 [1:26:10<51:11:49, 235.69s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1002/1784 [1:26:10<51:11:49, 235.69s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1002/1784 [1:26:10<51:11:49, 235.69s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1003/1784 [1:26:14<36:02:35, 166.14s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1003/1784 [1:26:14<36:02:35, 166.14s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1003/1784 [1:26:14<36:02:35, 166.14s/it]config.jsonrations will 
not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████ | 1004/1784 [1:26:18<25:26:54, 117.45s/it]config.jsonrations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:03:29,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:03:29,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.184, 'learning_rate': 6.0981308411214955e-06, 'epoch': 0.56} 56%|█████████████████████████████████████████▋ | 1006/1784 [1:26:26<12:51:21, 59.49s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████▋ | 1006/1784 [1:26:26<12:51:21, 59.49s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2282, 'learning_rate': 6.090342679127726e-06, 'epoch': 0.56} 56%|██████████████████████████████████████████▎ | 1007/1784 [1:26:29<9:13:33, 42.75s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|██████████████████████████████████████████▎ | 1007/1784 [1:26:29<9:13:33, 42.75s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0185, 'learning_rate': 6.0825545171339575e-06, 'epoch': 0.56} 57%|██████████████████████████████████████████▍ | 1008/1784 [1:26:33<6:41:28, 31.04s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▍ | 1008/1784 [1:26:33<6:41:28, 31.04s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2411, 'learning_rate': 6.074766355140187e-06, 'epoch': 0.57} 57%|██████████████████████████████████████████▍ | 1008/1784 [1:26:33<6:41:28, 31.04s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▍ | 1009/1784 [1:26:37<4:54:44, 22.82s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▍ | 1009/1784 [1:26:37<4:54:44, 22.82s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▍ | 1009/1784 [1:26:37<4:54:44, 22.82s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▍ | 1010/1784 [1:26:40<3:40:23, 17.08s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-02-28 11:03:51,417 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:03:51,417 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2039, 'learning_rate': 6.05140186915888e-06, 'epoch': 0.57} 57%|██████████████████████████████████████████▌ | 1012/1784 [1:26:48<2:11:10, 10.20s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▌ | 1012/1784 [1:26:48<2:11:10, 10.20s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1816, 'learning_rate': 6.0436137071651095e-06, 'epoch': 0.57} 57%|██████████████████████████████████████████▌ | 1012/1784 [1:26:48<2:11:10, 10.20s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▌ | 1013/1784 [1:26:51<1:45:23, 8.20s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:02,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:02,103 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2413, 'learning_rate': 6.028037383177571e-06, 'epoch': 0.57} 57%|██████████████████████████████████████████▋ | 1015/1784 [1:26:58<1:14:40, 5.83s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▋ | 1015/1784 [1:26:58<1:14:40, 5.83s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4647, 'learning_rate': 6.020249221183801e-06, 'epoch': 0.57} 57%|██████████████████████████████████████████▋ | 1016/1784 [1:27:02<1:06:00, 5.16s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████▋ | 1016/1784 [1:27:02<1:06:00, 5.16s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1725, 'learning_rate': 6.012461059190031e-06, 'epoch': 0.57} 57%|██████████████████████████████████████████▋ | 1016/1784 [1:27:02<1:06:00, 5.16s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|███████████████████████████████████████████▉ | 1017/1784 [1:27:05<59:29, 4.65s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:16,138 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:16,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2147, 'learning_rate': 5.996884735202493e-06, 'epoch': 0.57} 57%|███████████████████████████████████████████▉ | 1019/1784 [1:27:12<52:18, 4.10s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|███████████████████████████████████████████▉ | 1019/1784 [1:27:12<52:18, 4.10s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.093, 'learning_rate': 5.9890965732087235e-06, 'epoch': 0.57} 57%|████████████████████████████████████████████ | 1020/1784 [1:27:16<49:54, 3.92s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|████████████████████████████████████████████ | 1020/1784 [1:27:16<49:54, 3.92s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:26,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:26,720 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1716, 'learning_rate': 5.973520249221184e-06, 'epoch': 0.57} 57%|████████████████████████████████████████████ | 1022/1784 [1:27:23<46:31, 3.66s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|████████████████████████████████████████████ | 1022/1784 [1:27:23<46:31, 3.66s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2233, 'learning_rate': 5.965732087227415e-06, 'epoch': 0.57} 57%|████████████████████████████████████████████▏ | 1023/1784 [1:27:26<46:02, 3.63s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|████████████████████████████████████████████▏ | 1023/1784 [1:27:26<46:02, 3.63s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1742, 'learning_rate': 5.957943925233646e-06, 'epoch': 0.57} 57%|████████████████████████████████████████████▏ | 1023/1784 [1:27:26<46:02, 3.63s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|████████████████████████████████████████████▏ | 1024/1784 [1:27:30<45:36, 3.60s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:40,752 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:40,752 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0917, 'learning_rate': 5.942367601246106e-06, 'epoch': 0.57} 58%|████████████████████████████████████████████▎ | 1026/1784 [1:27:37<44:43, 3.54s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|████████████████████████████████████████████▎ | 1026/1784 [1:27:37<44:43, 3.54s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8496, 'learning_rate': 5.9345794392523374e-06, 'epoch': 0.58} 58%|████████████████████████████████████████████▎ | 1026/1784 [1:27:37<44:43, 3.54s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|████████████████████████████████████████████▎ | 1027/1784 [1:27:40<43:33, 3.45s/it]g-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:50,700 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 10:44:19,802 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:04:50,700 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed {'loss': 4.1411, 'learning_rate': 5.919003115264798e-06, 'epoch': 0.58} 58%|████████████████████████████████████████████▍ | 1029/1784 [1:27:47<42:18, 3.36s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:04:57,215 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2969, 'learning_rate': 5.903426791277259e-06, 'epoch': 0.58} 58%|████████████████████████████████████████████▍ | 1031/1784 [1:27:53<41:13, 3.28s/it] {'loss': 4.1628, 'learning_rate': 5.89563862928349e-06, 'epoch': 0.58}
58%|████████████████████████████████████████████▌ | 1032/1784 [1:27:56<40:44, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:05:05,219 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|████████████████████████████████████████████▌ | 1033/1784 [1:27:59<40:09, 3.21s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:05:09,740 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9411, 'learning_rate': 5.872274143302181e-06, 'epoch': 0.58} 58%|████████████████████████████████████████████▋ | 1035/1784 [1:28:05<39:07, 3.13s/it] {'loss': 4.133, 'learning_rate': 5.864485981308412e-06, 'epoch': 0.58}
58%|████████████████████████████████████████████▋ | 1036/1784 [1:28:08<38:28, 3.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:05:17,331 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|████████████████████████████████████████████▊ | 1037/1784 [1:28:11<37:51, 3.04s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:05:21,651 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2407, 'learning_rate': 5.841121495327103e-06, 'epoch': 0.58} 58%|████████████████████████████████████████████▊ |
1039/1784 [1:28:17<36:07, 2.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:05:25,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|████████████████████████████████████████████▉ | 1040/1784 [1:28:20<34:40, 2.80s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:05:29,283 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:31,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:33,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:05:35,526 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:37,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:38,824 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2953, 'learning_rate': 5.778816199376948e-06, 'epoch': 0.59}
[WARNING|modeling_utils.py:388] 2022-02-28 11:05:41,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:42,914 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:44,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 5.1358, 'learning_rate': 5.747663551401869e-06, 'epoch': 0.59} [WARNING|modeling_utils.py:388] 2022-02-28 11:05:48,664 >> Could not estimate the number of tokens of the input,
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:05:52,531 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|█████████████████████████████████████████████▍ | 1053/1784 [1:28:49<37:41, 3.09s/it] {'loss': 4.062, 'learning_rate': 5.724299065420561e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▍ | 1054/1784 [1:28:53<40:19, 3.31s/it]
{'loss': 3.9202, 'learning_rate': 5.716510903426792e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▌ | 1055/1784 [1:28:57<41:55, 3.45s/it] {'loss': 4.0609, 'learning_rate': 5.708722741433023e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▌ | 1056/1784 [1:29:00<42:53, 3.53s/it] {'loss': 4.1416, 'learning_rate': 5.700934579439253e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▌ | 1057/1784 [1:29:04<43:12, 3.57s/it]
{'loss': 4.3627, 'learning_rate': 5.693146417445483e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▋ | 1058/1784 [1:29:08<43:05, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:06:16,722 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|█████████████████████████████████████████████▋ | 1059/1784 [1:29:11<43:19, 3.59s/it] {'loss': 4.1857, 'learning_rate': 5.6775700934579444e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▊ | 1060/1784 [1:29:15<43:27, 3.60s/it] {'loss': 4.0396, 'learning_rate': 5.669781931464174e-06, 'epoch': 0.59} 59%|█████████████████████████████████████████████▊ | 1061/1784 [1:29:18<43:13, 3.59s/it]
{'loss': 4.0697, 'learning_rate': 5.661993769470406e-06, 'epoch': 0.59} 60%|█████████████████████████████████████████████▊ | 1062/1784 [1:29:22<42:56, 3.57s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:06:32,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3237, 'learning_rate': 5.646417445482867e-06, 'epoch': 0.6} 60%|█████████████████████████████████████████████▉ | 1064/1784 [1:29:29<42:18, 3.53s/it] {'loss': 3.8274, 'learning_rate':
5.6386292834890964e-06, 'epoch': 0.6} 60%|█████████████████████████████████████████████▉ | 1065/1784 [1:29:32<42:05, 3.51s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:06:43,119 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8641, 'learning_rate': 5.623052959501558e-06, 'epoch': 0.6} 60%|██████████████████████████████████████████████ | 1067/1784 [1:29:39<41:37, 3.48s/it] {'loss': 4.2398, 'learning_rate': 5.615264797507789e-06, 'epoch': 0.6} 60%|██████████████████████████████████████████████ | 1068/1784 [1:29:43<41:32, 3.48s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:06:53,424 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.355, 'learning_rate': 5.599688473520249e-06, 'epoch': 0.6} 60%|██████████████████████████████████████████████▏ | 1070/1784 [1:29:49<40:47, 3.43s/it] {'loss': 3.9519, 'learning_rate': 5.591900311526481e-06, 'epoch': 0.6} 60%|██████████████████████████████████████████████▏ | 1071/1784 [1:29:53<40:30, 3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:07:01,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|██████████████████████████████████████████████▎ |
1072/1784 [1:29:56<40:14, 3.39s/it] {'loss': 3.9147, 'learning_rate': 5.576323987538941e-06, 'epoch': 0.6} 60%|██████████████████████████████████████████████▎ | 1073/1784 [1:30:00<39:56, 3.37s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:07:10,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|██████████████████████████████████████████████▍ | 1075/1784 [1:30:06<39:22, 3.33s/it]
{'loss': 4.1326, 'learning_rate': 5.5529595015576335e-06, 'epoch': 0.6} 60%|██████████████████████████████████████████████▍ | 1076/1784 [1:30:09<38:47, 3.29s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:07:19,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
60%|██████████████████████████████████████████████▌ | 1078/1784 [1:30:16<38:22, 3.26s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:07:26,304 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1318, 'learning_rate': 5.521806853582555e-06, 'epoch': 0.6} 61%|██████████████████████████████████████████████▌ | 1080/1784 [1:30:22<37:38, 3.21s/it] 61%|██████████████████████████████████████████████▋ | 1081/1784 [1:30:25<37:16,
3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:07:34,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|██████████████████████████████████████████████▋ | 1082/1784 [1:30:28<37:02, 3.17s/it] 61%|██████████████████████████████████████████████▋ | 1083/1784 [1:30:31<36:31, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:07:40,215 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|██████████████████████████████████████████████▊ | 1084/1784 [1:30:34<35:49, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:07:43,163 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|██████████████████████████████████████████████▊ | 1085/1784 [1:30:37<35:13,
3.02s/it] {'loss': 4.0747, 'learning_rate': 5.4750778816199375e-06, 'epoch': 0.61} 61%|██████████████████████████████████████████████▊ | 1086/1784 [1:30:40<34:48, 2.99s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:07:50,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
61%|██████████████████████████████████████████████▉ | 1088/1784 [1:30:46<33:57, 2.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:07:54,610 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|███████████████████████████████████████████████ | 1089/1784 [1:30:49<33:01, 2.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|███████████████████████████████████████████████ | 1090/1784 [1:30:51<31:59, 2.77s/it] [WARNING|modeling_utils.py:388] 2022-02-28 11:08:00,872 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:03,117 >> Could not estimate the number of tokens of the input, floating-point operations will not
be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:03,117 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:05,222 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:05,222 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:07,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:07,162 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2861, 'learning_rate': 5.404984423676013e-06, 'epoch': 0.61} [WARNING|modeling_utils.py:388] 2022-02-28 11:08:08,948 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:08,948 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:10,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:10,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:13,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:13,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.9714, 'learning_rate': 5.373831775700935e-06, 'epoch': 0.62} [WARNING|modeling_utils.py:388] 2022-02-28 11:08:14,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:14,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:16,644 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:16,644 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:16,644 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:20,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:20,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:20,631 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▌ | 1102/1784 [1:31:17<31:44, 2.79s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▌ | 1102/1784 [1:31:17<31:44, 2.79s/it]g-point operations will not be computed-28 11:07:57,231 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▌ | 1102/1784 [1:31:17<31:44, 2.79s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▌ | 1103/1784 [1:31:21<35:02, 3.09s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:31,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:31,877 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9857, 'learning_rate': 5.3271028037383174e-06, 'epoch': 0.62} 62%|███████████████████████████████████████████████▋ | 1105/1784 [1:31:28<38:27, 3.40s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▋ | 1105/1784 [1:31:28<38:27, 3.40s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9589, 'learning_rate': 5.319314641744549e-06, 'epoch': 0.62} 62%|███████████████████████████████████████████████▋ | 1105/1784 [1:31:28<38:27, 3.40s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▋ | 1106/1784 [1:31:32<39:19, 3.48s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▋ | 1106/1784 [1:31:32<39:19, 3.48s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▋ | 1106/1784 [1:31:32<39:19, 3.48s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▊ | 1107/1784 [1:31:36<39:48, 3.53s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▊ | 1107/1784 [1:31:36<39:48, 3.53s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▊ | 1107/1784 [1:31:36<39:48, 3.53s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▊ | 1108/1784 [1:31:39<40:03, 3.56s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:08:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2659, 'learning_rate': 5.28816199376947e-06, 'epoch': 0.62} [WARNING|modeling_utils.py:388] 2022-02-28 11:08:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▉ | 1110/1784 [1:31:46<40:01, 3.56s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▉ | 1110/1784 [1:31:46<40:01, 3.56s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▉ | 1110/1784 [1:31:46<40:01, 3.56s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▉ | 1111/1784 [1:31:50<39:49, 3.55s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▉ | 1111/1784 [1:31:50<39:49, 3.55s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
62%|███████████████████████████████████████████████▉ | 1111/1784 [1:31:50<39:49, 3.55s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|███████████████████████████████████████████████▉ | 1112/1784 [1:31:53<39:39, 3.54s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:04,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:04,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8206, 'learning_rate': 5.2570093457943925e-06, 'epoch': 0.62} [WARNING|modeling_utils.py:388] 2022-02-28 11:09:04,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|████████████████████████████████████████████████ | 1114/1784 [1:32:01<39:40, 3.55s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|████████████████████████████████████████████████ | 1114/1784 [1:32:01<39:40, 3.55s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|████████████████████████████████████████████████ | 1114/1784 [1:32:01<39:40, 
3.55s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|████████████████████████████████████████████████▏ | 1115/1784 [1:32:04<39:25, 3.54s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:14,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:14,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2154, 'learning_rate': 5.233644859813084e-06, 'epoch': 0.63} [WARNING|modeling_utils.py:388] 2022-02-28 11:09:14,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▏ | 1117/1784 [1:32:11<38:34, 3.47s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▏ | 1117/1784 [1:32:11<38:34, 3.47s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▏ | 1117/1784 [1:32:11<38:34, 3.47s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▎ | 1118/1784 [1:32:14<38:00, 3.42s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:24,873 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:24,873 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1081, 'learning_rate': 5.210280373831777e-06, 'epoch': 0.63} 63%|████████████████████████████████████████████████▎ | 1120/1784 [1:32:21<37:45, 3.41s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▎ | 1120/1784 [1:32:21<37:45, 3.41s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0098, 'learning_rate': 5.2024922118380065e-06, 'epoch': 0.63} 63%|████████████████████████████████████████████████▎ | 1120/1784 [1:32:21<37:45, 3.41s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▍ | 1121/1784 [1:32:24<37:37, 3.41s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:34,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:34,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1207, 'learning_rate': 5.186915887850468e-06, 'epoch': 0.63} [WARNING|modeling_utils.py:388] 2022-02-28 11:09:34,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▍ | 1123/1784 [1:32:31<36:52, 3.35s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▍ | 1123/1784 [1:32:31<36:52, 3.35s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▍ | 1123/1784 [1:32:31<36:52, 3.35s/it]g-point operations will not be computed-28 11:07:57,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▌ | 1124/1784 [1:32:34<36:38, 3.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
63%|████████████████████████████████████████████████▌ | 1124/1784 [1:32:34<36:38, 3.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▌ | 1125/1784 [1:32:37<36:19, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▌ | 1125/1784 [1:32:37<36:19, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▌ | 1125/1784 [1:32:37<36:19, 3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▌ | 1126/1784 [1:32:41<35:58, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:51,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:51,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0456, 'learning_rate': 5.1479750778816205e-06, 'epoch': 0.63} [WARNING|modeling_utils.py:388] 2022-02-28 11:09:51,236 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▋ | 1128/1784 [1:32:47<35:29, 3.25s/it]g-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:57,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:09:57,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8169, 'learning_rate': 5.132398753894081e-06, 'epoch': 0.63} [WARNING|modeling_utils.py:388] 2022-02-28 11:09:57,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1130/1784 [1:32:53<35:00, 3.21s/it]g-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1130/1784 [1:32:53<35:00, 3.21s/it]g-point operations will not be computed-28 11:09:43,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1130/1784 [1:32:53<35:00, 3.21s/it]g-point operations will not be computed-28 11:09:43,250 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1131/1784 [1:32:57<34:36, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:05,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1131/1784 [1:32:57<34:36, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:05,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1132/1784 [1:33:00<34:11, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:05,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1132/1784 [1:33:00<34:11, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:05,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|████████████████████████████████████████████████▊ | 1132/1784 [1:33:00<34:11, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:05,493 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1133/1784 [1:33:03<33:54, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:11,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1133/1784 [1:33:03<33:54, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:11,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1134/1784 [1:33:06<33:38, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:11,645 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1134/1784 [1:33:06<33:38, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:11,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1134/1784 [1:33:06<33:38, 3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:11,645 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1135/1784 [1:33:09<33:10, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████▉ | 1135/1784 [1:33:09<33:10, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|█████████████████████████████████████████████████ | 1136/1784 [1:33:12<32:32, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:10:21,868 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:10:21,868 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1847, 'learning_rate': 5.070093457943925e-06, 'epoch': 0.64} 
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:21,868 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|█████████████████████████████████████████████████ | 1138/1784 [1:33:17<31:35, 2.93s/it]g-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|█████████████████████████████████████████████████ | 1138/1784 [1:33:17<31:35, 2.93s/it]g-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:10:27,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:10:30,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:10:30,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1432, 'learning_rate': 5.046728971962617e-06, 'epoch': 0.64} [WARNING|modeling_utils.py:388] 2022-02-28 11:10:30,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:10:17,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
64%|█████████████████████████████████████████████████▏ | 1141/1784 [1:33:25<29:30, 2.75s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:34,056 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
64%|█████████████████████████████████████████████████▎ | 1142/1784 [1:33:28<28:32, 2.67s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:36,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
64%|█████████████████████████████████████████████████▎ | 1143/1784 [1:33:30<27:29, 2.57s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:39,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:41,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:43,707 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:46,733 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0697, 'learning_rate': 4.9844236760124615e-06, 'epoch': 0.64}
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:47,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:49,701 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:10:53,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
65%|█████████████████████████████████████████████████▋ | 1152/1784 [1:33:50<29:26, 2.80s/it]
65%|█████████████████████████████████████████████████▊ | 1153/1784 [1:33:54<32:23, 3.08s/it]
65%|█████████████████████████████████████████████████▊ | 1154/1784 [1:33:58<34:14, 3.26s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:11:06,781 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
65%|█████████████████████████████████████████████████▊ | 1155/1784 [1:34:01<35:17, 3.37s/it]
{'loss': 4.2972, 'learning_rate': 4.9299065420560755e-06, 'epoch': 0.65}
65%|█████████████████████████████████████████████████▉ | 1156/1784 [1:34:05<35:55, 3.43s/it]
65%|█████████████████████████████████████████████████▉ | 1157/1784 [1:34:08<36:30, 3.49s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:11:19,331 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1679, 'learning_rate': 4.906542056074766e-06, 'epoch': 0.65}
65%|██████████████████████████████████████████████████ | 1159/1784 [1:34:16<36:59, 3.55s/it]
{'loss': 4.0246, 'learning_rate': 4.898753894080998e-06, 'epoch': 0.65}
65%|██████████████████████████████████████████████████ | 1160/1784 [1:34:19<37:00, 3.56s/it]
65%|██████████████████████████████████████████████████ | 1161/1784 [1:34:23<36:42, 3.54s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:11:33,514 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1444, 'learning_rate': 4.875389408099689e-06, 'epoch': 0.65}
65%|██████████████████████████████████████████████████▏ | 1163/1784 [1:34:30<36:25, 3.52s/it]
{'loss': 4.1607, 'learning_rate': 4.86760124610592e-06, 'epoch': 0.65}
65%|██████████████████████████████████████████████████▏ | 1164/1784 [1:34:33<36:21, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:11:44,019 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1034, 'learning_rate': 4.85202492211838e-06, 'epoch': 0.65}
65%|██████████████████████████████████████████████████▎ | 1166/1784 [1:34:40<35:58, 3.49s/it]
{'loss': 4.2553, 'learning_rate': 4.844236760124611e-06, 'epoch': 0.65}
65%|██████████████████████████████████████████████████▎ | 1167/1784 [1:34:44<35:45, 3.48s/it]
65%|██████████████████████████████████████████████████▍ | 1168/1784 [1:34:47<35:34, 3.46s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:11:57,804 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2794, 'learning_rate': 4.820872274143303e-06, 'epoch': 0.66}
66%|██████████████████████████████████████████████████▍ | 1170/1784 [1:34:54<35:10, 3.44s/it]
66%|██████████████████████████████████████████████████▌ | 1171/1784 [1:34:57<34:54, 3.42s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:12:06,271 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
66%|██████████████████████████████████████████████████▌ | 1172/1784 [1:35:01<34:34, 3.39s/it]
{'loss': 3.9968, 'learning_rate': 4.797507788161994e-06, 'epoch': 0.66}
66%|██████████████████████████████████████████████████▋ | 1173/1784 [1:35:04<34:24, 3.38s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:12:14,529 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0179, 'learning_rate': 4.7819314641744554e-06, 'epoch': 0.66}
66%|██████████████████████████████████████████████████▋ | 1175/1784 [1:35:10<33:51, 3.34s/it]
{'loss': 3.9366, 'learning_rate': 4.774143302180686e-06, 'epoch': 0.66}
66%|██████████████████████████████████████████████████▊ | 1176/1784 [1:35:14<33:38, 3.32s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:12:24,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2946, 'learning_rate': 4.758566978193147e-06, 'epoch': 0.66}
66%|██████████████████████████████████████████████████▊ | 1178/1784 [1:35:20<33:02, 3.27s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:12:30,741 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.8046, 'learning_rate': 4.742990654205608e-06, 'epoch': 0.66}
66%|██████████████████████████████████████████████████▉ | 1180/1784 [1:35:27<32:29, 3.23s/it]
66%|██████████████████████████████████████████████████▉ | 1181/1784 [1:35:30<32:01, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|██████████████████████████████████████████████████▉ | 1181/1784 [1:35:30<32:01, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|███████████████████████████████████████████████████ | 1182/1784 [1:35:33<31:31, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:12:43,096 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:12:43,096 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0093, 'learning_rate': 4.7118380062305305e-06, 'epoch': 0.66} [WARNING|modeling_utils.py:388] 2022-02-28 11:12:43,096 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|███████████████████████████████████████████████████ | 1184/1784 [1:35:39<30:45, 3.08s/it]g-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:12:49,031 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:12:49,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5318, 'learning_rate': 4.696261682242992e-06, 'epoch': 0.66} [WARNING|modeling_utils.py:388] 2022-02-28 11:12:49,031 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:38,579 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|███████████████████████████████████████████████████▏ | 1186/1784 [1:35:45<29:54, 3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:12:53,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|███████████████████████████████████████████████████▏ | 1187/1784 [1:35:47<29:20, 2.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:12:53,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|███████████████████████████████████████████████████▏ | 1187/1784 [1:35:47<29:20, 2.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:12:53,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:12:57,483 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:53,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:12:57,483 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-28 11:12:53,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0963, 'learning_rate': 4.6728971962616825e-06, 'epoch': 0.67} [WARNING|modeling_utils.py:388] 2022-02-28 11:12:57,483 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:12:53,425 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|███████████████████████████████████████████████████▎ | 1189/1784 [1:35:53<27:57, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|███████████████████████████████████████████████████▎ | 1189/1784 [1:35:53<27:57, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|███████████████████████████████████████████████████▎ | 1190/1784 [1:35:55<27:18, 2.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:05,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:05,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:07,529 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:07,529 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:09,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:09,744 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:11,754 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:11,754 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:13,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:13,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:15,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:15,237 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0301, 'learning_rate': 4.610591900311527e-06, 'epoch': 0.67} [WARNING|modeling_utils.py:388] 2022-02-28 11:13:16,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:16,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:19,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:19,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:13:01,549 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:13:21,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed
{'loss': 4.0286, 'learning_rate': 4.579439252336449e-06, 'epoch': 0.67}
[WARNING|modeling_utils.py:388] 2022-02-28 11:13:24,988 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
67%|███████████████████████████████████████████████████▉ | 1202/1784 [1:36:21<26:42, 2.75s/it]
67%|███████████████████████████████████████████████████▉ | 1203/1784 [1:36:25<29:34, 3.05s/it]
67%|███████████████████████████████████████████████████▉ | 1204/1784 [1:36:29<31:24, 3.25s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:13:39,888 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9195, 'learning_rate':
4.540498442367602e-06, 'epoch': 0.68}
68%|████████████████████████████████████████████████████ | 1206/1784 [1:36:36<33:09, 3.44s/it]
{'loss': 4.2063, 'learning_rate': 4.532710280373832e-06, 'epoch': 0.68}
68%|████████████████████████████████████████████████████ | 1207/1784 [1:36:40<33:33, 3.49s/it]
68%|████████████████████████████████████████████████████▏ | 1208/1784 [1:36:43<33:48, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:13:54,272 >> Could not estimate the number of tokens of the
input, floating-point operations will not be computed
{'loss': 4.0203, 'learning_rate': 4.509345794392524e-06, 'epoch': 0.68}
68%|████████████████████████████████████████████████████▏ | 1210/1784 [1:36:50<33:40, 3.52s/it]
68%|████████████████████████████████████████████████████▎ | 1211/1784 [1:36:54<33:29, 3.51s/it]
68%|████████████████████████████████████████████████████▎ | 1212/1784 [1:36:57<33:21, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:06,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|████████████████████████████████████████████████████▎ | 1213/1784 [1:37:01<33:20, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:06,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2201, 'learning_rate': 4.478193146417446e-06, 'epoch': 0.68}
68%|████████████████████████████████████████████████████▍ |
1214/1784 [1:37:04<33:14, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:06,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|████████████████████████████████████████████████████▍ | 1215/1784 [1:37:08<33:04, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:06,528 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:14:18,605 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1333, 'learning_rate': 4.4548286604361376e-06, 'epoch': 0.68}
68%|████████████████████████████████████████████████████▌ | 1217/1784 [1:37:15<32:38, 3.45s/it]
68%|████████████████████████████████████████████████████▌ | 1218/1784 [1:37:18<32:29, 3.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:27,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|████████████████████████████████████████████████████▌ | 1219/1784 [1:37:22<32:18, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:27,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9206, 'learning_rate': 4.431464174454829e-06, 'epoch': 0.68}
68%|████████████████████████████████████████████████████▋ | 1220/1784 [1:37:25<32:07, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:27,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|████████████████████████████████████████████████████▋ | 1221/1784 [1:37:28<31:53, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:37,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|████████████████████████████████████████████████████▋ | 1222/1784 [1:37:32<31:41, 3.38s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:37,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9109, 'learning_rate': 4.40809968847352e-06, 'epoch': 0.68}
69%|████████████████████████████████████████████████████▊ | 1223/1784 [1:37:35<31:27, 3.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:14:37,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:14:45,498 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0382, 'learning_rate': 4.392523364485981e-06, 'epoch': 0.69}
69%|████████████████████████████████████████████████████▊ | 1225/1784 [1:37:41<30:44, 3.30s/it]
69%|████████████████████████████████████████████████████▉ | 1226/1784 [1:37:45<30:15, 3.25s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:14:55,063 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1069, 'learning_rate': 4.369158878504674e-06, 'epoch': 0.69}
69%|█████████████████████████████████████████████████████ | 1228/1784 [1:37:51<29:54, 3.23s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:01,452 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1242, 'learning_rate': 4.353582554517134e-06, 'epoch': 0.69}
69%|█████████████████████████████████████████████████████ | 1230/1784 [1:37:57<29:36, 3.21s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:07,821 >> Could not estimate the
number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1737, 'learning_rate': 4.338006230529595e-06, 'epoch': 0.69}
69%|█████████████████████████████████████████████████████▏ | 1232/1784 [1:38:04<29:19, 3.19s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:14,065 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1197, 'learning_rate': 4.322429906542056e-06, 'epoch': 0.69}
69%|█████████████████████████████████████████████████████▎ | 1234/1784 [1:38:10<28:42, 3.13s/it]
69%|█████████████████████████████████████████████████████▎ | 1235/1784 [1:38:13<28:24, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:15:21,697 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
69%|█████████████████████████████████████████████████████▎ | 1236/1784 [1:38:16<27:54, 3.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:15:21,697 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:26,009 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0476, 'learning_rate': 4.291277258566979e-06, 'epoch': 0.69}
69%|█████████████████████████████████████████████████████▍ | 1238/1784 [1:38:21<26:50, 2.95s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:31,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
70%|█████████████████████████████████████████████████████▌ | 1240/1784 [1:38:27<25:40, 2.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:15:35,630 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed
70%|█████████████████████████████████████████████████████▌ | 1241/1784 [1:38:30<24:53, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:15:38,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
70%|█████████████████████████████████████████████████████▌ | 1242/1784 [1:38:32<24:01, 2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:15:38,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:41,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:43,659 >> Could
not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:45,612 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:47,450 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:50,505 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2278, 'learning_rate': 4.205607476635514e-06, 'epoch': 0.7}
[WARNING|modeling_utils.py:388]
2022-02-28 11:15:51,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:53,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:15:57,586 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed
70%|██████████████████████████████████████████████████████ | 1252/1784 [1:38:54<24:51, 2.80s/it]
70%|██████████████████████████████████████████████████████ | 1253/1784 [1:38:58<27:21, 3.09s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:16:08,766 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2098, 'learning_rate': 4.1588785046728974e-06, 'epoch': 0.7}
70%|██████████████████████████████████████████████████████▏ | 1255/1784 [1:39:05<29:54, 3.39s/it]
{'loss': 4.3067, 'learning_rate': 4.151090342679128e-06, 'epoch': 0.7}
70%|██████████████████████████████████████████████████████▏ | 1256/1784 [1:39:09<30:30, 3.47s/it]
70%|██████████████████████████████████████████████████████▎ | 1257/1784 [1:39:12<30:54, 3.52s/it]
70%|██████████████████████████████████████████████████████▎ | 1257/1784 [1:39:12<30:54, 3.52s/it]g-point operations will not be computed-28 11:15:38,143 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▎ | 1258/1784 [1:39:16<31:04, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▎ | 1259/1784 [1:39:20<31:11, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▎ | 1259/1784 [1:39:20<31:11, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1094, 'learning_rate': 4.11993769470405e-06, 'epoch': 0.71} 71%|██████████████████████████████████████████████████████▎ | 1259/1784 [1:39:20<31:11, 3.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▍ | 1260/1784 [1:39:23<31:12, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▍ | 1260/1784 [1:39:23<31:12, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▍ | 1260/1784 [1:39:23<31:12, 3.57s/it][WARNING|modeling_utils.py:388] 
2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▍ | 1261/1784 [1:39:27<30:55, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:16:37,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:16:37,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2195, 'learning_rate': 4.096573208722742e-06, 'epoch': 0.71} [WARNING|modeling_utils.py:388] 2022-02-28 11:16:37,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▌ | 1263/1784 [1:39:34<30:36, 3.53s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▌ | 1263/1784 [1:39:34<30:36, 3.53s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▌ | 1263/1784 [1:39:34<30:36, 3.53s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▌ | 1264/1784 [1:39:37<30:23, 3.51s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:16:47,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:16:47,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2486, 'learning_rate': 4.073208722741434e-06, 'epoch': 0.71} 71%|██████████████████████████████████████████████████████▋ | 1266/1784 [1:39:44<29:58, 3.47s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▋ | 1266/1784 [1:39:44<29:58, 3.47s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2184, 'learning_rate': 4.065420560747663e-06, 'epoch': 0.71} 71%|██████████████████████████████████████████████████████▋ | 1266/1784 [1:39:44<29:58, 3.47s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▋ | 1267/1784 [1:39:47<29:43, 3.45s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:16:58,203 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:16:58,203 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0496, 'learning_rate': 4.0498442367601245e-06, 'epoch': 0.71} [WARNING|modeling_utils.py:388] 2022-02-28 11:16:58,203 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▊ | 1269/1784 [1:39:54<29:27, 3.43s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▊ | 1269/1784 [1:39:54<29:27, 3.43s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▊ | 1269/1784 [1:39:54<29:27, 3.43s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▊ | 1270/1784 [1:39:58<29:15, 3.42s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-02-28 11:17:08,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:08,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0178, 'learning_rate': 4.026479750778817e-06, 'epoch': 0.71} [WARNING|modeling_utils.py:388] 2022-02-28 11:17:08,346 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▉ | 1272/1784 [1:40:04<28:46, 3.37s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▉ | 1272/1784 [1:40:04<28:46, 3.37s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▉ | 1272/1784 [1:40:04<28:46, 3.37s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|██████████████████████████████████████████████████████▉ | 1273/1784 [1:40:08<28:39, 3.37s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:18,280 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:18,280 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0959, 'learning_rate': 4.003115264797508e-06, 'epoch': 0.71} [WARNING|modeling_utils.py:388] 2022-02-28 11:17:18,280 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|███████████████████████████████████████████████████████ | 1275/1784 [1:40:14<28:14, 3.33s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|███████████████████████████████████████████████████████ | 1275/1784 [1:40:14<28:14, 3.33s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|███████████████████████████████████████████████████████ | 1275/1784 [1:40:14<28:14, 3.33s/it]g-point operations will not be computed-28 11:16:25,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████ | 1276/1784 [1:40:18<28:05, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████ | 1276/1784 [1:40:18<28:05, 
3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████ | 1277/1784 [1:40:21<27:49, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████ | 1277/1784 [1:40:21<27:49, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████ | 1277/1784 [1:40:21<27:49, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▏ | 1278/1784 [1:40:24<27:34, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:34,500 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:34,500 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3166, 'learning_rate': 3.964174454828661e-06, 'epoch': 0.72} [WARNING|modeling_utils.py:388] 2022-02-28 11:17:34,500 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 
11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▏ | 1280/1784 [1:40:30<27:00, 3.22s/it]g-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▏ | 1280/1784 [1:40:30<27:00, 3.22s/it]g-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:40,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:40,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:40,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▎ | 1282/1784 [1:40:37<26:23, 3.15s/it]g-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:46,968 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:46,968 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3166, 'learning_rate': 3.933021806853583e-06, 'epoch': 0.72} [WARNING|modeling_utils.py:388] 2022-02-28 11:17:46,968 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▍ | 1284/1784 [1:40:43<25:52, 3.10s/it]g-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:52,918 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:17:52,918 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.124, 'learning_rate': 3.917445482866044e-06, 'epoch': 0.72} [WARNING|modeling_utils.py:388] 2022-02-28 11:17:52,918 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:26,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▌ | 1286/1784 [1:40:49<25:08, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:57,393 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▌ | 1286/1784 [1:40:49<25:08, 3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▌ | 1287/1784 [1:40:51<24:45, 2.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:01,634 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:01,634 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.138, 'learning_rate': 3.894080996884735e-06, 'epoch': 0.72} [WARNING|modeling_utils.py:388] 2022-02-28 11:18:01,634 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▋ | 1289/1784 [1:40:57<23:51, 2.89s/it]g-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▋ | 1289/1784 [1:40:57<23:51, 2.89s/it]g-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:07,083 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:09,663 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:09,663 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2893, 'learning_rate': 3.8707165109034276e-06, 'epoch': 0.72} [WARNING|modeling_utils.py:388] 2022-02-28 11:18:09,663 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:17:57,393 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▊ | 1292/1784 [1:41:05<21:50, 2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:13,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▊ | 1292/1784 [1:41:05<21:50, 2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:13,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|███████████████████████████████████████████████████████▊ | 1293/1784 [1:41:07<20:41, 2.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:15,437 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
72%|███████████████████████████████████████████████████████▊ | 1293/1784 [1:41:07<20:41, 2.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:15,437 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████▊ | 1294/1784 [1:41:09<19:32, 2.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:17,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████▊ | 1294/1784 [1:41:09<19:32, 2.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:17,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████▉ | 1295/1784 [1:41:11<18:29, 2.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:19,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████▉ | 1295/1784 [1:41:11<18:29, 2.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:19,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████▉ | 1297/1784 [1:41:14<16:00, 1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:21,020 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████▉ | 1297/1784 [1:41:14<16:00, 1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:21,020 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0977, 'learning_rate': 3.82398753894081e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████ | 1298/1784 [1:41:16<14:52, 
1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:24,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████ | 1298/1784 [1:41:16<14:52, 1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:24,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████ | 1299/1784 [1:41:17<13:31, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:25,248 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████ | 1299/1784 [1:41:17<13:31, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:25,248 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████ | 1300/1784 [1:41:19<13:57, 1.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:25,248 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████ | 1300/1784 [1:41:19<13:57, 1.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████ | 1300/1784 [1:41:19<13:57, 1.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▏ | 1301/1784 [1:41:23<19:12, 2.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-02-28 11:18:34,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:34,191 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3084, 'learning_rate': 3.785046728971963e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████▏ | 1303/1784 [1:41:31<24:53, 3.10s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▏ | 1303/1784 [1:41:31<24:53, 3.10s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1713, 'learning_rate': 3.7772585669781935e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████▎ | 1304/1784 [1:41:34<26:22, 3.30s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▎ | 1304/1784 [1:41:34<26:22, 3.30s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.177, 'learning_rate': 3.7694704049844237e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████▎ | 1304/1784 [1:41:34<26:22, 3.30s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▎ | 1305/1784 [1:41:38<27:16, 3.42s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▎ | 1305/1784 [1:41:38<27:16, 3.42s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▎ | 1305/1784 [1:41:38<27:16, 3.42s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▎ | 1306/1784 [1:41:42<27:39, 3.47s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:52,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:18:52,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1165, 'learning_rate': 3.746105919003116e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████▍ | 1308/1784 [1:41:49<28:07, 3.55s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 73%|████████████████████████████████████████████████████████▍ | 1308/1784 [1:41:49<28:07, 3.55s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9715, 'learning_rate': 3.738317757009346e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████▍ | 1309/1784 [1:41:52<28:07, 3.55s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▍ | 1309/1784 [1:41:52<28:07, 3.55s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.139, 'learning_rate': 3.730529595015577e-06, 'epoch': 0.73} 73%|████████████████████████████████████████████████████████▍ | 1309/1784 [1:41:52<28:07, 3.55s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|████████████████████████████████████████████████████████▌ | 1310/1784 [1:41:56<28:09, 3.57s/it]g-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:19:06,888 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:19:06,888 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:18:28,540 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed
{'loss': 4.1057, 'learning_rate': 3.7149532710280376e-06, 'epoch': 0.73}
74%|████████████████████████████████████████████████████████▋ | 1312/1784 [1:42:03<27:36, 3.51s/it]
{'loss': 4.2007, 'learning_rate': 3.7071651090342682e-06, 'epoch': 0.74}
74%|████████████████████████████████████████████████████████▋ | 1313/1784 [1:42:06<27:28, 3.50s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:19:17,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9657, 'learning_rate': 3.691588785046729e-06, 'epoch': 0.74}
74%|████████████████████████████████████████████████████████▊ | 1315/1784 [1:42:13<26:58, 3.45s/it]
{'loss': 4.3177, 'learning_rate': 3.68380062305296e-06, 'epoch': 0.74}
74%|████████████████████████████████████████████████████████▊ | 1316/1784 [1:42:17<26:49, 3.44s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:19:27,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0704, 'learning_rate': 3.668224299065421e-06, 'epoch': 0.74}
74%|████████████████████████████████████████████████████████▉ | 1318/1784 [1:42:24<26:43, 3.44s/it]
{'loss': 4.4594, 'learning_rate': 3.660436137071651e-06, 'epoch': 0.74}
74%|████████████████████████████████████████████████████████▉ | 1319/1784 [1:42:27<26:35, 3.43s/it]
74%|████████████████████████████████████████████████████████▉ | 1320/1784 [1:42:30<26:31, 3.43s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:19:41,123 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.8676, 'learning_rate': 3.637071651090343e-06, 'epoch': 0.74}
74%|█████████████████████████████████████████████████████████ | 1322/1784 [1:42:37<26:03, 3.38s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:19:47,753 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0337, 'learning_rate': 3.621495327102804e-06, 'epoch': 0.74}
74%|█████████████████████████████████████████████████████████▏ | 1324/1784 [1:42:44<25:26, 3.32s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:19:54,221 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0927, 'learning_rate': 3.605919003115265e-06, 'epoch': 0.74}
74%|█████████████████████████████████████████████████████████▏ | 1326/1784 [1:42:50<24:57, 3.27s/it]
{'loss': 3.9976, 'learning_rate': 3.5981308411214953e-06, 'epoch': 0.74}
74%|█████████████████████████████████████████████████████████▎ | 1327/1784 [1:42:53<24:44, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:02,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
74%|█████████████████████████████████████████████████████████▎ | 1328/1784 [1:42:57<24:39, 3.24s/it]
{'loss': 3.978, 'learning_rate': 3.5825545171339564e-06, 'epoch': 0.74}
74%|█████████████████████████████████████████████████████████▎ | 1329/1784 [1:43:00<24:24, 3.22s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:20:10,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1864, 'learning_rate': 3.5669781931464176e-06, 'epoch': 0.75}
75%|█████████████████████████████████████████████████████████▍ | 1331/1784 [1:43:06<23:47, 3.15s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:20:16,253 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9893, 'learning_rate': 3.5514018691588787e-06, 'epoch': 0.75}
75%|█████████████████████████████████████████████████████████▌ | 1333/1784 [1:43:12<23:15, 3.09s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:20:22,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1768, 'learning_rate': 3.53582554517134e-06, 'epoch': 0.75}
75%|█████████████████████████████████████████████████████████▌ | 1335/1784 [1:43:18<22:30, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:26,614 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
75%|█████████████████████████████████████████████████████████▋ | 1336/1784 [1:43:21<22:07, 2.96s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:20:30,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2565, 'learning_rate': 3.5124610591900315e-06, 'epoch': 0.75}
75%|█████████████████████████████████████████████████████████▊ | 1338/1784 [1:43:26<21:18, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:34,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
75%|█████████████████████████████████████████████████████████▊ | 1339/1784 [1:43:29<20:55, 2.82s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:20:38,902 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2919, 'learning_rate': 3.489096573208723e-06, 'epoch': 0.75}
75%|█████████████████████████████████████████████████████████▉ | 1341/1784 [1:43:34<19:55, 2.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:42,713 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
75%|█████████████████████████████████████████████████████████▉ | 1342/1784 [1:43:37<19:12, 2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:44,987 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
75%|█████████████████████████████████████████████████████████▉ | 1343/1784 [1:43:39<18:09, 2.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:47,074 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
75%|██████████████████████████████████████████████████████████ | 1344/1784 [1:43:41<17:05, 2.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:49,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
75%|██████████████████████████████████████████████████████████ | 1345/1784 [1:43:43<16:09, 2.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:50,846 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
76%|██████████████████████████████████████████████████████████▏ | 1347/1784 [1:43:46<13:49, 1.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:52,480 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
76%|██████████████████████████████████████████████████████████▏ | 1348/1784 [1:43:47<12:37, 1.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:55,209 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1839, 'learning_rate': 3.426791277258567e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▏ | 1349/1784 [1:43:48<11:31, 1.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:56,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
76%|██████████████████████████████████████████████████████████▎ | 1350/1784 [1:43:50<11:44, 1.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:20:59,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
76%|██████████████████████████████████████████████████████████▎ | 1351/1784 [1:43:54<16:55, 2.35s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:21:05,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2394, 'learning_rate': 3.395638629283489e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▍ | 1353/1784 [1:44:02<22:06, 3.08s/it]
{'loss': 4.1823, 'learning_rate': 3.38785046728972e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▍ | 1354/1784 [1:44:05<23:25, 3.27s/it]
{'loss': 4.2178, 'learning_rate': 3.3800623052959503e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▍ | 1355/1784 [1:44:09<24:16, 3.40s/it]
{'loss': 4.0713, 'learning_rate': 3.372274143302181e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▌ | 1356/1784 [1:44:13<24:47, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:21:21,968 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
76%|██████████████████████████████████████████████████████████▌ | 1357/1784 [1:44:16<24:54, 3.50s/it]
{'loss': 3.903, 'learning_rate': 3.356697819314642e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▌ | 1358/1784 [1:44:20<25:11, 3.55s/it]
{'loss': 4.2348, 'learning_rate': 3.348909657320872e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▋ | 1359/1784 [1:44:24<25:11, 3.56s/it]
{'loss': 4.2705, 'learning_rate': 3.341121495327103e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▋ | 1360/1784 [1:44:27<25:06, 3.55s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:21:38,062 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.111, 'learning_rate': 3.3255451713395643e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▊ | 1362/1784 [1:44:34<25:07, 3.57s/it]
{'loss': 4.0084, 'learning_rate': 3.3177570093457945e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▊ | 1363/1784 [1:44:38<24:58, 3.56s/it]
{'loss': 4.0295, 'learning_rate': 3.3099688473520254e-06, 'epoch': 0.76}
76%|██████████████████████████████████████████████████████████▊ | 1364/1784 [1:44:41<24:53, 3.56s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:21:52,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1019, 'learning_rate': 3.294392523364486e-06, 'epoch': 0.77}
77%|██████████████████████████████████████████████████████████▉ | 1366/1784 [1:44:48<24:19, 3.49s/it]
{'loss': 3.8966, 'learning_rate': 3.2866043613707167e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████ | 1367/1784 [1:44:52<24:07, 3.47s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:22:02,418 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9826, 'learning_rate': 3.2710280373831774e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████ | 1369/1784 [1:44:58<23:41, 3.43s/it]
{'loss': 4.3527, 'learning_rate': 3.2632398753894084e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▏ | 1370/1784 [1:45:02<23:38, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
77%|███████████████████████████████████████████████████████████▏ | 1371/1784 [1:45:05<23:24, 3.40s/it]
{'loss': 4.2342, 'learning_rate': 3.2476635514018696e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▏ | 1372/1784 [1:45:09<23:17, 3.39s/it]
{'loss': 3.9893, 'learning_rate': 3.2398753894080997e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▎ | 1373/1784 [1:45:12<23:12, 3.39s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:22:22,649 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2497, 'learning_rate': 3.224299065420561e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▎ | 1375/1784 [1:45:19<22:54, 3.36s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:22:29,262 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9037, 'learning_rate': 3.208722741433022e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▍ | 1377/1784 [1:45:25<22:26, 3.31s/it]
{'loss': 4.0796, 'learning_rate': 3.2009345794392525e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▍ | 1378/1784 [1:45:28<22:11, 3.28s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:22:38,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1428, 'learning_rate': 3.1853582554517137e-06, 'epoch': 0.77}
77%|███████████████████████████████████████████████████████████▌ | 1380/1784 [1:45:35<21:45, 3.23s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:22:45,251 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8469, 'learning_rate': 3.169781931464175e-06, 'epoch': 0.77} [WARNING|modeling_utils.py:388] 2022-02-28 11:22:45,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|███████████████████████████████████████████████████████████▋ | 1382/1784 [1:45:41<21:25, 3.20s/it]g-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:22:51,561 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:22:51,561 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0223, 'learning_rate': 3.154205607476636e-06, 'epoch': 0.78} 78%|███████████████████████████████████████████████████████████▋ | 1384/1784 [1:45:47<20:58, 3.15s/it]g-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|███████████████████████████████████████████████████████████▋ | 1384/1784 [1:45:47<20:58, 3.15s/it]g-point operations will not be computed-28 11:22:10,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-02-28 11:22:57,646 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3609, 'learning_rate': 3.138629283489097e-06, 'epoch': 0.78}
78%|███████████████████████████████████████████████████████████▊ | 1386/1784 [1:45:53<20:13, 3.05s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:03,455 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.234, 'learning_rate': 3.123052959501558e-06, 'epoch': 0.78}
78%|███████████████████████████████████████████████████████████▉ | 1388/1784 [1:45:59<19:34, 2.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:23:07,791 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
78%|███████████████████████████████████████████████████████████▉ | 1389/1784 [1:46:02<19:11, 2.92s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:11,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1721, 'learning_rate': 3.099688473520249e-06, 'epoch': 0.78}
78%|████████████████████████████████████████████████████████████ | 1391/1784 [1:46:07<18:14, 2.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:23:15,790 >> Could not estimate the number of tokens of the input, floating-point operations will not be
computed
78%|████████████████████████████████████████████████████████████ | 1392/1784 [1:46:10<17:39, 2.70s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:19,285 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:21,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:23,397 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:25,170 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:28,216 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2896, 'learning_rate': 3.045171339563863e-06, 'epoch': 0.78}
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:29,461 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388]
2022-02-28 11:23:31,137 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.4829, 'learning_rate': 3.0218068535825547e-06, 'epoch': 0.78}
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:35,090 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0196, 'learning_rate': 3.0140186915887853e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▌ | 1402/1784 [1:46:31<17:35, 2.76s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:42,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0471, 'learning_rate': 2.9984423676012464e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▌ | 1404/1784 [1:46:39<20:21, 3.22s/it]
{'loss': 3.9683, 'learning_rate': 2.9906542056074766e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▋ | 1405/1784 [1:46:42<21:04, 3.34s/it]
{'loss': 3.9772, 'learning_rate': 2.9828660436137076e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▋ | 1406/1784 [1:46:46<21:24, 3.40s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:23:56,907 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1608, 'learning_rate': 2.9672897196261687e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▊ | 1408/1784 [1:46:53<21:58, 3.51s/it]
{'loss': 3.9765, 'learning_rate': 2.959501557632399e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▊ | 1409/1784 [1:46:57<22:01, 3.53s/it]
{'loss': 4.041, 'learning_rate': 2.9517133956386294e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▊ | 1410/1784 [1:47:00<21:54, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:24:11,069 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9731, 'learning_rate': 2.9361370716510906e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▉ | 1412/1784 [1:47:07<21:47, 3.51s/it]
{'loss': 4.0073, 'learning_rate': 2.9283489096573207e-06, 'epoch': 0.79}
79%|████████████████████████████████████████████████████████████▉ | 1413/1784 [1:47:11<21:44, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:24:21,613 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1833, 'learning_rate': 2.912772585669782e-06, 'epoch': 0.79}
79%|█████████████████████████████████████████████████████████████ | 1415/1784 [1:47:18<21:28, 3.49s/it]
{'loss': 4.2527, 'learning_rate': 2.904984423676013e-06, 'epoch': 0.79}
79%|█████████████████████████████████████████████████████████████ | 1416/1784 [1:47:21<21:21, 3.48s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:24:32,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1174, 'learning_rate': 2.889408099688474e-06, 'epoch': 0.79}
79%|█████████████████████████████████████████████████████████████▏ | 1418/1784 [1:47:28<21:04, 3.46s/it]
{'loss': 4.1074, 'learning_rate': 2.881619937694704e-06, 'epoch': 0.79}
80%|█████████████████████████████████████████████████████████████▏ | 1419/1784 [1:47:31<20:53, 3.43s/it]
{'loss': 4.1737, 'learning_rate': 2.8738317757009347e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▎ | 1420/1784 [1:47:35<20:42,
3.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:24:43,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|█████████████████████████████████████████████████████████████▎ | 1421/1784 [1:47:38<20:38, 3.41s/it]
{'loss': 4.1445, 'learning_rate': 2.858255451713396e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▍ | 1422/1784 [1:47:42<20:23, 3.38s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:24:52,180 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1821, 'learning_rate': 2.842679127725857e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▍ | 1424/1784 [1:47:48<19:57, 3.33s/it]
{'loss': 4.0798, 'learning_rate': 2.834890965732087e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▌ | 1425/1784 [1:47:51<19:45, 3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:00,328 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|█████████████████████████████████████████████████████████████▌ | 1426/1784 [1:47:55<19:30, 3.27s/it]
80%|█████████████████████████████████████████████████████████████▌ | 1427/1784 [1:47:58<19:21, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:06,680 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|█████████████████████████████████████████████████████████████▋ | 1428/1784 [1:48:01<19:07, 3.22s/it]
{'loss': 4.0561, 'learning_rate': 2.8037383177570094e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▋ | 1429/1784 [1:48:04<18:55, 3.20s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:25:14,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3297, 'learning_rate': 2.7881619937694705e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▊ | 1431/1784 [1:48:10<18:34, 3.16s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:25:20,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2335, 'learning_rate': 2.7725856697819316e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▊ | 1433/1784 [1:48:16<18:09, 3.11s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:25:26,732 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9912, 'learning_rate': 2.7570093457943923e-06, 'epoch': 0.8}
80%|█████████████████████████████████████████████████████████████▉ | 1435/1784 [1:48:22<17:32, 3.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:31,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|█████████████████████████████████████████████████████████████▉ | 1436/1784 [1:48:25<17:10, 2.96s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:25:35,229 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3391, 'learning_rate': 2.7336448598130845e-06, 'epoch': 0.81}
81%|██████████████████████████████████████████████████████████████ | 1438/1784 [1:48:31<16:27, 2.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:39,333 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
81%|██████████████████████████████████████████████████████████████ | 1439/1784 [1:48:33<15:55, 2.77s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:25:43,072 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:25:45,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:25:39,333 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:25:45,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:25:39,333 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1781, 'learning_rate': 2.7024922118380063e-06, 'epoch': 0.81} [WARNING|modeling_utils.py:388] 2022-02-28 11:25:45,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:25:39,333 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▏ | 1442/1784 [1:48:40<14:24, 2.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:48,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▏ | 1442/1784 [1:48:40<14:24, 2.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:48,979 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▎ | 1443/1784 [1:48:43<13:47, 2.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:51,087 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▎ | 1443/1784 [1:48:43<13:47, 2.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:51,087 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▎ | 1444/1784 [1:48:45<13:09, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:53,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▎ | 1444/1784 [1:48:45<13:09, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:53,124 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▎ | 1445/1784 [1:48:47<12:31, 2.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:55,023 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▎ | 1445/1784 [1:48:47<12:31, 2.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:55,023 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▍ | 1446/1784 [1:48:49<11:45, 2.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:56,705 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▍ | 1446/1784 [1:48:49<11:45, 2.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:56,705 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2613, 'learning_rate': 2.6557632398753897e-06, 'epoch': 0.81} 81%|██████████████████████████████████████████████████████████████▍ | 1448/1784 [1:48:52<09:57, 1.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:58,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
81%|██████████████████████████████████████████████████████████████▍ | 1448/1784 [1:48:52<09:57, 1.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:25:58,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▌ | 1449/1784 [1:48:53<08:59, 1.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:00,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▌ | 1450/1784 [1:48:54<09:06, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:00,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▌ | 1450/1784 [1:48:54<09:06, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:00,688 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▌ | 1450/1784 [1:48:54<09:06, 1.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:03,837 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▋ | 1451/1784 [1:48:58<12:52, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▋ | 1451/1784 [1:48:58<12:52, 2.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▋ | 1452/1784 [1:49:02<15:13, 2.75s/it][WARNING|modeling_utils.py:388] 
2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▋ | 1452/1784 [1:49:02<15:13, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1259, 'learning_rate': 2.616822429906542e-06, 'epoch': 0.81} 81%|██████████████████████████████████████████████████████████████▋ | 1453/1784 [1:49:06<16:45, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|██████████████████████████████████████████████████████████████▋ | 1453/1784 [1:49:06<16:45, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9906, 'learning_rate': 2.6090342679127727e-06, 'epoch': 0.81} 82%|██████████████████████████████████████████████████████████████▊ | 1454/1784 [1:49:09<17:37, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|██████████████████████████████████████████████████████████████▊ | 1454/1784 [1:49:09<17:37, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9352, 'learning_rate': 2.6012461059190033e-06, 'epoch': 0.82} 82%|██████████████████████████████████████████████████████████████▊ | 1455/1784 [1:49:13<18:17, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
82%|██████████████████████████████████████████████████████████████▊ | 1455/1784 [1:49:13<18:17, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:26:23,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:26:23,978 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0714, 'learning_rate': 2.585669781931464e-06, 'epoch': 0.82} 82%|██████████████████████████████████████████████████████████████▉ | 1457/1784 [1:49:20<18:47, 3.45s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|██████████████████████████████████████████████████████████████▉ | 1457/1784 [1:49:20<18:47, 3.45s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3342, 'learning_rate': 2.577881619937695e-06, 'epoch': 0.82} 82%|██████████████████████████████████████████████████████████████▉ | 1458/1784 [1:49:24<18:49, 3.47s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|██████████████████████████████████████████████████████████████▉ | 1458/1784 [1:49:24<18:49, 3.47s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed {'loss': 4.1305, 'learning_rate': 2.570093457943925e-06, 'epoch': 0.82} 82%|██████████████████████████████████████████████████████████████▉ | 1459/1784 [1:49:27<19:00, 3.51s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|██████████████████████████████████████████████████████████████▉ | 1459/1784 [1:49:27<19:00, 3.51s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:26:38,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:26:38,208 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1794, 'learning_rate': 2.5545171339563862e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████ | 1461/1784 [1:49:34<19:03, 3.54s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████ | 1461/1784 [1:49:34<19:03, 3.54s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0886, 'learning_rate': 2.5467289719626172e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████ | 1462/1784 [1:49:38<19:01, 3.55s/it]g-point operations will 
not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████ | 1462/1784 [1:49:38<19:01, 3.55s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2048, 'learning_rate': 2.5389408099688474e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████ | 1462/1784 [1:49:38<19:01, 3.55s/it]g-point operations will not be computed-28 11:26:07,606 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▏ | 1463/1784 [1:49:41<18:54, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▏ | 1464/1784 [1:49:45<18:48, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▏ | 1464/1784 [1:49:45<18:48, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.126, 'learning_rate': 2.5233644859813085e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████▏ | 1465/1784 [1:49:48<18:33, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▏ | 1465/1784 [1:49:48<18:33, 
3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3393, 'learning_rate': 2.515576323987539e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████▎ | 1466/1784 [1:49:52<18:25, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▎ | 1466/1784 [1:49:52<18:25, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:02,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:02,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1, 'learning_rate': 2.5e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████▎ | 1468/1784 [1:49:59<18:09, 3.45s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▎ | 1468/1784 [1:49:59<18:09, 3.45s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1728, 'learning_rate': 2.4922118380062308e-06, 'epoch': 0.82} 
82%|███████████████████████████████████████████████████████████████▍ | 1469/1784 [1:50:02<18:04, 3.44s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▍ | 1469/1784 [1:50:02<18:04, 3.44s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:12,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:12,915 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9488, 'learning_rate': 2.476635514018692e-06, 'epoch': 0.82} 82%|███████████████████████████████████████████████████████████████▍ | 1471/1784 [1:50:09<17:49, 3.42s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|███████████████████████████████████████████████████████████████▍ | 1471/1784 [1:50:09<17:49, 3.42s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0971, 'learning_rate': 2.4688473520249225e-06, 'epoch': 0.82} 83%|███████████████████████████████████████████████████████████████▌ | 1472/1784 [1:50:12<17:36, 3.39s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▌ | 1472/1784 [1:50:12<17:36, 3.39s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:22,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:22,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0871, 'learning_rate': 2.453271028037383e-06, 'epoch': 0.83} 83%|███████████████████████████████████████████████████████████████▌ | 1474/1784 [1:50:19<17:24, 3.37s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▌ | 1474/1784 [1:50:19<17:24, 3.37s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8353, 'learning_rate': 2.4454828660436138e-06, 'epoch': 0.83} 83%|███████████████████████████████████████████████████████████████▋ | 1475/1784 [1:50:22<17:14, 3.35s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▋ | 1475/1784 [1:50:22<17:14, 3.35s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:32,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:32,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1325, 'learning_rate': 2.429906542056075e-06, 'epoch': 0.83} 83%|███████████████████████████████████████████████████████████████▋ | 1477/1784 [1:50:29<16:51, 3.29s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▋ | 1477/1784 [1:50:29<16:51, 3.29s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:39,380 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:27:39,380 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.871, 'learning_rate': 2.414330218068536e-06, 'epoch': 0.83} 83%|███████████████████████████████████████████████████████████████▊ | 1479/1784 [1:50:35<16:31, 3.25s/it]g-point 
operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▊ | 1479/1784 [1:50:35<16:31, 3.25s/it]g-point operations will not be computed-28 11:26:50,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1769, 'learning_rate': 2.4065420560747666e-06, 'epoch': 0.83} 83%|███████████████████████████████████████████████████████████████▉ | 1480/1784 [1:50:38<16:27, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:47,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▉ | 1480/1784 [1:50:38<16:27, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:47,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▉ | 1481/1784 [1:50:42<16:18, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:47,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▉ | 1481/1784 [1:50:42<16:18, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:47,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0905, 'learning_rate': 2.3909657320872277e-06, 'epoch': 0.83} 83%|███████████████████████████████████████████████████████████████▉ | 1481/1784 [1:50:42<16:18, 3.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:47,453 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▉ | 1482/1784 [1:50:45<16:08, 
3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|███████████████████████████████████████████████████████████████▉ | 1482/1784 [1:50:45<16:08, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|████████████████████████████████████████████████████████████████ | 1483/1784 [1:50:48<15:57, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|████████████████████████████████████████████████████████████████ | 1483/1784 [1:50:48<15:57, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|████████████████████████████████████████████████████████████████ | 1483/1784 [1:50:48<15:57, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|████████████████████████████████████████████████████████████████ | 1484/1784 [1:50:51<15:41, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 83%|████████████████████████████████████████████████████████████████ | 1484/1784 [1:50:51<15:41, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:27:53,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:28:01,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:27:53,758 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:01,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|████████████████████████████████████████████████████████████████▏ | 1486/1784 [1:50:57<15:00, 3.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:28:05,581 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|████████████████████████████████████████████████████████████████▏ | 1487/1784 [1:51:00<14:37, 2.96s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:09,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|████████████████████████████████████████████████████████████████▎ | 1489/1784 [1:51:05<13:53, 2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
84%|████████████████████████████████████████████████████████████████▎ | 1490/1784 [1:51:08<13:31, 2.76s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:17,513 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:19,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:22,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:24,521 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:26,514 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:28,322 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:29,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.52, 'learning_rate': 2.2585669781931465e-06, 'epoch': 0.84}
[WARNING|modeling_utils.py:388] 2022-02-28 11:28:32,630 >> Could not estimate the number of tokens of the input, floating-point
operations will not be computed
[INFO|trainer.py:2369] 2022-02-28 11:28:34,563 >> Batch size = 8
0%| | 0/331 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed
1%|▌ | 2/331 [00:02<06:49, 1.24s/it]
1%|▊ | 3/331 [00:04<09:13, 1.69s/it]
1%|█ | 4/331 [00:07<10:37, 1.95s/it]
2%|█▎ | 5/331 [00:09<12:11, 2.24s/it]
2%|█▌ | 6/331 [00:12<13:10, 2.43s/it]
2%|█▊ | 7/331 [00:15<13:22, 2.48s/it]
2%|██ | 8/331 [00:18<13:44, 2.55s/it]
3%|██▎ | 9/331 [00:21<14:22, 2.68s/it]
3%|██▍ | 10/331 [00:24<15:07, 2.83s/it]
3%|██▋ | 11/331 [00:26<14:45, 2.77s/it]
4%|██▉ | 12/331 [00:29<14:33, 2.74s/it]
4%|███▏ | 13/331 [00:32<14:10, 2.67s/it]
4%|███▍ | 14/331 [00:34<13:58, 2.65s/it]
5%|███▋ | 15/331 [00:38<15:17, 2.90s/it]
5%|███▉ | 16/331 [00:41<16:16, 3.10s/it]
5%|████▏ | 17/331 [00:44<16:24, 3.14s/it]
5%|████▍ | 18/331 [00:47<15:01, 2.88s/it]
6%|████▋ | 19/331 [00:49<14:37, 2.81s/it]
6%|████▉ | 20/331 [00:52<13:40, 2.64s/it]
6%|█████▏ | 21/331 [00:55<14:19, 2.77s/it]
7%|█████▍ | 22/331 [00:58<15:24, 2.99s/it]
7%|█████▋ | 23/331 [01:02<16:40, 3.25s/it]
7%|█████▉ | 24/331 [01:06<17:41, 3.46s/it]
8%|██████▏ | 25/331 [01:09<16:59, 3.33s/it]
8%|██████▍ | 26/331 [01:12<15:51, 3.12s/it]
8%|██████▋ | 27/331 [01:15<15:54, 3.14s/it]
8%|██████▉ | 28/331 [01:18<15:20, 3.04s/it]
9%|███████▏ | 29/331 [01:20<15:04, 2.99s/it]
9%|███████▍ | 30/331 [01:23<14:23, 2.87s/it]
9%|███████▋ | 31/331 [01:26<13:44, 2.75s/it]
10%|███████▉ | 32/331 [01:28<13:25, 2.69s/it]
10%|████████▏ | 33/331 [01:31<13:35, 2.74s/it]
10%|████████▍ | 34/331 [01:34<13:31, 2.73s/it]
11%|████████▋ | 35/331 [01:36<13:35, 2.76s/it]
11%|████████▉ | 36/331 [01:40<14:10, 2.88s/it]
11%|█████████▏ | 37/331 [01:43<14:53, 3.04s/it]
11%|█████████▍ | 38/331 [01:46<15:06, 3.09s/it]
12%|█████████▋ | 39/331 [01:49<15:13, 3.13s/it]
12%|█████████▉ | 40/331 [01:52<14:04, 2.90s/it]
12%|██████████▏ | 41/331 [01:54<13:30, 2.80s/it]
13%|██████████▍ | 42/331 [01:58<14:16, 2.96s/it]
13%|██████████▋ | 43/331 [02:01<15:03, 3.14s/it]
13%|██████████▉ | 44/331 [02:05<15:21, 3.21s/it]
14%|███████████▏ | 45/331 [02:07<14:32, 3.05s/it]
14%|███████████▍ | 46/331 [02:10<13:35, 2.86s/it]
14%|███████████▋ | 47/331 [02:12<12:51, 2.72s/it]
15%|███████████▉ | 48/331 [02:15<13:13, 2.81s/it]
15%|████████████▏ | 49/331 [02:18<13:45, 2.93s/it]
15%|████████████▍ | 50/331 [02:21<13:39, 2.92s/it]
15%|████████████▋ | 51/331 [02:24<14:03, 3.01s/it]
16%|████████████▉ | 52/331 [02:27<13:33, 2.92s/it]
16%|█████████████▏ | 53/331 [02:30<13:35, 2.93s/it]
16%|█████████████▍ | 54/331 [02:33<13:00, 2.82s/it]
17%|█████████████▋ | 55/331 [02:36<13:59, 3.04s/it]
17%|█████████████▊ | 56/331 [02:39<13:41, 2.99s/it]
17%|██████████████ | 57/331 [02:42<13:05, 2.87s/it]
18%|██████████████▎ | 58/331 [02:45<13:27, 2.96s/it]
18%|██████████████▌ | 59/331 [02:47<12:36, 2.78s/it]
18%|██████████████▊ | 60/331 [02:50<12:18, 2.72s/it]
18%|███████████████ | 61/331 [02:53<12:40, 2.82s/it]
19%|███████████████▎ | 62/331 [02:56<12:29, 2.79s/it]
19%|███████████████▌ | 63/331 [02:59<13:45, 3.08s/it]
19%|███████████████▊ | 64/331 [03:02<13:18, 2.99s/it]
20%|████████████████ | 65/331 [03:05<13:09, 2.97s/it]
20%|████████████████▎ | 66/331 [03:09<14:17, 3.24s/it]
20%|████████████████▌ | 67/331 [03:13<14:52, 3.38s/it]
21%|████████████████▊ | 68/331 [03:16<14:56, 3.41s/it]
21%|█████████████████ | 69/331 [03:19<14:31, 3.33s/it]
21%|█████████████████▎ | 70/331 [03:22<14:13, 3.27s/it]
21%|█████████████████▌ | 71/331 [03:26<14:22, 3.32s/it]
22%|█████████████████▊ | 72/331 [03:29<14:23, 3.33s/it]
22%|██████████████████ | 73/331 [03:32<13:59, 3.25s/it]
22%|██████████████████▎ | 74/331 [03:35<13:36, 3.18s/it]
23%|██████████████████▌ | 75/331 [03:39<13:47, 3.23s/it]
23%|██████████████████▊ | 76/331 [03:41<13:02, 3.07s/it]
23%|███████████████████ | 77/331 [03:44<12:47, 3.02s/it]
24%|███████████████████▎ | 78/331 [03:47<12:21, 2.93s/it]
24%|███████████████████▌ | 79/331 [03:50<12:00, 2.86s/it]
24%|███████████████████▊ | 80/331 [03:52<11:48, 2.82s/it]
24%|████████████████████ | 81/331 [03:56<12:12, 2.93s/it]
25%|████████████████████▎ | 82/331 [03:58<11:59, 2.89s/it]
25%|████████████████████▌ | 83/331 [04:02<12:26, 3.01s/it]
25%|████████████████████▊ | 84/331 [04:05<13:09, 3.20s/it]
26%|█████████████████████ | 85/331 [04:08<12:17, 3.00s/it]
26%|█████████████████████▎ | 86/331 [04:11<12:49, 3.14s/it]
26%|█████████████████████▌ | 87/331 [04:14<12:30, 3.07s/it]
27%|█████████████████████▊ | 88/331 [04:17<12:12, 3.01s/it]
27%|██████████████████████ | 89/331 [04:19<11:20, 2.81s/it]
27%|██████████████████████▎ | 90/331 [04:22<10:54, 2.72s/it]
27%|██████████████████████▌ | 91/331 [04:25<11:20, 2.84s/it]
28%|██████████████████████▊ | 92/331 [04:27<10:34, 2.65s/it]
28%|███████████████████████ | 93/331 [04:30<10:41, 2.70s/it]
28%|███████████████████████▎ | 94/331 [04:33<11:02, 2.79s/it]
29%|███████████████████████▌ | 95/331 [04:36<11:10, 2.84s/it]
29%|███████████████████████▊ | 96/331 [04:39<11:14, 2.87s/it]
29%|████████████████████████ | 97/331 [04:41<10:42, 2.75s/it]
30%|████████████████████████▎ | 98/331 [04:45<11:05, 2.86s/it]
30%|████████████████████████▌ | 99/331 [04:47<10:58, 2.84s/it]
30%|████████████████████████▍ | 100/331 [04:50<10:34, 2.75s/it]
31%|████████████████████████▋ | 101/331 [04:53<10:28, 2.73s/it]
31%|████████████████████████▉ | 102/331 [04:56<11:18, 2.96s/it]
31%|█████████████████████████▏ | 103/331 [04:59<10:45, 2.83s/it]
31%|█████████████████████████▍ | 104/331 [05:01<10:38, 2.81s/it]
32%|█████████████████████████▋ | 105/331 [05:04<10:38, 2.83s/it]
32%|█████████████████████████▉ | 106/331 [05:07<10:39, 2.84s/it]
32%|██████████████████████████▏ | 107/331 [05:09<09:57, 2.67s/it]
33%|██████████████████████████▍ | 108/331 [05:12<09:47, 2.63s/it]
33%|██████████████████████████▋ | 109/331 [05:14<09:40, 2.62s/it]
33%|██████████████████████████▉ | 110/331 [05:17<10:05, 2.74s/it]
34%|███████████████████████████▏ | 111/331 [05:20<10:16, 2.80s/it]
34%|███████████████████████████▍ | 112/331 [05:23<10:15, 2.81s/it]
34%|███████████████████████████▋ | 113/331 [05:26<09:52, 2.72s/it]
34%|███████████████████████████▉ | 114/331 [05:29<09:58, 2.76s/it]
35%|████████████████████████████▏ | 115/331 [05:31<10:00, 2.78s/it]
35%|████████████████████████████▍ | 116/331 [05:35<10:20, 2.89s/it]
35%|████████████████████████████▋ | 117/331 [05:37<10:13, 2.87s/it]
36%|████████████████████████████▉ | 118/331 [05:40<09:59, 2.81s/it]
36%|█████████████████████████████ | 119/331 [05:43<09:56, 2.81s/it]
36%|█████████████████████████████▎ | 120/331 [05:46<09:51, 2.80s/it]
37%|█████████████████████████████▌ | 121/331 [05:49<10:22, 2.96s/it]
37%|█████████████████████████████▊ | 122/331 [05:52<10:11, 2.92s/it]
37%|██████████████████████████████ | 123/331 [05:55<10:49, 3.12s/it]
37%|██████████████████████████████▎ | 124/331 [05:58<10:40, 3.09s/it]
38%|██████████████████████████████▌ | 125/331 [06:02<11:07, 3.24s/it]
38%|██████████████████████████████▊ | 126/331 [06:05<11:12, 3.28s/it]
38%|███████████████████████████████ | 127/331 [06:09<11:36, 3.41s/it]
39%|███████████████████████████████▎ | 128/331 [06:13<11:32, 3.41s/it]
39%|███████████████████████████████▌ | 129/331 [06:16<11:15, 3.34s/it]
39%|███████████████████████████████▊ | 130/331 [06:19<11:27, 3.42s/it]
40%|████████████████████████████████ | 131/331 [06:23<11:39, 3.50s/it]
40%|████████████████████████████████▎ | 132/331 [06:26<10:58, 3.31s/it]
40%|████████████████████████████████▌ | 133/331 [06:29<10:14, 3.10s/it]
40%|████████████████████████████████▊ | 134/331 [06:31<09:55, 3.02s/it]
41%|█████████████████████████████████ | 135/331 [06:35<10:04, 3.08s/it]
41%|█████████████████████████████████▎ | 136/331 [06:38<10:20, 3.18s/it]
41%|█████████████████████████████████▌ | 137/331 [06:42<10:40, 3.30s/it]
42%|██████████████████████████████████ | 139/331 [06:47<09:36, 3.00s/it]
42%|██████████████████████████████████▎ | 140/331 [06:51<10:14, 3.22s/it]
43%|██████████████████████████████████▌ | 141/331 [06:54<09:50, 3.11s/it]
43%|██████████████████████████████████▋ | 142/331 [06:57<09:36, 3.05s/it]
43%|██████████████████████████████████▉ | 143/331 [07:00<09:58, 3.19s/it]
44%|███████████████████████████████████▏ | 144/331 [07:03<09:37, 3.09s/it]
44%|███████████████████████████████████▍ | 145/331 [07:06<09:28, 3.06s/it]
44%|███████████████████████████████████▋ | 146/331 [07:10<09:56, 3.23s/it]
44%|███████████████████████████████████▉ | 147/331 [07:13<09:37, 3.14s/it]
45%|████████████████████████████████████▏ | 148/331 [07:15<09:03, 2.97s/it]
45%|████████████████████████████████████▍ | 149/331 [07:18<08:32, 2.82s/it]
45%|████████████████████████████████████▋ | 150/331 [07:21<08:54, 2.95s/it]
46%|████████████████████████████████████▉ | 151/331 [07:24<08:45, 2.92s/it]
46%|█████████████████████████████████████▏ | 152/331 [07:26<08:21, 2.80s/it]
46%|█████████████████████████████████████▍ | 153/331 [07:29<08:13, 2.77s/it]
47%|█████████████████████████████████████▋ | 154/331 [07:32<08:33, 2.90s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not
be computed 47%|█████████████████████████████████████▉ | 155/331 [07:36<08:53, 3.03s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 156/331 [07:39<09:09, 3.14s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▍ | 157/331 [07:43<09:28, 3.27s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▋ | 158/331 [07:46<09:31, 3.31s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▉ | 159/331 [07:49<09:31, 3.32s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████▏ | 160/331 [07:52<09:01, 3.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▍ | 161/331 [07:55<08:45, 3.09s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▋ | 162/331 [07:59<09:09, 3.25s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▉ | 163/331 [08:02<09:19, 3.33s/it]g-point 
operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▏ | 164/331 [08:05<08:47, 3.16s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▍ | 165/331 [08:08<08:33, 3.10s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▌ | 166/331 [08:11<08:14, 3.00s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 50%|████████████████████████████████████████▊ | 167/331 [08:14<08:24, 3.07s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████ | 168/331 [08:16<07:55, 2.92s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████▎ | 169/331 [08:20<08:09, 3.02s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 51%|█████████████████████████████████████████▌ | 170/331 [08:22<07:40, 2.86s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|█████████████████████████████████████████▊ | 171/331 [08:25<07:32, 2.83s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████ | 172/331 [08:27<07:14, 2.73s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 52%|██████████████████████████████████████████▎ | 173/331 [08:31<07:30, 2.85s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▌ | 174/331 [08:33<07:16, 2.78s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|██████████████████████████████████████████▊ | 175/331 [08:36<07:24, 2.85s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████ | 176/331 [08:39<07:05, 2.75s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 53%|███████████████████████████████████████████▎ | 177/331 [08:42<07:29, 2.92s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 178/331 [08:46<07:56, 3.11s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|███████████████████████████████████████████▌ | 178/331 [08:46<07:56, 3.11s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
54%|███████████████████████████████████████████▌ | 178/331 [08:46<07:56, 3.11s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 54%|████████████████████████████████████████████ | 180/331 [08:52<08:03, 3.20s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▎ | 181/331 [08:55<07:53, 3.16s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▌ | 182/331 [08:58<07:15, 2.92s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 55%|████████████████████████████████████████████▊ | 183/331 [09:00<06:45, 2.74s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████ | 184/331 [09:02<06:24, 2.62s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▎ | 185/331 [09:04<05:59, 2.46s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▌ | 186/331 [09:07<06:07, 2.53s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 56%|█████████████████████████████████████████████▊ | 
187/331 [09:10<06:35, 2.75s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████ | 188/331 [09:13<06:35, 2.76s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▎ | 189/331 [09:16<06:19, 2.67s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████▍ | 190/331 [09:18<06:02, 2.57s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▋ | 191/331 [09:20<05:55, 2.54s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|██████████████████████████████████████████████▉ | 192/331 [09:23<05:49, 2.51s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 58%|███████████████████████████████████████████████▏ | 193/331 [09:26<06:17, 2.74s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▍ | 194/331 [09:28<05:59, 2.62s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▋ | 195/331 [09:31<05:52, 
2.59s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 59%|███████████████████████████████████████████████▉ | 196/331 [09:34<05:55, 2.63s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▏ | 197/331 [09:37<06:11, 2.77s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▍ | 198/331 [09:39<05:51, 2.65s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▋ | 199/331 [09:42<05:53, 2.68s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 60%|████████████████████████████████████████████████▉ | 200/331 [09:44<05:36, 2.57s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▏ | 201/331 [09:47<05:31, 2.55s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▍ | 202/331 [09:50<05:37, 2.62s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 61%|█████████████████████████████████████████████████▋ | 203/331 [09:52<05:40, 2.66s/it]g-point 
operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|█████████████████████████████████████████████████▉ | 204/331 [09:56<05:59, 2.83s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▏ | 205/331 [09:59<06:05, 2.90s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 206/331 [10:01<06:03, 2.91s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▋ | 207/331 [10:05<06:21, 3.07s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▉ | 208/331 [10:08<06:25, 3.13s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▏ | 209/331 [10:11<05:52, 2.89s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▍ | 210/331 [10:13<05:26, 2.70s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▋ | 211/331 [10:16<05:31, 2.76s/it]g-point 
operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▉ | 212/331 [10:18<05:18, 2.68s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 213/331 [10:21<05:18, 2.70s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 214/331 [10:23<04:58, 2.55s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▌ | 215/331 [10:25<04:43, 2.44s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▊ | 216/331 [10:29<05:10, 2.70s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████ | 217/331 [10:31<05:10, 2.72s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▎ | 218/331 [10:35<05:24, 2.87s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▌ | 219/331 [10:37<05:18, 
2.84s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▊ | 220/331 [10:40<05:04, 2.74s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 221/331 [10:43<05:08, 2.81s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▎ | 222/331 [10:45<04:55, 2.71s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 223/331 [10:48<04:59, 2.78s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|██████████████████████████████████████████████████████▊ | 224/331 [10:51<04:59, 2.80s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████ | 225/331 [10:54<04:54, 2.78s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 68%|███████████████████████████████████████████████████████▎ | 226/331 [10:57<05:04, 2.90s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▌ 
| 227/331 [11:00<04:57, 2.86s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|███████████████████████████████████████████████████████▊ | 228/331 [11:03<04:51, 2.83s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████ | 229/331 [11:05<04:48, 2.83s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 69%|████████████████████████████████████████████████████████▎ | 230/331 [11:08<04:40, 2.78s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▌ | 231/331 [11:11<04:44, 2.85s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|████████████████████████████████████████████████████████▊ | 232/331 [11:14<04:41, 2.84s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 70%|█████████████████████████████████████████████████████████ | 233/331 [11:17<04:46, 2.93s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▎ | 234/331 [11:20<04:32, 2.80s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
71%|█████████████████████████████████████████████████████████▌ | 235/331 [11:22<04:20, 2.71s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 71%|█████████████████████████████████████████████████████████▊ | 236/331 [11:26<04:48, 3.04s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|█████████████████████████████████████████████████████████▉ | 237/331 [11:29<04:57, 3.16s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▏ | 238/331 [11:32<04:52, 3.15s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 72%|██████████████████████████████████████████████████████████▍ | 239/331 [11:36<04:52, 3.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▋ | 240/331 [11:39<04:54, 3.24s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|██████████████████████████████████████████████████████████▉ | 241/331 [11:43<04:57, 3.31s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▏ | 242/331 [11:46<04:55, 3.32s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 73%|███████████████████████████████████████████████████████████▍ | 243/331 [11:49<04:53, 3.34s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 244/331 [11:53<04:58, 3.43s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 244/331 [11:53<04:58, 3.43s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|███████████████████████████████████████████████████████████▋ | 244/331 [11:53<04:58, 3.43s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|████████████████████████████████████████████████████████████▏ | 246/331 [12:00<04:59, 3.52s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▍ | 247/331 [12:03<04:45, 3.40s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▋ | 248/331 [12:06<04:23, 3.17s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|████████████████████████████████████████████████████████████▉ | 249/331 [12:08<04:03, 2.97s/it]g-point operations will not be 
computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▏ | 250/331 [12:11<03:51, 2.86s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▍ | 251/331 [12:14<03:52, 2.91s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▋ | 252/331 [12:16<03:38, 2.77s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|█████████████████████████████████████████████████████████████▉ | 253/331 [12:20<03:46, 2.91s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|██████████████████████████████████████████████████████████████▏ | 254/331 [12:22<03:39, 2.86s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|██████████████████████████████████████████████████████████████▍ | 255/331 [12:26<03:46, 2.98s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|██████████████████████████████████████████████████████████████▋ | 256/331 [12:28<03:36, 2.89s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
78%|██████████████████████████████████████████████████████████████▉ | 257/331 [12:31<03:42, 3.00s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|███████████████████████████████████████████████████████████████▏ | 258/331 [12:34<03:27, 2.85s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|███████████████████████████████████████████████████████████████▍ | 259/331 [12:37<03:22, 2.81s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|███████████████████████████████████████████████████████████████▋ | 260/331 [12:40<03:25, 2.90s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|███████████████████████████████████████████████████████████████▊ | 261/331 [12:42<03:11, 2.74s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|████████████████████████████████████████████████████████████████ | 262/331 [12:45<03:10, 2.76s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 79%|████████████████████████████████████████████████████████████████▎ | 263/331 [12:48<03:15, 2.88s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|████████████████████████████████████████████████████████████████▌ | 264/331 [12:51<03:06, 2.78s/it]g-point operations will not be computed-28 11:28:13,716 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|████████████████████████████████████████████████████████████████▊ | 265/331 [12:53<02:59, 2.71s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 80%|█████████████████████████████████████████████████████████████████ | 266/331 [12:56<02:54, 2.69s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|█████████████████████████████████████████████████████████████████▎ | 267/331 [12:59<03:04, 2.89s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|█████████████████████████████████████████████████████████████████▌ | 268/331 [13:02<03:02, 2.90s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 81%|█████████████████████████████████████████████████████████████████▊ | 269/331 [13:06<03:10, 3.07s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|██████████████████████████████████████████████████████████████████ | 270/331 [13:09<03:05, 3.05s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 82%|██████████████████████████████████████████████████████████████████▎ | 271/331 [13:12<03:09, 3.15s/it]g-point operations will not be computed-28 11:28:13,716 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
82%|██████████████████████████████████████████████████████████████████▌ | 272/331 [13:15<03:00, 3.06s/it]
82%|██████████████████████████████████████████████████████████████████▊ | 273/331 [13:18<02:58, 3.08s/it]
83%|███████████████████████████████████████████████████████████████████ | 274/331 [13:21<03:02, 3.20s/it]
83%|███████████████████████████████████████████████████████████████████▎ | 275/331 [13:25<03:01, 3.24s/it]
83%|███████████████████████████████████████████████████████████████████▌ | 276/331 [13:27<02:48, 3.06s/it]
84%|███████████████████████████████████████████████████████████████████▊ | 277/331 [13:30<02:41, 2.99s/it]
84%|████████████████████████████████████████████████████████████████████ | 278/331 [13:33<02:36, 2.95s/it]
84%|████████████████████████████████████████████████████████████████████▎ | 279/331 [13:37<02:44, 3.16s/it]
85%|████████████████████████████████████████████████████████████████████▌ | 280/331 [13:40<02:37, 3.09s/it]
85%|████████████████████████████████████████████████████████████████████▊ | 281/331 [13:43<02:38, 3.18s/it]
85%|█████████████████████████████████████████████████████████████████████ | 282/331 [13:46<02:36, 3.19s/it]
85%|█████████████████████████████████████████████████████████████████████▎ | 283/331 [13:50<02:36, 3.26s/it]
86%|█████████████████████████████████████████████████████████████████████▍ | 284/331 [13:53<02:37, 3.35s/it]
86%|█████████████████████████████████████████████████████████████████████▋ | 285/331 [13:57<02:37, 3.41s/it]
86%|█████████████████████████████████████████████████████████████████████▉ | 286/331 [14:00<02:35, 3.45s/it]
87%|██████████████████████████████████████████████████████████████████████▏ | 287/331 [14:04<02:36, 3.57s/it]
87%|██████████████████████████████████████████████████████████████████████▍ | 288/331 [14:08<02:32, 3.54s/it]
87%|██████████████████████████████████████████████████████████████████████▋ | 289/331 [14:11<02:20, 3.35s/it]
88%|██████████████████████████████████████████████████████████████████████▉ | 290/331 [14:13<02:09, 3.16s/it]
88%|███████████████████████████████████████████████████████████████████████▏ | 291/331 [14:16<01:59, 2.98s/it]
88%|███████████████████████████████████████████████████████████████████████▍ | 292/331 [14:19<01:52, 2.89s/it]
89%|███████████████████████████████████████████████████████████████████████▋ | 293/331 [14:21<01:49, 2.87s/it]
89%|███████████████████████████████████████████████████████████████████████▉ | 294/331 [14:24<01:40, 2.72s/it]
89%|████████████████████████████████████████████████████████████████████████▏ | 295/331 [14:26<01:36, 2.69s/it]
89%|████████████████████████████████████████████████████████████████████████▍ | 296/331 [14:29<01:32, 2.63s/it]
90%|████████████████████████████████████████████████████████████████████████▋ | 297/331 [14:32<01:39, 2.91s/it]
90%|████████████████████████████████████████████████████████████████████████▉ | 298/331 [14:36<01:44, 3.15s/it]
90%|█████████████████████████████████████████████████████████████████████████▏ | 299/331 [14:39<01:37, 3.04s/it]
91%|█████████████████████████████████████████████████████████████████████████▍ | 300/331 [14:42<01:34, 3.05s/it]
91%|█████████████████████████████████████████████████████████████████████████▋ | 301/331 [14:45<01:29, 2.98s/it]
91%|█████████████████████████████████████████████████████████████████████████▉ | 302/331 [14:48<01:24, 2.90s/it]
92%|██████████████████████████████████████████████████████████████████████████▏ | 303/331 [14:50<01:19, 2.82s/it]
92%|██████████████████████████████████████████████████████████████████████████▍ | 304/331 [14:53<01:18, 2.90s/it]
92%|██████████████████████████████████████████████████████████████████████████▋ | 305/331 [14:57<01:18, 3.01s/it]
92%|██████████████████████████████████████████████████████████████████████████▉ | 306/331 [15:00<01:19, 3.17s/it]
93%|███████████████████████████████████████████████████████████████████████████▏ | 307/331 [15:04<01:19, 3.31s/it]
93%|███████████████████████████████████████████████████████████████████████████▎ | 308/331 [15:08<01:20, 3.51s/it]
93%|███████████████████████████████████████████████████████████████████████████▌ | 309/331 [15:11<01:18, 3.56s/it]
94%|███████████████████████████████████████████████████████████████████████████▊ | 310/331 [15:14<01:09, 3.33s/it]
94%|████████████████████████████████████████████████████████████████████████████ | 311/331 [15:17<01:06, 3.32s/it]
94%|████████████████████████████████████████████████████████████████████████████▎ | 312/331 [15:20<00:58, 3.10s/it]
95%|████████████████████████████████████████████████████████████████████████████▌ | 313/331 [15:23<00:54, 3.02s/it]
95%|████████████████████████████████████████████████████████████████████████████▊ | 314/331 [15:26<00:52, 3.06s/it]
95%|█████████████████████████████████████████████████████████████████████████████ | 315/331 [15:29<00:50, 3.17s/it]
95%|█████████████████████████████████████████████████████████████████████████████▎ | 316/331 [15:33<00:47, 3.18s/it]
96%|█████████████████████████████████████████████████████████████████████████████▌ | 317/331 [15:36<00:46, 3.34s/it]
96%|█████████████████████████████████████████████████████████████████████████████▊ | 318/331 [15:39<00:40, 3.13s/it]
96%|██████████████████████████████████████████████████████████████████████████████ | 319/331 [15:42<00:35, 2.98s/it]
97%|██████████████████████████████████████████████████████████████████████████████▎ | 320/331 [15:45<00:33, 3.02s/it]
97%|██████████████████████████████████████████████████████████████████████████████▌ | 321/331 [15:48<00:29, 2.98s/it]
97%|██████████████████████████████████████████████████████████████████████████████▊ | 322/331 [15:51<00:28, 3.15s/it]
98%|███████████████████████████████████████████████████████████████████████████████ | 323/331 [15:54<00:24, 3.05s/it]
98%|███████████████████████████████████████████████████████████████████████████████▎ | 324/331 [15:57<00:22, 3.17s/it]
98%|███████████████████████████████████████████████████████████████████████████████▌ | 325/331 [16:01<00:19, 3.21s/it]
98%|███████████████████████████████████████████████████████████████████████████████▊ | 326/331 [16:04<00:16, 3.25s/it]
99%|████████████████████████████████████████████████████████████████████████████████ | 327/331 [16:07<00:12, 3.25s/it]
99%|████████████████████████████████████████████████████████████████████████████████▎| 328/331 [16:11<00:09, 3.28s/it]
99%|████████████████████████████████████████████████████████████████████████████████▌| 329/331 [16:14<00:06, 3.19s/it]
100%|████████████████████████████████████████████████████████████████████████████████▊| 330/331 [16:17<00:03, 3.38s/it]
[INFO|configuration_utils.py:438] 2022-02-28 11:44:57,222 >> Configuration saved in ./checkpoint-1500/config.json
02/28/2022 11:44:57 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
[INFO|feature_extraction_utils.py:324] 2022-02-28 11:45:02,510 >> Configuration saved in ./checkpoint-1500/preprocessor_config.json
84%|█████████████████████████████████████████████████████████████▍ | 1501/1784 [2:10:02<26:22:32, 335.52s/it]
{'loss': 4.2198, 'learning_rate': 2.2352024922118382e-06, 'epoch': 0.84}
84%|█████████████████████████████████████████████████████████████▍ | 1502/1784 [2:10:05<18:29:21, 236.03s/it]
{'loss': 4.1961, 'learning_rate': 2.2274143302180688e-06, 'epoch': 0.84}
84%|█████████████████████████████████████████████████████████████▌ | 1503/1784 [2:10:09<12:59:12, 166.38s/it]
{'loss': 4.1441, 'learning_rate': 2.2196261682242994e-06, 'epoch': 0.84}
84%|██████████████████████████████████████████████████████████████▍ | 1504/1784 [2:10:13<9:08:51, 117.61s/it]
{'loss': 3.9946, 'learning_rate': 2.21183800623053e-06, 'epoch': 0.84}
84%|███████████████████████████████████████████████████████████████▎ | 1505/1784 [2:10:17<6:28:12, 83.49s/it]
{'loss': 4.241, 'learning_rate': 2.20404984423676e-06, 'epoch': 0.84}
84%|███████████████████████████████████████████████████████████████▎ | 1506/1784 [2:10:21<4:35:59, 59.57s/it]
{'loss': 4.1064, 'learning_rate': 2.1962616822429906e-06, 'epoch': 0.84}
84%|███████████████████████████████████████████████████████████████▎ | 1507/1784 [2:10:24<3:17:41, 42.82s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:47:35,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1023, 'learning_rate': 2.1806853582554518e-06, 'epoch': 0.85}
85%|███████████████████████████████████████████████████████████████▍ | 1509/1784 [2:10:32<1:44:58, 22.91s/it]
{'loss': 4.1471, 'learning_rate': 2.1728971962616823e-06, 'epoch': 0.85}
85%|███████████████████████████████████████████████████████████████▍ | 1510/1784 [2:10:36<1:18:29, 17.19s/it]
{'loss': 4.1109, 'learning_rate': 2.165109034267913e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▏ | 1511/1784 [2:10:40<59:54, 13.17s/it]
{'loss': 3.933, 'learning_rate': 2.1573208722741435e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▎ | 1512/1784 [2:10:43<46:37, 10.29s/it]
{'loss': 4.2576, 'learning_rate': 2.149532710280374e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▎ | 1513/1784 [2:10:47<37:20, 8.27s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:47:57,665 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
85%|█████████████████████████████████████████████████████████████████▍ | 1515/1784 [2:10:54<26:19, 5.87s/it]
85%|█████████████████████████████████████████████████████████████████▍ | 1516/1784 [2:10:57<23:06, 5.17s/it]
85%|█████████████████████████████████████████████████████████████████▍ | 1517/1784 [2:11:01<20:54, 4.70s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:48:10,154 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
85%|█████████████████████████████████████████████████████████████████▌ | 1518/1784 [2:11:05<19:15, 4.34s/it]
{'loss': 4.2241, 'learning_rate': 2.102803738317757e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▌ | 1519/1784 [2:11:08<18:07, 4.11s/it]
{'loss': 4.3095, 'learning_rate': 2.0950155763239876e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▌ | 1520/1784 [2:11:12<17:28, 3.97s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:48:22,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0323, 'learning_rate': 2.0794392523364487e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▋ | 1522/1784 [2:11:19<16:04, 3.68s/it]
{'loss': 4.0204, 'learning_rate': 2.0716510903426793e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▋ | 1523/1784 [2:11:22<15:37, 3.59s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:48:32,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0068, 'learning_rate': 2.0560747663551404e-06, 'epoch': 0.85}
85%|█████████████████████████████████████████████████████████████████▊ | 1525/1784 [2:11:29<15:09, 3.51s/it]
{'loss': 4.1383, 'learning_rate': 2.048286604361371e-06, 'epoch': 0.85}
86%|█████████████████████████████████████████████████████████████████▊ | 1526/1784 [2:11:32<14:54, 3.47s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:48:42,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0699, 'learning_rate': 2.0327102803738317e-06, 'epoch': 0.86}
86%|█████████████████████████████████████████████████████████████████▉ | 1528/1784 [2:11:39<14:33, 3.41s/it]
{'loss': 4.1097, 'learning_rate': 2.0249221183800623e-06, 'epoch': 0.86}
86%|█████████████████████████████████████████████████████████████████▉ | 1529/1784 [2:11:42<14:30, 3.41s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:48:52,957 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9695, 'learning_rate': 2.0093457943925234e-06, 'epoch': 0.86}
86%|██████████████████████████████████████████████████████████████████ | 1531/1784 [2:11:49<14:06, 3.35s/it]
{'loss': 3.9809, 'learning_rate': 2.001557632398754e-06, 'epoch': 0.86}
86%|██████████████████████████████████████████████████████████████████ | 1532/1784 [2:11:52<13:51, 3.30s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:01,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
86%|██████████████████████████████████████████████████████████████████▏ | 
1533/1784 [2:11:55<13:36, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:01,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|██████████████████████████████████████████████████████████████████▏ | 1533/1784 [2:11:55<13:36, 3.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:01,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9909, 'learning_rate': 1.985981308411215e-06, 'epoch': 0.86} 86%|██████████████████████████████████████████████████████████████████▏ | 1534/1784 [2:11:58<13:20, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|██████████████████████████████████████████████████████████████████▏ | 1534/1784 [2:11:58<13:20, 3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|██████████████████████████████████████████████████████████████████▎ | 1535/1784 [2:12:01<13:04, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|██████████████████████████████████████████████████████████████████▎ | 1535/1784 [2:12:01<13:04, 3.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:07,188 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0765, 'learning_rate': 1.9704049844236762e-06, 'epoch': 0.86} 86%|██████████████████████████████████████████████████████████████████▎ | 1536/1784 [2:12:04<12:49, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:13,158 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
86%|██████████████████████████████████████████████████████████████████▎ | 1537/1784 [2:12:07<12:33, 3.05s/it]
{'loss': 3.982, 'learning_rate': 1.9548286604361374e-06, 'epoch': 0.86}
86%|██████████████████████████████████████████████████████████████████▍ | 1538/1784 [2:12:10<12:19, 3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:18,949 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
86%|██████████████████████████████████████████████████████████████████▍ | 1539/1784 [2:12:13<12:03, 2.95s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:23,035 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1731, 'learning_rate': 1.9314641744548286e-06, 'epoch': 0.86}
86%|██████████████████████████████████████████████████████████████████▌ | 1541/1784 [2:12:18<11:21, 2.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:27,034 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
86%|██████████████████████████████████████████████████████████████████▌ | 1542/1784 [2:12:21<11:05, 2.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:49:29,597 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
86%|██████████████████████████████████████████████████████████████████▌ | 1543/1784 [2:12:23<10:42, 2.67s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:33,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:34,981 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:36,865 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:38,503 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3333, 'learning_rate': 1.8769470404984424e-06, 'epoch': 0.87}
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:41,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:43,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0159, 'learning_rate': 1.8535825545171341e-06, 'epoch': 0.87}
[WARNING|modeling_utils.py:388] 2022-02-28 11:49:47,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2975, 'learning_rate': 1.8457943925233645e-06, 'epoch': 0.87}
87%|██████████████████████████████████████████████████████████████████▉ | 1552/1784 [2:12:44<11:12, 2.90s/it]
{'loss': 4.1462, 'learning_rate': 1.838006230529595e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████ | 1553/1784 [2:12:48<12:13, 3.18s/it]
{'loss': 4.0861, 'learning_rate': 1.8302180685358256e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████ | 1554/1784 [2:12:52<12:49, 3.35s/it]
{'loss': 3.9723, 'learning_rate': 1.8224299065420562e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████ | 1555/1784 [2:12:55<13:09, 3.45s/it]
{'loss': 4.2419, 'learning_rate': 1.8146417445482867e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████▏ | 1556/1784 [2:12:59<13:23, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:08,120 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
87%|███████████████████████████████████████████████████████████████████▏ | 1557/1784 [2:13:03<13:25, 3.55s/it]
{'loss': 4.223, 'learning_rate': 1.7990654205607477e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████▏ | 1558/1784 [2:13:06<13:27, 3.57s/it]
{'loss': 4.2847, 'learning_rate': 1.7912772585669782e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████▎ | 1559/1784 [2:13:10<13:23, 3.57s/it]
{'loss': 3.9438, 'learning_rate': 1.7834890965732088e-06, 'epoch': 0.87}
87%|███████████████████████████████████████████████████████████████████▎ | 1560/1784 [2:13:13<13:20, 3.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:50:22,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|███████████████████████████████████████████████████████████████████▍ | 1561/1784 [2:13:17<13:17, 3.58s/it]
88%|███████████████████████████████████████████████████████████████████▍ | 1562/1784 [2:13:20<13:10, 3.56s/it]
88%|███████████████████████████████████████████████████████████████████▍ | 1563/1784 [2:13:24<13:04, 3.55s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:50:34,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0038, 'learning_rate': 1.7445482866043614e-06, 'epoch': 0.88}
88%|███████████████████████████████████████████████████████████████████▌ | 1565/1784 [2:13:31<12:54, 3.54s/it]
88%|███████████████████████████████████████████████████████████████████▌ | 1566/1784 [2:13:34<12:46, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:50:45,246 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.85, 'learning_rate': 1.7211838006230531e-06, 'epoch': 0.88}
88%|███████████████████████████████████████████████████████████████████▋ | 1568/1784 [2:13:41<12:34, 3.49s/it]
88%|███████████████████████████████████████████████████████████████████▋ | 1569/1784 [2:13:45<12:26, 3.47s/it]
88%|███████████████████████████████████████████████████████████████████▊ | 1570/1784 [2:13:48<12:26, 3.49s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:50:59,072 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|███████████████████████████████████████████████████████████████████▊ | 1572/1784 [2:13:55<12:10, 3.44s/it]
88%|███████████████████████████████████████████████████████████████████▉ | 1573/1784 [2:13:59<12:03, 3.43s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:09,220 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9676, 'learning_rate': 1.6666666666666667e-06, 'epoch': 0.88}
88%|███████████████████████████████████████████████████████████████████▉ | 1575/1784 [2:14:05<11:46, 3.38s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:15,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
88%|████████████████████████████████████████████████████████████████████ | 1577/1784 [2:14:12<11:26, 3.32s/it]
{'loss': 3.9374, 'learning_rate': 1.6433021806853584e-06, 'epoch': 0.88}
88%|████████████████████████████████████████████████████████████████████ | 1578/1784 [2:14:15<11:19, 3.30s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:25,512 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.136, 'learning_rate': 1.6277258566978193e-06, 'epoch': 0.89}
89%|████████████████████████████████████████████████████████████████████▏ | 1580/1784 [2:14:21<10:59, 3.23s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:31,827 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2275, 'learning_rate': 1.6121495327102804e-06, 'epoch': 0.89}
89%|████████████████████████████████████████████████████████████████████▎ | 1582/1784 [2:14:28<10:43, 3.18s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:38,026 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1318, 'learning_rate': 1.5965732087227416e-06, 'epoch': 0.89}
89%|████████████████████████████████████████████████████████████████████▎ | 1584/1784 [2:14:34<10:22, 3.11s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:44,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0312, 'learning_rate': 1.5809968847352025e-06, 'epoch': 0.89}
89%|████████████████████████████████████████████████████████████████████▍ | 1586/1784 [2:14:40<10:01, 3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:48,490 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
89%|████████████████████████████████████████████████████████████████████▍ | 1587/1784 [2:14:43<09:48, 2.99s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:51:52,626 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
89%|████████████████████████████████████████████████████████████████████▌ | 1589/1784 [2:14:48<09:18, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:51:56,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
89%|████████████████████████████████████████████████████████████████████▋ | 1590/1784 [2:14:51<09:04, 2.80s/it]
{'loss': 3.9825, 'learning_rate': 1.542056074766355e-06, 'epoch': 0.89}
[WARNING|modeling_utils.py:388] 2022-02-28 11:52:00,632 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:52:03,107 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0182, 'learning_rate': 1.5264797507788162e-06, 'epoch': 0.89}
89%|████████████████████████████████████████████████████████████████████▊ | 1593/1784 [2:14:58<08:09, 2.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:06,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
89%|████████████████████████████████████████████████████████████████████▊ | 1594/1784 [2:15:00<07:45, 2.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:08,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
89%|████████████████████████████████████████████████████████████████████▊ | 1595/1784 [2:15:02<07:14, 2.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:10,541 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
89%|████████████████████████████████████████████████████████████████████▉ | 1596/1784 [2:15:04<06:42, 2.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:12,210 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2711, 'learning_rate': 1.4875389408099689e-06, 'epoch': 0.9}
90%|████████████████████████████████████████████████████████████████████▉ | 1598/1784 [2:15:07<05:38, 1.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:13,746 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
90%|█████████████████████████████████████████████████████████████████████ | 1599/1784 [2:15:08<05:09, 1.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:16,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
90%|█████████████████████████████████████████████████████████████████████ | 1600/1784 [2:15:10<05:10, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:19,690 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
90%|█████████████████████████████████████████████████████████████████████ | 1601/1784 [2:15:14<07:21, 2.41s/it]
90%|█████████████████████████████████████████████████████████████████████▏ | 1602/1784 [2:15:18<08:31, 2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:52:27,204 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
90%|█████████████████████████████████████████████████████████████████████▏ | 1603/1784 [2:15:22<09:15, 3.07s/it]
90%|█████████████████████████████████████████████████████████████████████▏ | 1604/1784 [2:15:25<09:46, 3.26s/it]
90%|█████████████████████████████████████████████████████████████████████▎ | 1605/1784 [2:15:29<10:02, 3.36s/it]
90%|█████████████████████████████████████████████████████████████████████▎ | 1606/1784 [2:15:33<10:13, 3.45s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:52:43,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0761, 'learning_rate': 1.4096573208722741e-06, 'epoch': 0.9}
90%|█████████████████████████████████████████████████████████████████████▍ | 1608/1784 [2:15:40<10:16, 3.50s/it]
90%|█████████████████████████████████████████████████████████████████████▍ | 1609/1784 [2:15:43<10:13, 3.51s/it]
90%|█████████████████████████████████████████████████████████████████████▍ | 1610/1784 [2:15:47<10:11, 3.51s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:52:57,602 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
90%|█████████████████████████████████████████████████████████████████████▌ | 1612/1784 [2:15:54<10:02, 3.50s/it]
90%|█████████████████████████████████████████████████████████████████████▌ | 1613/1784 [2:15:57<10:00, 3.51s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:53:08,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9159, 'learning_rate': 1.3551401869158879e-06, 'epoch': 0.9}
91%|█████████████████████████████████████████████████████████████████████▋ | 1615/1784 [2:16:04<09:45, 3.46s/it]
91%|█████████████████████████████████████████████████████████████████████▋ | 1616/1784 [2:16:08<09:39, 3.45s/it]
91%|█████████████████████████████████████████████████████████████████████▊ | 1617/1784 [2:16:11<09:33, 3.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:20,028 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|█████████████████████████████████████████████████████████████████████▊ | 1618/1784 [2:16:14<09:26, 3.41s/it]
91%|█████████████████████████████████████████████████████████████████████▉ | 1619/1784 [2:16:18<09:20, 3.39s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:53:28,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1666, 'learning_rate': 1.308411214953271e-06, 'epoch': 0.91}
91%|█████████████████████████████████████████████████████████████████████▉ | 1621/1784 [2:16:24<09:03, 3.34s/it]
91%|██████████████████████████████████████████████████████████████████████ | 1622/1784 [2:16:28<09:00, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:36,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|██████████████████████████████████████████████████████████████████████ | 1623/1784 [2:16:31<08:57, 3.34s/it]
{'loss': 3.9593, 'learning_rate': 1.2850467289719625e-06, 'epoch': 0.91}
91%|██████████████████████████████████████████████████████████████████████ | 1624/1784 [2:16:34<08:51, 3.32s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:53:44,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1008, 'learning_rate': 1.2694704049844237e-06, 'epoch': 0.91}
91%|██████████████████████████████████████████████████████████████████████▏ | 1626/1784 [2:16:41<08:43, 3.32s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:53:51,411 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1622, 'learning_rate': 1.2538940809968846e-06, 'epoch': 0.91}
91%|██████████████████████████████████████████████████████████████████████▎ | 1628/1784 [2:16:47<08:28, 3.26s/it]
{'loss': 3.9802, 'learning_rate': 1.2461059190031154e-06, 'epoch': 0.91}
91%|██████████████████████████████████████████████████████████████████████▎ | 1629/1784 [2:16:50<08:17, 3.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:53:59,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|██████████████████████████████████████████████████████████████████████▎ | 1630/1784 [2:16:53<08:09, 3.18s/it]
{'loss': 3.9442, 'learning_rate': 1.2305295950155765e-06, 'epoch': 0.91}
91%|██████████████████████████████████████████████████████████████████████▍ | 1631/1784 [2:16:57<08:06, 3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:05,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
91%|██████████████████████████████████████████████████████████████████████▍ | 1632/1784 [2:17:00<07:59, 3.16s/it]
{'loss': 4.159, 'learning_rate': 1.2149532710280374e-06, 'epoch': 0.91}
92%|██████████████████████████████████████████████████████████████████████▍ | 1633/1784 [2:17:03<07:52, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:11,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
92%|██████████████████████████████████████████████████████████████████████▌ | 1634/1784 [2:17:06<07:42, 3.08s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:54:15,994 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1648, 'learning_rate': 1.1915887850467291e-06, 'epoch': 0.92}
92%|██████████████████████████████████████████████████████████████████████▌ | 1636/1784 [2:17:11<07:18, 2.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:20,289 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
92%|██████████████████████████████████████████████████████████████████████▋ | 1637/1784 [2:17:14<07:08, 2.92s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:54:24,320 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1987, 'learning_rate': 1.1682242990654206e-06, 'epoch': 0.92}
92%|██████████████████████████████████████████████████████████████████████▋ | 1639/1784 [2:17:20<06:43, 2.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
92%|██████████████████████████████████████████████████████████████████████▊ | 1640/1784 [2:17:22<06:27, 2.69s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:54:31,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:54:33,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:54:36,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:36,046 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:37,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:37,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:39,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:39,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:41,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:41,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed {'loss': 4.5605, 'learning_rate': 1.105919003115265e-06, 'epoch': 0.92} [WARNING|modeling_utils.py:388] 2022-02-28 11:54:43,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:43,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:45,183 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.4507, 'learning_rate': 1.0825545171339565e-06, 'epoch': 0.92} [WARNING|modeling_utils.py:388] 2022-02-28 11:54:46,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:46,896 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:46,896 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:50,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:54,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:54:54,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1179, 'learning_rate': 1.059190031152648e-06, 'epoch': 0.93} 93%|███████████████████████████████████████████████████████████████████████▎ | 1653/1784 [2:17:51<06:36, 3.03s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|███████████████████████████████████████████████████████████████████████▎ | 1653/1784 [2:17:51<06:36, 3.03s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0829, 'learning_rate': 1.0514018691588785e-06, 'epoch': 0.93} 93%|███████████████████████████████████████████████████████████████████████▍ | 1654/1784 [2:17:55<06:57, 3.21s/it]g-point operations will not be computed-28 11:54:28,267 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
{'loss': 4.0633, 'learning_rate': 1.043613707165109e-06, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▍ | 1655/1784 [2:17:58<07:11, 3.34s/it]
{'loss': 3.9353, 'learning_rate': 1.0358255451713396e-06, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▍ | 1656/1784 [2:18:02<07:15, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
93%|███████████████████████████████████████████████████████████████████████▌ | 1657/1784 [2:18:05<07:20, 3.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2134, 'learning_rate': 1.0202492211838008e-06, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▌ | 1658/1784 [2:18:09<07:21, 3.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1343, 'learning_rate': 1.0124610591900311e-06, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▌ | 1659/1784 [2:18:13<07:20, 3.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:11,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:55:23,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1671, 'learning_rate': 9.968847352024923e-07, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▋ | 1661/1784 [2:18:20<07:14, 3.53s/it]
{'loss': 4.2345, 'learning_rate': 9.890965732087228e-07, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▋ | 1662/1784 [2:18:23<07:10, 3.53s/it]
{'loss': 4.2383, 'learning_rate': 9.813084112149534e-07, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▊ | 1663/1784 [2:18:27<07:05, 3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
93%|███████████████████████████████████████████████████████████████████████▊ | 1664/1784 [2:18:30<07:01, 3.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1784, 'learning_rate': 9.657320872274143e-07, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▊ | 1665/1784 [2:18:34<06:55, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9555, 'learning_rate': 9.579439252336449e-07, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▉ | 1666/1784 [2:18:37<06:50, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:35,843 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:55:47,801 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9806, 'learning_rate': 9.423676012461059e-07, 'epoch': 0.93}
93%|███████████████████████████████████████████████████████████████████████▉ | 1668/1784 [2:18:44<06:39, 3.44s/it]
{'loss': 4.0233, 'learning_rate': 9.345794392523365e-07, 'epoch': 0.93}
94%|████████████████████████████████████████████████████████████████████████ | 1669/1784 [2:18:47<06:33, 3.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
94%|████████████████████████████████████████████████████████████████████████ | 1670/1784 [2:18:51<06:27, 3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0175, 'learning_rate': 9.190031152647975e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████ | 1671/1784 [2:18:54<06:22, 3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:55:56,318 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:56:04,611 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1891, 'learning_rate': 9.034267912772586e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▏ | 1673/1784 [2:19:01<06:10, 3.34s/it]
{'loss': 4.0017, 'learning_rate': 8.956386292834891e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▎ | 1674/1784 [2:19:04<06:04, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
94%|████████████████████████████████████████████████████████████████████████▎ | 1675/1784 [2:19:07<05:58, 3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3383, 'learning_rate': 8.800623052959501e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▎ | 1676/1784 [2:19:10<05:51, 3.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:12,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:56:20,756 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.9002, 'learning_rate': 8.644859813084113e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▍ | 1678/1784 [2:19:17<05:41, 3.22s/it]
{'loss': 4.1325, 'learning_rate': 8.566978193146417e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▍ | 1679/1784 [2:19:20<05:35, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:28,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
94%|████████████████████████████████████████████████████████████████████████▌ | 1680/1784 [2:19:23<05:29, 3.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:28,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2532, 'learning_rate': 8.411214953271029e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▌ | 1681/1784 [2:19:26<05:25, 3.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:34,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
94%|████████████████████████████████████████████████████████████████████████▌ | 1682/1784 [2:19:29<05:20, 3.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:34,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.8529, 'learning_rate': 8.255451713395639e-07, 'epoch': 0.94}
94%|████████████████████████████████████████████████████████████████████████▋ | 1683/1784 [2:19:32<05:13, 3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
94%|████████████████████████████████████████████████████████████████████████▋ | 1684/1784 [2:19:35<05:06, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:40,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:56:45,312 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2004, 'learning_rate': 8.021806853582555e-07, 'epoch': 0.94}
95%|████████████████████████████████████████████████████████████████████████▊ | 1686/1784 [2:19:41<04:51, 2.98s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:56:51,044 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:56:53,780 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2725, 'learning_rate': 7.788161993769471e-07, 'epoch': 0.95}
95%|████████████████████████████████████████████████████████████████████████▉ | 1689/1784 [2:19:49<04:28, 2.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
95%|████████████████████████████████████████████████████████████████████████▉ | 1690/1784 [2:19:52<04:16, 2.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:01,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:03,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:05,656 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:07,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:09,446 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:11,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:13,964 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2427, 'learning_rate': 7.087227414330218e-07, 'epoch': 0.95}
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:15,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:16,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.5237, 'learning_rate': 6.853582554517134e-07, 'epoch': 0.95}
[WARNING|modeling_utils.py:388] 2022-02-28 11:57:20,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2145, 'learning_rate': 6.775700934579439e-07, 'epoch': 0.95}
95%|█████████████████████████████████████████████████████████████████████████▍ | 1702/1784 [2:20:17<03:45, 2.75s/it]g-point 
operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9877, 'learning_rate': 6.697819314641744e-07, 'epoch': 0.95} 95%|█████████████████████████████████████████████████████████████████████████▍ | 1702/1784 [2:20:17<03:45, 2.75s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 95%|█████████████████████████████████████████████████████████████████████████▌ | 1703/1784 [2:20:21<04:06, 3.05s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:57:32,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:57:32,092 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9723, 'learning_rate': 6.542056074766355e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▌ | 1705/1784 [2:20:28<04:26, 3.38s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|█████████████████████████████████████████████████████████████████████████▌ | 1705/1784 [2:20:28<04:26, 3.38s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.005, 'learning_rate': 
6.46417445482866e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▋ | 1706/1784 [2:20:32<04:30, 3.47s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|█████████████████████████████████████████████████████████████████████████▋ | 1706/1784 [2:20:32<04:30, 3.47s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9488, 'learning_rate': 6.386292834890966e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▋ | 1707/1784 [2:20:36<04:31, 3.53s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|█████████████████████████████████████████████████████████████████████████▋ | 1707/1784 [2:20:36<04:31, 3.53s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9929, 'learning_rate': 6.308411214953271e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▋ | 1707/1784 [2:20:36<04:31, 3.53s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|█████████████████████████████████████████████████████████████████████████▋ | 1708/1784 [2:20:39<04:30, 3.55s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:57:50,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 
11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:57:50,388 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.284, 'learning_rate': 6.152647975077883e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▊ | 1710/1784 [2:20:47<04:24, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|█████████████████████████████████████████████████████████████████████████▊ | 1710/1784 [2:20:47<04:24, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2583, 'learning_rate': 6.074766355140187e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▊ | 1711/1784 [2:20:50<04:20, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 96%|█████████████████████████████████████████████████████████████████████████▊ | 1711/1784 [2:20:50<04:20, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.9819, 'learning_rate': 5.996884735202493e-07, 'epoch': 0.96} 96%|█████████████████████████████████████████████████████████████████████████▊ | 1711/1784 [2:20:50<04:20, 3.57s/it]g-point operations will not be computed-28 11:56:57,828 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
96%|█████████████████████████████████████████████████████████████████████████▉ | 1712/1784 [2:20:54<04:15, 3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:02,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
96%|█████████████████████████████████████████████████████████████████████████▉ | 1713/1784 [2:20:57<04:11, 3.54s/it]
{'loss': 4.3626, 'learning_rate': 5.841121495327103e-07, 'epoch': 0.96}
96%|█████████████████████████████████████████████████████████████████████████▉ | 1714/1784 [2:21:01<04:07, 3.53s/it]
{'loss': 4.0682, 'learning_rate': 5.763239875389409e-07, 'epoch': 0.96}
96%|██████████████████████████████████████████████████████████████████████████ | 1715/1784 [2:21:04<04:02, 3.52s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:58:14,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1014, 'learning_rate': 5.607476635514019e-07, 'epoch': 0.96}
96%|██████████████████████████████████████████████████████████████████████████ | 1717/1784 [2:21:11<03:52, 3.47s/it]
{'loss': 4.4454, 'learning_rate': 5.529595015576325e-07, 'epoch': 0.96}
96%|██████████████████████████████████████████████████████████████████████████▏ | 1718/1784 [2:21:14<03:47, 3.45s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:58:25,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.8065, 'learning_rate': 5.373831775700935e-07, 'epoch': 0.96}
96%|██████████████████████████████████████████████████████████████████████████▏ | 1720/1784 [2:21:21<03:39, 3.43s/it]
{'loss': 3.9721, 'learning_rate': 5.29595015576324e-07, 'epoch': 0.96}
96%|██████████████████████████████████████████████████████████████████████████▎ | 1721/1784 [2:21:25<03:34, 3.41s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:58:35,314 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.0803, 'learning_rate': 5.140186915887851e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▎ | 1723/1784 [2:21:31<03:25, 3.36s/it]
{'loss': 4.2579, 'learning_rate': 5.062305295950156e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▍ | 1724/1784 [2:21:35<03:20, 3.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:43,592 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|██████████████████████████████████████████████████████████████████████████▍ | 1725/1784 [2:21:38<03:15, 3.31s/it]
{'loss': 4.1518, 'learning_rate': 4.906542056074767e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▍ | 1726/1784 [2:21:41<03:09, 3.27s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:58:50,008 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|██████████████████████████████████████████████████████████████████████████▌ | 1727/1784 [2:21:44<03:06, 3.27s/it]
{'loss': 4.1883, 'learning_rate': 4.7507788161993773e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▌ | 1728/1784 [2:21:47<03:02, 3.25s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:58:57,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3411, 'learning_rate': 4.5950155763239876e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▋ | 1730/1784 [2:21:54<02:52, 3.20s/it]
{'loss': 4.1567, 'learning_rate': 4.517133956386293e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▋ | 1731/1784 [2:21:57<02:48, 3.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:05,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|██████████████████████████████████████████████████████████████████████████▊ | 1732/1784 [2:22:00<02:44, 3.16s/it]
{'loss': 4.0856, 'learning_rate': 4.3613707165109035e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▊ | 1733/1784 [2:22:03<02:39, 3.13s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:11,985 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|██████████████████████████████████████████████████████████████████████████▊ | 1734/1784 [2:22:06<02:35, 3.10s/it]
{'loss': 4.2605, 'learning_rate': 4.2056074766355143e-07, 'epoch': 0.97}
97%|██████████████████████████████████████████████████████████████████████████▉ | 1735/1784 [2:22:09<02:30, 3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:17,962 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|██████████████████████████████████████████████████████████████████████████▉ | 1736/1784 [2:22:12<02:24, 3.01s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:59:22,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 3.8847, 'learning_rate': 3.97196261682243e-07, 'epoch': 0.97}
97%|███████████████████████████████████████████████████████████████████████████ | 1738/1784 [2:22:18<02:13, 2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:26,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|███████████████████████████████████████████████████████████████████████████ | 1739/1784 [2:22:20<02:07, 2.82s/it]
[WARNING|modeling_utils.py:388] 2022-02-28 11:59:30,138 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-02-28 11:59:32,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1438, 'learning_rate': 3.660436137071651e-07, 'epoch': 0.98}
98%|███████████████████████████████████████████████████████████████████████████▏ | 1742/1784 [2:22:28<01:47, 2.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:36,066 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|███████████████████████████████████████████████████████████████████████████▏ | 1743/1784 [2:22:30<01:40, 2.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:38,236 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|███████████████████████████████████████████████████████████████████████████▎ | 1744/1784 [2:22:32<01:34, 2.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:40,279 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|███████████████████████████████████████████████████████████████████████████▎ | 1746/1784 [2:22:36<01:20, 2.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:42,184 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1639, 'learning_rate': 3.348909657320872e-07, 'epoch': 0.98}
98%|███████████████████████████████████████████████████████████████████████████▍ | 1747/1784 [2:22:37<01:13, 1.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:43,942 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|███████████████████████████████████████████████████████████████████████████▍ | 1748/1784 [2:22:39<01:06, 1.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:46,971 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|███████████████████████████████████████████████████████████████████████████▍ | 1749/1784 [2:22:40<00:59, 1.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:48,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.31, 'learning_rate': 3.0373831775700936e-07, 'epoch': 0.98}
98%|███████████████████████████████████████████████████████████████████████████▌ | 1750/1784 [2:22:42<00:57, 1.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|███████████████████████████████████████████████████████████████████████████▌ | 1751/1784 [2:22:46<01:18, 2.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 11:59:51,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 11:59:57,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0883, 'learning_rate': 2.8037383177570096e-07, 'epoch': 0.98} 98%|███████████████████████████████████████████████████████████████████████████▋ | 1753/1784 [2:22:53<01:35, 3.08s/it] {'loss': 4.3001, 'learning_rate': 2.7258566978193147e-07, 'epoch': 0.98} 98%|███████████████████████████████████████████████████████████████████████████▋ | 1754/1784 [2:22:57<01:37, 3.24s/it] {'loss': 4.0114, 'learning_rate': 2.64797507788162e-07, 'epoch': 0.98} 98%|███████████████████████████████████████████████████████████████████████████▋ | 1755/1784 [2:23:01<01:37, 3.37s/it] {'loss': 4.2639, 'learning_rate': 2.5700934579439255e-07, 'epoch': 0.98} 98%|███████████████████████████████████████████████████████████████████████████▊ | 1756/1784 [2:23:04<01:36, 3.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 98%|███████████████████████████████████████████████████████████████████████████▊ | 1757/1784 [2:23:08<01:34, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0116, 'learning_rate': 2.414330218068536e-07, 'epoch': 0.98} 99%|███████████████████████████████████████████████████████████████████████████▉ | 1758/1784 [2:23:11<01:30, 3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0707, 'learning_rate': 2.3364485981308412e-07, 'epoch': 0.99} 99%|███████████████████████████████████████████████████████████████████████████▉ | 1759/1784 [2:23:15<01:27, 3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:13,551 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 12:00:25,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|████████████████████████████████████████████████████████████████████████████ | 1761/1784 [2:23:22<01:19, 3.46s/it] 99%|████████████████████████████████████████████████████████████████████████████ | 1762/1784 [2:23:25<01:15, 3.44s/it] {'loss': 3.9068, 'learning_rate': 2.0249221183800623e-07, 'epoch': 0.99} [WARNING|modeling_utils.py:388] 2022-02-28 12:00:35,992 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|████████████████████████████████████████████████████████████████████████████▏| 1764/1784 [2:23:32<01:08, 3.41s/it] [WARNING|modeling_utils.py:388] 2022-02-28 12:00:42,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.8871, 'learning_rate': 1.7912772585669783e-07, 'epoch': 0.99} [WARNING|modeling_utils.py:388] 2022-02-28 12:00:46,058 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|████████████████████████████████████████████████████████████████████████████▎| 1767/1784 [2:23:42<00:57, 3.38s/it] 99%|████████████████████████████████████████████████████████████████████████████▎| 1768/1784 [2:23:45<00:53, 3.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|████████████████████████████████████████████████████████████████████████████▎| 1769/1784 [2:23:49<00:49, 3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|████████████████████████████████████████████████████████████████████████████▍| 1770/1784 [2:23:52<00:46, 3.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:00:54,396 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 12:01:02,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
99%|████████████████████████████████████████████████████████████████████████████▍| 1772/1784 [2:23:58<00:38, 3.23s/it] [WARNING|modeling_utils.py:388] 2022-02-28 12:01:08,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 99%|████████████████████████████████████████████████████████████████████████████▌| 1774/1784 [2:24:04<00:31, 3.10s/it] [WARNING|modeling_utils.py:388] 2022-02-28 12:01:14,408 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-02-28 12:01:17,206 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1723, 'learning_rate': 9.345794392523364e-08, 'epoch': 1.0} 100%|████████████████████████████████████████████████████████████████████████████▋| 1777/1784 [2:24:13<00:20, 2.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:21,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 100%|████████████████████████████████████████████████████████████████████████████▋| 1778/1784 [2:24:15<00:16, 2.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:23,516 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 100%|████████████████████████████████████████████████████████████████████████████▊| 1779/1784 [2:24:17<00:12, 2.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:25,617 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 100%|████████████████████████████████████████████████████████████████████████████▊| 1780/1784 [2:24:19<00:09, 2.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:27,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
100%|████████████████████████████████████████████████████████████████████████████▉| 1782/1784 [2:24:22<00:03, 1.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:28,953 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0406, 'learning_rate': 4.672897196261682e-08, 'epoch': 1.0} 100%|████████████████████████████████████████████████████████████████████████████▉| 1783/1784 [2:24:24<00:01, 1.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 12:01:31,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 3.5732, 'learning_rate': 3.1152647975077883e-08, 'epoch': 1.0} 1784/1784 [2:24:25<00:00, 1.55s/it][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> [INFO|trainer.py:2114] 2022-02-28 12:01:32,074 >> Saving model checkpoint to ./ [INFO|trainer.py:2114] 2022-02-28 12:01:48,841 >> Saving model checkpoint to ./ [INFO|modeling_utils.py:1081] 2022-02-28 12:02:05,435 >> Model weights saved in ./pytorch_model.bin Upload file pytorch_model.bin: 0%|▏ | 13.0M/2.99G [00:01<03:55, 13.6MB/s] Upload file pytorch_model.bin: 2%|▋ | 48.2M/2.99G [00:03<03:02, 17.3MB/s] Upload file pytorch_model.bin: 3%|█▎ | 85.5M/2.99G [00:05<02:46, 18.7MB/s] Upload file pytorch_model.bin: 4%|█▉ | 126M/2.99G [00:07<02:33, 20.1MB/s] Upload file pytorch_model.bin: 5%|██▌ | 165M/2.99G [00:09<02:30, 20.1MB/s] Upload file pytorch_model.bin: 7%|███▏ | 204M/2.99G [00:11<02:26, 20.4MB/s] Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s] Upload file
wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.0M/53.0M [00:13<00:00, 16.9MB/s][INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|████████████| 53.0M/53.0M [02:49<00:00, 196kB/s]
[INFO|trainer.py:1492] 2022-02-28 12:01:32,072 >> 1,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
02/28/2022 12:06:33 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search
[INFO|modelcard.py:460] 2022-02-28 12:06:36,447 >> Dropping the following result as it does not have all the necessary fields:
Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 25%|██▊ | 13.3M/53.0M [00:01<00:02, 13.9MB/s]
02/28/2022 12:06:43 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search
Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 65%|███████▏ | 34.4M/53.0M [00:02<00:01, 18.7MB/s]
***** train metrics *****
  epoch                    =        1.0
  train_loss               =      4.242
  train_runtime            = 2:26:09.63
  train_samples            =      28538
  train_samples_per_second =      3.254
  train_steps_per_second   =      0.203
The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message.
[INFO|trainer.py:2366] 2022-02-28 12:06:46,261 >> Num examples = 2642
0%| | 0/331 [00:00
| 3/331 [00:05<11:09, 2.04s/it]
Saving model checkpoint to ./
[INFO|modeling_utils.py:1081] 2022-02-28 12:26:18,192 >> Model weights saved in ./pytorch_model.bin
Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 27%|███ | 14.5M/53.1M [00:01<00:02, 15.2MB/s]
02/28/2022 12:26:49 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search
Upload file wandb/run-20220228_093705-yn2gmwrw/run-yn2gmwrw.wandb: 100%|███████████| 53.1M/53.1M [00:03<00:00, 18.5MB/s]
ule>
31, in main
card
formers/src/transformers/modelcard.py", line 611, in from_trainer
f.finetuned_from)
return ModelInfo(**d)