diff --git "a/wandb/run-20220228_163348-20lja62k/files/output.log" "b/wandb/run-20220228_163348-20lja62k/files/output.log"
new file mode 100644--- /dev/null
+++ "b/wandb/run-20220228_163348-20lja62k/files/output.log"
@@ -0,0 +1,1697 @@
+
+
+  0%|                                                                                          | 0/1189 [00:00<?, ?it/s][WARNING|modeling_utils.py:388] 2022-02-28 16:33:53,356 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8825, 'learning_rate': 0.0, 'epoch': 0.0}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:33:55,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  0%|                                                                                | 1/1189 [00:05<1:54:22,  5.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:33:58,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.0739, 'learning_rate': 0.0, 'epoch': 0.0}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:01,084 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  0%|▏                                                                               | 2/1189 [00:10<1:46:54,  5.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:03,734 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.0609, 'learning_rate': 2e-08, 'epoch': 0.0}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:06,331 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  0%|▏                                                                               | 3/1189 [00:16<1:47:35,  5.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:09,165 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8813, 'learning_rate': 4e-08, 'epoch': 0.0}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:12,022 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  0%|▎                                                                               | 4/1189 [00:21<1:47:57,  5.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:14,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.6007, 'learning_rate': 6.000000000000001e-08, 'epoch': 0.0}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:17,115 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  0%|▎                                                                               | 5/1189 [00:27<1:45:11,  5.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:19,670 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:22,147 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  1%|▍                                                                               | 6/1189 [00:32<1:43:06,  5.23s/it]
+
+  1%|▍                                                                               | 6/1189 [00:32<1:43:06,  5.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:24,743 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.743, 'learning_rate': 1.0000000000000001e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:27,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▍                                                                               | 7/1189 [00:37<1:42:06,  5.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:29,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7243, 'learning_rate': 1.2000000000000002e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:32,297 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▌                                                                               | 8/1189 [00:42<1:41:15,  5.14s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:34,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.9582, 'learning_rate': 1.2000000000000002e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:37,313 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▌                                                                               | 9/1189 [00:47<1:40:02,  5.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:39,776 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8108, 'learning_rate': 1.4e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:42,184 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▋                                                                              | 10/1189 [00:52<1:38:59,  5.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:44,682 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8627, 'learning_rate': 1.6e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:47,074 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▋                                                                              | 11/1189 [00:56<1:38:00,  4.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:49,518 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:51,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  1%|▊                                                                              | 12/1189 [01:01<1:36:38,  4.93s/it]
+
+  1%|▊                                                                              | 12/1189 [01:01<1:36:38,  4.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:54,321 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:34:56,720 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  1%|▊                                                                              | 13/1189 [01:06<1:36:14,  4.91s/it]
+
+  1%|▊                                                                              | 13/1189 [01:06<1:36:14,  4.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:34:59,198 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7932, 'learning_rate': 2.2e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:01,545 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▉                                                                              | 14/1189 [01:11<1:35:38,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:04,005 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.6951, 'learning_rate': 2.4000000000000003e-07, 'epoch': 0.01}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:06,380 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  1%|▉                                                                              | 15/1189 [01:16<1:35:16,  4.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:08,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:11,139 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.9232, 'learning_rate': 2.6e-07, 'epoch': 0.01}
+  1%|█                                                                              | 16/1189 [01:21<1:34:33,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:13,576 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:15,889 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  1%|█▏                                                                             | 17/1189 [01:25<1:33:56,  4.81s/it]
+
+  1%|█▏                                                                             | 17/1189 [01:25<1:33:56,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:18,270 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.0363, 'learning_rate': 3.0000000000000004e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:20,558 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▏                                                                             | 18/1189 [01:30<1:33:02,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:22,951 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7808, 'learning_rate': 3.2e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:25,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▎                                                                             | 19/1189 [01:35<1:32:29,  4.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:27,675 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:29,944 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  2%|█▎                                                                             | 20/1189 [01:39<1:32:10,  4.73s/it]
+
+  2%|█▎                                                                             | 20/1189 [01:39<1:32:10,  4.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:32,321 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.9847, 'learning_rate': 3.6e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:34,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▍                                                                             | 21/1189 [01:44<1:31:36,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:36,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7493, 'learning_rate': 3.8e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:39,166 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▍                                                                             | 22/1189 [01:49<1:30:44,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:41,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.9432, 'learning_rate': 4.0000000000000003e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:43,703 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▌                                                                             | 23/1189 [01:53<1:29:55,  4.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:46,042 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.0013, 'learning_rate': 4.2000000000000006e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:48,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▌                                                                             | 24/1189 [01:58<1:29:30,  4.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:50,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7983, 'learning_rate': 4.4e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:52,803 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▋                                                                             | 25/1189 [02:02<1:28:57,  4.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:55,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.9729, 'learning_rate': 4.6000000000000004e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:35:57,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▋                                                                             | 26/1189 [02:07<1:28:04,  4.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:35:59,482 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8627, 'learning_rate': 4.800000000000001e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:01,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▊                                                                             | 27/1189 [02:11<1:27:06,  4.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:03,894 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.6941, 'learning_rate': 5.000000000000001e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:06,051 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▊                                                                             | 28/1189 [02:15<1:26:31,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:08,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7796, 'learning_rate': 5.2e-07, 'epoch': 0.02}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:10,431 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  2%|█▉                                                                             | 29/1189 [02:20<1:25:56,  4.45s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:12,625 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:14,760 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|█▉                                                                             | 30/1189 [02:24<1:25:11,  4.41s/it]
+
+  3%|█▉                                                                             | 30/1189 [02:24<1:25:11,  4.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:16,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:18,904 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██                                                                             | 31/1189 [02:28<1:23:34,  4.33s/it]
+
+  3%|██                                                                             | 31/1189 [02:28<1:23:34,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:21,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:23,086 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▏                                                                            | 32/1189 [02:32<1:22:38,  4.29s/it]
+
+  3%|██▏                                                                            | 32/1189 [02:32<1:22:38,  4.29s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:25,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:27,281 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▏                                                                            | 33/1189 [02:37<1:22:03,  4.26s/it]
+
+  3%|██▏                                                                            | 33/1189 [02:37<1:22:03,  4.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:29,409 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:31,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▎                                                                            | 34/1189 [02:41<1:20:54,  4.20s/it]
+
+  3%|██▎                                                                            | 34/1189 [02:41<1:20:54,  4.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:33,423 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:35,380 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▎                                                                            | 35/1189 [02:45<1:19:48,  4.15s/it]
+
+  3%|██▎                                                                            | 35/1189 [02:45<1:19:48,  4.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:37,410 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:39,330 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▍                                                                            | 36/1189 [02:49<1:18:34,  4.09s/it]
+
+  3%|██▍                                                                            | 36/1189 [02:49<1:18:34,  4.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:41,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:43,182 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▍                                                                            | 37/1189 [02:53<1:17:10,  4.02s/it]
+
+  3%|██▍                                                                            | 37/1189 [02:53<1:17:10,  4.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:45,136 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.745, 'learning_rate': 7.000000000000001e-07, 'epoch': 0.03}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:46,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  3%|██▌                                                                            | 38/1189 [02:56<1:15:44,  3.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:48,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.2413, 'learning_rate': 7.2e-07, 'epoch': 0.03}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:50,658 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+
+  3%|██▌                                                                            | 39/1189 [03:00<1:14:12,  3.87s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:52,487 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8466, 'learning_rate': 7.4e-07, 'epoch': 0.03}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:54,230 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▋                                                                            | 40/1189 [03:04<1:12:25,  3.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:56,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:36:57,710 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  3%|██▋                                                                            | 41/1189 [03:07<1:10:36,  3.69s/it]
+
+  3%|██▋                                                                            | 41/1189 [03:07<1:10:36,  3.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:36:59,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7698, 'learning_rate': 7.8e-07, 'epoch': 0.04}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:00,920 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|██▊                                                                            | 42/1189 [03:10<1:07:48,  3.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:02,534 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:03,972 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|██▊                                                                            | 43/1189 [03:13<1:04:54,  3.40s/it]
+
+  4%|██▊                                                                            | 43/1189 [03:13<1:04:54,  3.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:05,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8656, 'learning_rate': 8.200000000000001e-07, 'epoch': 0.04}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:06,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|██▉                                                                            | 44/1189 [03:16<1:01:09,  3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:08,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.0811, 'learning_rate': 8.400000000000001e-07, 'epoch': 0.04}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:09,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███                                                                              | 45/1189 [03:19<57:22,  3.01s/it]
+  4%|███                                                                              | 45/1189 [03:19<57:22,  3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:10,482 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:11,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▏                                                                             | 46/1189 [03:21<53:04,  2.79s/it]
+  4%|███▏                                                                             | 46/1189 [03:21<53:04,  2.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:12,797 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:13,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▏                                                                             | 47/1189 [03:23<49:35,  2.61s/it]
+  4%|███▏                                                                             | 47/1189 [03:23<49:35,  2.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:14,698 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:15,508 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▎                                                                             | 48/1189 [03:25<44:50,  2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:16,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.6641, 'learning_rate': 9.200000000000001e-07, 'epoch': 0.04}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:17,029 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▎                                                                             | 49/1189 [03:26<40:01,  2.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:17,773 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 5.6525, 'learning_rate': 9.200000000000001e-07, 'epoch': 0.04}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:37:18,961 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                             | 50/1189 [03:28<38:41,  2.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:22,042 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                             | 50/1189 [03:28<38:41,  2.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:22,042 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                           | 51/1189 [03:34<1:00:21,  3.18s/it]g-point operations will not be computed-28 16:37:22,042 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                           | 51/1189 [03:34<1:00:21,  3.18s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:27,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                           | 52/1189 [03:40<1:13:05,  3.86s/it]g-point operations will not be computed-28 16:37:27,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                           | 52/1189 [03:40<1:13:05,  3.86s/it]g-point operations will not be computed-28 16:37:27,542 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                           | 52/1189 [03:40<1:13:05,  3.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:32,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▍                                                                           | 52/1189 [03:40<1:13:05,  3.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:32,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▌                                                                           | 53/1189 [03:45<1:21:19,  4.30s/it]g-point operations will not be computed-28 16:37:32,928 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  4%|███▌                                                                           | 53/1189 [03:45<1:21:19,  4.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:38,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▌                                                                           | 54/1189 [03:50<1:27:27,  4.62s/it]g-point operations will not be computed-28 16:37:38,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▌                                                                           | 54/1189 [03:50<1:27:27,  4.62s/it]g-point operations will not be computed-28 16:37:38,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▌                                                                           | 54/1189 [03:50<1:27:27,  4.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:43,747 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▋                                                                           | 55/1189 [03:56<1:32:20,  4.89s/it]g-point operations will not be computed-28 16:37:43,747 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▋                                                                           | 55/1189 [03:56<1:32:20,  4.89s/it]g-point operations will not be computed-28 16:37:43,747 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▋                                                                           | 55/1189 [03:56<1:32:20,  4.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:49,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▋                                                                           | 56/1189 [04:01<1:35:00,  5.03s/it]g-point operations will not be computed-28 16:37:49,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▋                                                                           | 56/1189 [04:01<1:35:00,  5.03s/it]g-point operations will not be computed-28 16:37:49,175 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▋                                                                           | 56/1189 [04:01<1:35:00,  5.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:54,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▊                                                                           | 57/1189 [04:06<1:36:18,  5.10s/it]g-point operations will not be computed-28 16:37:54,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▊                                                                           | 57/1189 [04:06<1:36:18,  5.10s/it]g-point operations will not be computed-28 16:37:54,515 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▊                                                                           | 57/1189 [04:06<1:36:18,  5.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:59,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▊                                                                           | 57/1189 [04:06<1:36:18,  5.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:37:59,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▊                                                                           | 58/1189 [04:12<1:36:25,  5.12s/it]g-point operations will not be computed-28 16:37:59,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▊                                                                           | 58/1189 [04:12<1:36:25,  5.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:04,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▉                                                                           | 59/1189 [04:17<1:35:56,  5.09s/it]g-point operations will not be computed-28 16:38:04,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▉                                                                           | 59/1189 [04:17<1:35:56,  5.09s/it]g-point operations will not be computed-28 16:38:04,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▉                                                                           | 59/1189 [04:17<1:35:56,  5.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:09,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:38:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:38:09,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:38:12,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:38:09,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▉                                                                           | 60/1189 [04:22<1:35:35,  5.08s/it]g-point operations will not be computed-28 16:38:09,796 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|███▉                                                                           | 60/1189 [04:22<1:35:35,  5.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:14,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████                                                                           | 61/1189 [04:27<1:34:50,  5.04s/it]g-point operations will not be computed-28 16:38:14,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████                                                                           | 61/1189 [04:27<1:34:50,  5.04s/it]g-point operations will not be computed-28 16:38:14,789 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████                                                                           | 61/1189 [04:27<1:34:50,  5.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:19,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████                                                                           | 62/1189 [04:32<1:34:35,  5.04s/it]g-point operations will not be computed-28 16:38:19,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████                                                                           | 62/1189 [04:32<1:34:35,  5.04s/it]g-point operations will not be computed-28 16:38:19,836 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████                                                                           | 62/1189 [04:32<1:34:35,  5.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:24,925 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▏                                                                          | 63/1189 [04:37<1:35:29,  5.09s/it]g-point operations will not be computed-28 16:38:24,925 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▏                                                                          | 63/1189 [04:37<1:35:29,  5.09s/it]g-point operations will not be computed-28 16:38:24,925 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▏                                                                          | 63/1189 [04:37<1:35:29,  5.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:29,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▎                                                                          | 64/1189 [04:42<1:34:38,  5.05s/it]g-point operations will not be computed-28 16:38:29,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▎                                                                          | 64/1189 [04:42<1:34:38,  5.05s/it]g-point operations will not be computed-28 16:38:29,990 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▎                                                                          | 64/1189 [04:42<1:34:38,  5.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:34,852 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▎                                                                          | 64/1189 [04:42<1:34:38,  5.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:34,852 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▎                                                                          | 65/1189 [04:47<1:33:06,  4.97s/it]g-point operations will not be computed-28 16:38:34,852 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  5%|████▎                                                                          | 65/1189 [04:47<1:33:06,  4.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:39,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▍                                                                          | 66/1189 [04:52<1:32:41,  4.95s/it]g-point operations will not be computed-28 16:38:39,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▍                                                                          | 66/1189 [04:52<1:32:41,  4.95s/it]g-point operations will not be computed-28 16:38:39,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▍                                                                          | 66/1189 [04:52<1:32:41,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:44,748 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▍                                                                          | 66/1189 [04:52<1:32:41,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:44,748 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▍                                                                          | 67/1189 [04:57<1:33:56,  5.02s/it]g-point operations will not be computed-28 16:38:44,748 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▍                                                                          | 67/1189 [04:57<1:33:56,  5.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:49,892 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▌                                                                          | 68/1189 [05:02<1:33:32,  5.01s/it]g-point operations will not be computed-28 16:38:49,892 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▌                                                                          | 68/1189 [05:02<1:33:32,  5.01s/it]g-point operations will not be computed-28 16:38:49,892 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▌                                                                          | 68/1189 [05:02<1:33:32,  5.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:54,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▌                                                                          | 69/1189 [05:06<1:32:12,  4.94s/it]g-point operations will not be computed-28 16:38:54,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▌                                                                          | 69/1189 [05:06<1:32:12,  4.94s/it]g-point operations will not be computed-28 16:38:54,717 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▌                                                                          | 69/1189 [05:06<1:32:12,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:38:59,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▋                                                                          | 70/1189 [05:11<1:30:22,  4.85s/it]g-point operations will not be computed-28 16:38:59,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▋                                                                          | 70/1189 [05:11<1:30:22,  4.85s/it]g-point operations will not be computed-28 16:38:59,434 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▋                                                                          | 70/1189 [05:11<1:30:22,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:04,051 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▋                                                                          | 71/1189 [05:16<1:29:02,  4.78s/it]g-point operations will not be computed-28 16:39:04,051 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▋                                                                          | 71/1189 [05:16<1:29:02,  4.78s/it]g-point operations will not be computed-28 16:39:04,051 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▋                                                                          | 71/1189 [05:16<1:29:02,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:08,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▊                                                                          | 72/1189 [05:20<1:28:40,  4.76s/it]g-point operations will not be computed-28 16:39:08,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▊                                                                          | 72/1189 [05:20<1:28:40,  4.76s/it]g-point operations will not be computed-28 16:39:08,738 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▊                                                                          | 72/1189 [05:20<1:28:40,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:13,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▊                                                                          | 73/1189 [05:25<1:27:37,  4.71s/it]g-point operations will not be computed-28 16:39:13,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▊                                                                          | 73/1189 [05:25<1:27:37,  4.71s/it]g-point operations will not be computed-28 16:39:13,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▊                                                                          | 73/1189 [05:25<1:27:37,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:18,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        g-point operations will not be computed-28 16:39:18,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        g-point operations will not be computed-28 16:39:18,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▉                                                                          | 74/1189 [05:30<1:27:16,  4.70s/it]g-point operations will not be computed-28 16:39:18,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▉                                                                          | 74/1189 [05:30<1:27:16,  4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:22,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▉                                                                          | 74/1189 [05:30<1:27:16,  4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:22,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▉                                                                          | 75/1189 [05:34<1:26:46,  4.67s/it]g-point operations will not be computed-28 16:39:22,667 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▉                                                                          | 75/1189 [05:34<1:26:46,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:27,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|████▉                                                                          | 75/1189 [05:34<1:26:46,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:27,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|█████                                                                          | 76/1189 [05:39<1:25:47,  4.62s/it]g-point operations will not be computed-28 16:39:27,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|█████                                                                          | 76/1189 [05:39<1:25:47,  4.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:31,739 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|█████                                                                          | 76/1189 [05:39<1:25:47,  4.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:31,739 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|█████                                                                          | 77/1189 [05:43<1:25:05,  4.59s/it]g-point operations will not be computed-28 16:39:31,739 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  6%|█████                                                                          | 77/1189 [05:43<1:25:05,  4.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:36,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▏                                                                         | 78/1189 [05:48<1:24:39,  4.57s/it]g-point operations will not be computed-28 16:39:36,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▏                                                                         | 78/1189 [05:48<1:24:39,  4.57s/it]g-point operations will not be computed-28 16:39:36,244 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▏                                                                         | 78/1189 [05:48<1:24:39,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:40,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▏                                                                         | 79/1189 [05:52<1:23:41,  4.52s/it]g-point operations will not be computed-28 16:39:40,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▏                                                                         | 79/1189 [05:52<1:23:41,  4.52s/it]g-point operations will not be computed-28 16:39:40,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▏                                                                         | 79/1189 [05:52<1:23:41,  4.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:45,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:39:47,282 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:39:45,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:39:47,282 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:39:45,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▎                                                                         | 80/1189 [05:57<1:23:01,  4.49s/it]g-point operations will not be computed-28 16:39:45,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▎                                                                         | 80/1189 [05:57<1:23:01,  4.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:49,523 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▎                                                                         | 80/1189 [05:57<1:23:01,  4.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:49,523 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▍                                                                         | 81/1189 [06:01<1:21:56,  4.44s/it]g-point operations will not be computed-28 16:39:49,523 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▍                                                                         | 81/1189 [06:01<1:21:56,  4.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:53,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▍                                                                         | 81/1189 [06:01<1:21:56,  4.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:39:53,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▍                                                                         | 82/1189 [06:05<1:20:54,  4.38s/it]g-point operations will not be computed-28 16:39:53,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▌                                                                         | 83/1189 [06:09<1:19:56,  4.34s/it]g-point operations will not be computed-28 16:39:58,012 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▌                                                                         | 83/1189 [06:09<1:19:56,  4.34s/it]g-point operations will not be computed-28 16:39:58,012 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▌                                                                         | 83/1189 [06:09<1:19:56,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:02,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▌                                                                         | 83/1189 [06:09<1:19:56,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:02,174 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▌                                                                         | 84/1189 [06:14<1:18:24,  4.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:06,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▋                                                                         | 85/1189 [06:18<1:17:22,  4.21s/it]g-point operations will not be computed-28 16:40:06,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▋                                                                         | 85/1189 [06:18<1:17:22,  4.21s/it]g-point operations will not be computed-28 16:40:06,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▋                                                                         | 85/1189 [06:18<1:17:22,  4.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:10,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▋                                                                         | 86/1189 [06:22<1:16:34,  4.17s/it]g-point operations will not be computed-28 16:40:10,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▋                                                                         | 86/1189 [06:22<1:16:34,  4.17s/it]g-point operations will not be computed-28 16:40:10,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▋                                                                         | 86/1189 [06:22<1:16:34,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:14,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▊                                                                         | 87/1189 [06:26<1:14:51,  4.08s/it]g-point operations will not be computed-28 16:40:14,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▊                                                                         | 87/1189 [06:26<1:14:51,  4.08s/it]g-point operations will not be computed-28 16:40:14,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▊                                                                         | 87/1189 [06:26<1:14:51,  4.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:18,139 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▊                                                                         | 88/1189 [06:29<1:13:01,  3.98s/it]g-point operations will not be computed-28 16:40:18,139 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▊                                                                         | 88/1189 [06:29<1:13:01,  3.98s/it]g-point operations will not be computed-28 16:40:18,139 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▊                                                                         | 88/1189 [06:29<1:13:01,  3.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:21,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▉                                                                         | 89/1189 [06:33<1:12:07,  3.93s/it]g-point operations will not be computed-28 16:40:21,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▉                                                                         | 89/1189 [06:33<1:12:07,  3.93s/it]g-point operations will not be computed-28 16:40:21,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  7%|█████▉                                                                         | 89/1189 [06:33<1:12:07,  3.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:25,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|█████▉                                                                         | 90/1189 [06:37<1:09:25,  3.79s/it]g-point operations will not be computed-28 16:40:25,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|█████▉                                                                         | 90/1189 [06:37<1:09:25,  3.79s/it]g-point operations will not be computed-28 16:40:25,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████                                                                         | 91/1189 [06:40<1:06:23,  3.63s/it]g-point operations will not be computed-28 16:40:28,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████                                                                         | 91/1189 [06:40<1:06:23,  3.63s/it]g-point operations will not be computed-28 16:40:28,917 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████                                                                         | 91/1189 [06:40<1:06:23,  3.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:32,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████                                                                         | 91/1189 [06:40<1:06:23,  3.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:32,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████                                                                         | 92/1189 [06:43<1:03:43,  3.48s/it]g-point operations will not be computed-28 16:40:32,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████                                                                         | 92/1189 [06:43<1:03:43,  3.48s/it]g-point operations will not be computed-28 16:40:32,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:40:36,509 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:40:35,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:40:36,509 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:40:35,142 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▏                                                                        | 93/1189 [06:46<1:00:25,  3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:37,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▏                                                                        | 93/1189 [06:46<1:00:25,  3.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:37,879 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▍                                                                          | 94/1189 [06:49<56:37,  3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:40,404 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▍                                                                          | 95/1189 [06:51<52:45,  2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:42,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▍                                                                          | 95/1189 [06:51<52:45,  2.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:42,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8917, 'learning_rate': 1.8200000000000002e-06, 'epoch': 0.08}
+  8%|██████▌                                                                          | 96/1189 [06:53<48:32,  2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:44,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▌                                                                          | 96/1189 [06:53<48:32,  2.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:44,696 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▌                                                                          | 97/1189 [06:55<44:44,  2.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:46,677 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▌                                                                          | 97/1189 [06:55<44:44,  2.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:46,677 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.8917, 'learning_rate': 1.8800000000000002e-06, 'epoch': 0.08}
+  8%|██████▋                                                                          | 99/1189 [06:58<37:15,  2.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:48,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:40:49,011 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▋                                                                          | 99/1189 [06:58<37:15,  2.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:48,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▋                                                                         | 100/1189 [07:00<36:21,  2.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:49,745 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:40:50,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▋                                                                         | 100/1189 [07:00<36:21,  2.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:53,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▋                                                                         | 100/1189 [07:00<36:21,  2.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:53,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▊                                                                         | 101/1189 [07:06<55:41,  3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:53,820 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▊                                                                         | 101/1189 [07:06<55:41,  3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:59,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  8%|██████▊                                                                         | 101/1189 [07:06<55:41,  3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:59,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▋                                                                       | 102/1189 [07:11<1:08:10,  3.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:40:59,211 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▋                                                                       | 102/1189 [07:11<1:08:10,  3.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:04,533 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▋                                                                       | 102/1189 [07:11<1:08:10,  3.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:04,533 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▊                                                                       | 103/1189 [07:17<1:16:28,  4.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:04,533 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▊                                                                       | 103/1189 [07:17<1:16:28,  4.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:09,782 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▊                                                                       | 103/1189 [07:17<1:16:28,  4.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:09,782 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▊                                                                       | 104/1189 [07:22<1:22:16,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:09,782 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▊                                                                       | 104/1189 [07:22<1:22:16,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:15,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▉                                                                       | 105/1189 [07:27<1:26:17,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:15,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▉                                                                       | 105/1189 [07:27<1:26:17,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:15,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▉                                                                       | 105/1189 [07:27<1:26:17,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:20,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▉                                                                       | 105/1189 [07:27<1:26:17,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:20,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▉                                                                       | 106/1189 [07:32<1:28:24,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:20,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|██████▉                                                                       | 106/1189 [07:32<1:28:24,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:25,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████                                                                       | 107/1189 [07:37<1:29:29,  4.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:25,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████                                                                       | 107/1189 [07:37<1:29:29,  4.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:25,585 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████                                                                       | 107/1189 [07:37<1:29:29,  4.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:30,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████                                                                       | 107/1189 [07:37<1:29:29,  4.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:30,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████                                                                       | 108/1189 [07:43<1:29:59,  4.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:30,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████                                                                       | 108/1189 [07:43<1:29:59,  4.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:35,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▏                                                                      | 109/1189 [07:48<1:30:19,  5.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:35,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▏                                                                      | 109/1189 [07:48<1:30:19,  5.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:35,728 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▏                                                                      | 109/1189 [07:48<1:30:19,  5.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:40,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▏                                                                      | 109/1189 [07:48<1:30:19,  5.02s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:40,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▏                                                                      | 110/1189 [07:53<1:29:40,  4.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:40,711 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▏                                                                      | 110/1189 [07:53<1:29:40,  4.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:45,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 111/1189 [07:57<1:28:50,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:45,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 111/1189 [07:57<1:28:50,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:45,587 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 111/1189 [07:57<1:28:50,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:50,440 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 112/1189 [08:02<1:28:22,  4.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:50,440 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 112/1189 [08:02<1:28:22,  4.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:50,440 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 112/1189 [08:02<1:28:22,  4.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:55,303 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+  9%|███████▎                                                                      | 112/1189 [08:02<1:28:22,  4.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:55,303 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▍                                                                      | 113/1189 [08:07<1:28:09,  4.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:41:55,303 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▍                                                                      | 113/1189 [08:07<1:28:09,  4.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:00,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▍                                                                      | 114/1189 [08:12<1:28:34,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:00,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▍                                                                      | 114/1189 [08:12<1:28:34,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:00,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▍                                                                      | 114/1189 [08:12<1:28:34,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:05,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▍                                                                      | 114/1189 [08:12<1:28:34,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:05,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▌                                                                      | 115/1189 [08:17<1:28:40,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:05,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▌                                                                      | 115/1189 [08:17<1:28:40,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:10,198 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▌                                                                      | 116/1189 [08:22<1:28:28,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:10,198 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▌                                                                      | 116/1189 [08:22<1:28:28,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:10,198 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▌                                                                      | 116/1189 [08:22<1:28:28,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:15,186 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▋                                                                      | 117/1189 [08:27<1:28:14,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:15,186 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▋                                                                      | 117/1189 [08:27<1:28:14,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:15,186 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▋                                                                      | 117/1189 [08:27<1:28:14,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:20,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▋                                                                      | 117/1189 [08:27<1:28:14,  4.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:20,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▋                                                                      | 118/1189 [08:32<1:27:26,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:20,025 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▋                                                                      | 118/1189 [08:32<1:27:26,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:24,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 119/1189 [08:37<1:27:00,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:24,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 119/1189 [08:37<1:27:00,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:24,844 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 119/1189 [08:37<1:27:00,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:29,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 119/1189 [08:37<1:27:00,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:29,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 120/1189 [08:41<1:26:13,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:29,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 120/1189 [08:41<1:26:13,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:34,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▊                                                                      | 120/1189 [08:41<1:26:13,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:34,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▉                                                                      | 121/1189 [08:46<1:25:31,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:34,376 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|███████▉                                                                      | 121/1189 [08:46<1:25:31,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:39,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████                                                                      | 122/1189 [08:51<1:24:36,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:39,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████                                                                      | 122/1189 [08:51<1:24:36,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:39,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████                                                                      | 122/1189 [08:51<1:24:36,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:43,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████                                                                      | 123/1189 [08:55<1:23:55,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:43,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████                                                                      | 123/1189 [08:55<1:23:55,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:43,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████                                                                      | 123/1189 [08:55<1:23:55,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:48,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████▏                                                                     | 124/1189 [09:00<1:22:50,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:48,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████▏                                                                     | 124/1189 [09:00<1:22:50,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:48,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 10%|████████▏                                                                     | 124/1189 [09:00<1:22:50,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:52,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▏                                                                     | 125/1189 [09:04<1:22:20,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:52,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▏                                                                     | 125/1189 [09:04<1:22:20,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:52,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▏                                                                     | 125/1189 [09:04<1:22:20,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:57,442 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▎                                                                     | 126/1189 [09:09<1:21:42,  4.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:57,442 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▎                                                                     | 126/1189 [09:09<1:21:42,  4.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:42:57,442 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▎                                                                     | 126/1189 [09:09<1:21:42,  4.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:01,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▎                                                                     | 127/1189 [09:14<1:20:56,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:01,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▎                                                                     | 127/1189 [09:14<1:20:56,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:01,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▎                                                                     | 127/1189 [09:14<1:20:56,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:06,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▍                                                                     | 128/1189 [09:18<1:20:15,  4.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:06,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▍                                                                     | 128/1189 [09:18<1:20:15,  4.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:06,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▍                                                                     | 128/1189 [09:18<1:20:15,  4.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:10,852 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▍                                                                     | 129/1189 [09:22<1:19:43,  4.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:10,852 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▍                                                                     | 129/1189 [09:22<1:19:43,  4.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:10,852 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▍                                                                     | 129/1189 [09:22<1:19:43,  4.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:15,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▌                                                                     | 130/1189 [09:27<1:18:54,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:15,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▌                                                                     | 130/1189 [09:27<1:18:54,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:15,275 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▌                                                                     | 130/1189 [09:27<1:18:54,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:19,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▌                                                                     | 130/1189 [09:27<1:18:54,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:19,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▌                                                                     | 131/1189 [09:31<1:18:01,  4.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:19,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▌                                                                     | 131/1189 [09:31<1:18:01,  4.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:23,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▋                                                                     | 132/1189 [09:35<1:17:18,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:23,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▋                                                                     | 132/1189 [09:35<1:17:18,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:23,897 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▋                                                                     | 132/1189 [09:35<1:17:18,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:28,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▋                                                                     | 133/1189 [09:40<1:16:21,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:28,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▋                                                                     | 133/1189 [09:40<1:16:21,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:28,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▋                                                                     | 133/1189 [09:40<1:16:21,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:32,327 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▊                                                                     | 134/1189 [09:44<1:14:46,  4.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:32,327 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▊                                                                     | 134/1189 [09:44<1:14:46,  4.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:32,327 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▊                                                                     | 134/1189 [09:44<1:14:46,  4.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:36,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▊                                                                     | 135/1189 [09:48<1:13:18,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:36,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▊                                                                     | 135/1189 [09:48<1:13:18,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:36,375 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▊                                                                     | 135/1189 [09:48<1:13:18,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:40,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▉                                                                     | 136/1189 [09:52<1:11:55,  4.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:40,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▉                                                                     | 136/1189 [09:52<1:11:55,  4.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:40,315 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 11%|████████▉                                                                     | 136/1189 [09:52<1:11:55,  4.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:44,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|████████▉                                                                     | 137/1189 [09:56<1:10:58,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:44,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|████████▉                                                                     | 137/1189 [09:56<1:10:58,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:44,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|████████▉                                                                     | 137/1189 [09:56<1:10:58,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:48,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████                                                                     | 138/1189 [09:59<1:09:48,  3.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:48,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████                                                                     | 138/1189 [09:59<1:09:48,  3.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:48,125 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████                                                                     | 139/1189 [10:03<1:08:49,  3.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:51,887 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████                                                                     | 139/1189 [10:03<1:08:49,  3.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:55,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████                                                                     | 139/1189 [10:03<1:08:49,  3.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:55,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▏                                                                    | 140/1189 [10:07<1:06:39,  3.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:55,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▏                                                                    | 140/1189 [10:07<1:06:39,  3.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:55,616 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▏                                                                    | 140/1189 [10:07<1:06:39,  3.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:59,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▏                                                                    | 141/1189 [10:10<1:04:00,  3.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:59,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▏                                                                    | 141/1189 [10:10<1:04:00,  3.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:43:59,106 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▎                                                                    | 142/1189 [10:13<1:00:46,  3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:02,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▎                                                                    | 142/1189 [10:13<1:00:46,  3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:02,250 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▎                                                                    | 142/1189 [10:13<1:00:46,  3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:05,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▎                                                                    | 142/1189 [10:13<1:00:46,  3.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:05,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▌                                                                      | 143/1189 [10:16<57:29,  3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:07,947 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▌                                                                      | 143/1189 [10:16<57:29,  3.30s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:07,947 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▋                                                                      | 144/1189 [10:19<53:56,  3.10s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:07,947 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▊                                                                      | 145/1189 [10:21<50:10,  2.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:10,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▊                                                                      | 145/1189 [10:21<50:10,  2.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:10,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▊                                                                      | 146/1189 [10:23<46:34,  2.68s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:12,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▊                                                                      | 146/1189 [10:23<46:34,  2.68s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:12,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▉                                                                      | 147/1189 [10:25<42:47,  2.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:14,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 12%|█████████▉                                                                      | 147/1189 [10:25<42:47,  2.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:14,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.444, 'learning_rate': 2.86e-06, 'epoch': 0.12}
+ 12%|█████████▉                                                                      | 148/1189 [10:27<38:52,  2.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:16,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                      | 149/1189 [10:28<34:52,  2.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:19,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                      | 149/1189 [10:28<34:52,  2.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:19,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                      | 150/1189 [10:30<33:53,  1.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:19,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                      | 150/1189 [10:30<33:53,  1.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:19,650 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                      | 150/1189 [10:30<33:53,  1.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:23,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                      | 150/1189 [10:30<33:53,  1.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:23,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                     | 151/1189 [10:36<52:40,  3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:23,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                     | 151/1189 [10:36<52:40,  3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:29,056 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                     | 151/1189 [10:36<52:40,  3.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:29,056 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|█████████▉                                                                    | 152/1189 [10:41<1:04:24,  3.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:29,056 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|█████████▉                                                                    | 152/1189 [10:41<1:04:24,  3.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:34,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|█████████▉                                                                    | 152/1189 [10:41<1:04:24,  3.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:34,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                    | 153/1189 [10:46<1:11:57,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:34,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                    | 153/1189 [10:46<1:11:57,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:39,554 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                    | 153/1189 [10:46<1:11:57,  4.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:39,554 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                    | 154/1189 [10:52<1:17:36,  4.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:39,554 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                    | 154/1189 [10:52<1:17:36,  4.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:44,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████                                                                    | 154/1189 [10:52<1:17:36,  4.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:44,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                   | 155/1189 [10:57<1:20:32,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:44,725 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                   | 155/1189 [10:57<1:20:32,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:49,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                   | 155/1189 [10:57<1:20:32,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:49,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                   | 156/1189 [11:02<1:22:42,  4.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:49,770 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▏                                                                   | 156/1189 [11:02<1:22:42,  4.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:54,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 157/1189 [11:07<1:23:28,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:54,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 157/1189 [11:07<1:23:28,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:54,882 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 157/1189 [11:07<1:23:28,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:59,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 157/1189 [11:07<1:23:28,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:59,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 158/1189 [11:12<1:24:07,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:44:59,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 158/1189 [11:12<1:24:07,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▎                                                                   | 158/1189 [11:12<1:24:07,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▍                                                                   | 159/1189 [11:17<1:25:01,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▍                                                                   | 159/1189 [11:17<1:25:01,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▍                                                                   | 159/1189 [11:17<1:25:01,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▍                                                                   | 160/1189 [11:22<1:24:57,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 13%|██████████▍                                                                   | 160/1189 [11:22<1:24:57,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▌                                                                   | 161/1189 [11:27<1:24:45,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▌                                                                   | 161/1189 [11:27<1:24:45,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3716, 'learning_rate': 3.1400000000000004e-06, 'epoch': 0.14}
+ 14%|██████████▌                                                                   | 161/1189 [11:27<1:24:45,  4.95s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▋                                                                   | 162/1189 [11:32<1:24:19,  4.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▋                                                                   | 162/1189 [11:32<1:24:19,  4.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▋                                                                   | 162/1189 [11:32<1:24:19,  4.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▋                                                                   | 163/1189 [11:36<1:23:51,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▋                                                                   | 163/1189 [11:36<1:23:51,  4.90s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▊                                                                   | 164/1189 [11:41<1:23:31,  4.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▊                                                                   | 164/1189 [11:41<1:23:31,  4.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0612, 'learning_rate': 3.2000000000000003e-06, 'epoch': 0.14}
+ 14%|██████████▊                                                                   | 164/1189 [11:41<1:23:31,  4.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▊                                                                   | 165/1189 [11:46<1:22:41,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▊                                                                   | 165/1189 [11:46<1:22:41,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▊                                                                   | 165/1189 [11:46<1:22:41,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▉                                                                   | 166/1189 [11:51<1:22:07,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|██████████▉                                                                   | 166/1189 [11:51<1:22:07,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3229, 'learning_rate': 3.2600000000000006e-06, 'epoch': 0.14}
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████                                                                   | 168/1189 [12:00<1:21:00,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████                                                                   | 168/1189 [12:00<1:21:00,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████                                                                   | 168/1189 [12:00<1:21:00,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████                                                                   | 169/1189 [12:05<1:20:40,  4.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████                                                                   | 169/1189 [12:05<1:20:40,  4.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████                                                                   | 169/1189 [12:05<1:20:40,  4.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▏                                                                  | 170/1189 [12:09<1:19:49,  4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▏                                                                  | 170/1189 [12:09<1:19:49,  4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▏                                                                  | 171/1189 [12:14<1:19:11,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▏                                                                  | 171/1189 [12:14<1:19:11,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3623, 'learning_rate': 3.3400000000000006e-06, 'epoch': 0.14}
+ 14%|███████████▏                                                                  | 171/1189 [12:14<1:19:11,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▎                                                                  | 172/1189 [12:19<1:18:47,  4.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▎                                                                  | 172/1189 [12:19<1:18:47,  4.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 14%|███████████▎                                                                  | 172/1189 [12:19<1:18:47,  4.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▎                                                                  | 173/1189 [12:23<1:18:19,  4.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▎                                                                  | 173/1189 [12:23<1:18:19,  4.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▎                                                                  | 173/1189 [12:23<1:18:19,  4.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▍                                                                  | 174/1189 [12:28<1:18:00,  4.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▍                                                                  | 174/1189 [12:28<1:18:00,  4.61s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▍                                                                  | 175/1189 [12:32<1:17:03,  4.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▍                                                                  | 175/1189 [12:32<1:17:03,  4.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3485, 'learning_rate': 3.4200000000000007e-06, 'epoch': 0.15}
+ 15%|███████████▍                                                                  | 175/1189 [12:32<1:17:03,  4.56s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▌                                                                  | 176/1189 [12:37<1:16:10,  4.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▌                                                                  | 176/1189 [12:37<1:16:10,  4.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▌                                                                  | 177/1189 [12:41<1:15:31,  4.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▌                                                                  | 177/1189 [12:41<1:15:31,  4.48s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4658, 'learning_rate': 3.46e-06, 'epoch': 0.15}
+ 15%|███████████▋                                                                  | 178/1189 [12:45<1:14:51,  4.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▋                                                                  | 178/1189 [12:45<1:14:51,  4.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2413, 'learning_rate': 3.48e-06, 'epoch': 0.15}
+ 15%|███████████▋                                                                  | 179/1189 [12:50<1:14:36,  4.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▋                                                                  | 179/1189 [12:50<1:14:36,  4.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2862, 'learning_rate': 3.5e-06, 'epoch': 0.15}
+ 15%|███████████▊                                                                  | 180/1189 [12:54<1:14:03,  4.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▊                                                                  | 180/1189 [12:54<1:14:03,  4.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4141, 'learning_rate': 3.52e-06, 'epoch': 0.15}
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3774, 'learning_rate': 3.54e-06, 'epoch': 0.15}
+ 15%|███████████▉                                                                  | 182/1189 [13:03<1:12:36,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|███████████▉                                                                  | 182/1189 [13:03<1:12:36,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4269, 'learning_rate': 3.5600000000000002e-06, 'epoch': 0.15}
+ 15%|████████████                                                                  | 183/1189 [13:07<1:11:48,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|████████████                                                                  | 183/1189 [13:07<1:11:48,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2487, 'learning_rate': 3.58e-06, 'epoch': 0.15}
+ 15%|████████████                                                                  | 184/1189 [13:11<1:10:41,  4.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 15%|████████████                                                                  | 184/1189 [13:11<1:10:41,  4.22s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3575, 'learning_rate': 3.6000000000000003e-06, 'epoch': 0.15}
+ 16%|████████████▏                                                                 | 185/1189 [13:15<1:09:34,  4.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▏                                                                 | 185/1189 [13:15<1:09:34,  4.16s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1065, 'learning_rate': 3.62e-06, 'epoch': 0.16}
+ 16%|████████████▏                                                                 | 186/1189 [13:19<1:08:07,  4.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▏                                                                 | 186/1189 [13:19<1:08:07,  4.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2595, 'learning_rate': 3.6400000000000003e-06, 'epoch': 0.16}
+ 16%|████████████▎                                                                 | 187/1189 [13:23<1:06:45,  4.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▎                                                                 | 187/1189 [13:23<1:06:45,  4.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.5998, 'learning_rate': 3.66e-06, 'epoch': 0.16}
+ 16%|████████████▎                                                                 | 188/1189 [13:26<1:04:50,  3.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▎                                                                 | 188/1189 [13:26<1:04:50,  3.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3651, 'learning_rate': 3.6800000000000003e-06, 'epoch': 0.16}
+ 16%|████████████▍                                                                 | 189/1189 [13:30<1:02:55,  3.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▍                                                                 | 189/1189 [13:30<1:02:55,  3.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:47:23,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:47:23,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1233, 'learning_rate': 3.7200000000000004e-06, 'epoch': 0.16}
+ 16%|████████████▊                                                                   | 191/1189 [13:36<57:30,  3.46s/it]g-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▊                                                                   | 191/1189 [13:36<57:30,  3.46s/it]g-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:47:29,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:47:29,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.569, 'learning_rate': 3.7600000000000004e-06, 'epoch': 0.16}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:47:29,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:45:04,893 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|████████████▉                                                                   | 193/1189 [13:42<51:32,  3.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:33,620 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|█████████████                                                                   | 194/1189 [13:44<48:13,  2.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:35,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|█████████████                                                                   | 194/1189 [13:44<48:13,  2.91s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:35,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|█████████████                                                                   | 195/1189 [13:46<45:07,  2.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:38,214 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|█████████████                                                                   | 195/1189 [13:46<45:07,  2.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:38,214 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|█████████████▏                                                                  | 196/1189 [13:49<42:15,  2.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:40,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 16%|█████████████▏                                                                  | 196/1189 [13:49<42:15,  2.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:40,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                  | 197/1189 [13:50<38:58,  2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:42,048 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                  | 197/1189 [13:50<38:58,  2.36s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:42,048 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                  | 198/1189 [13:52<35:52,  2.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:43,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                  | 198/1189 [13:52<35:52,  2.17s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:43,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▍                                                                  | 200/1189 [13:56<31:42,  1.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:45,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▍                                                                  | 200/1189 [13:56<31:42,  1.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:45,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.588, 'learning_rate': 3.900000000000001e-06, 'epoch': 0.17}
+ 17%|█████████████▍                                                                  | 200/1189 [13:56<31:42,  1.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▌                                                                  | 201/1189 [14:01<49:10,  2.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▌                                                                  | 201/1189 [14:01<49:10,  2.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3844, 'learning_rate': 3.94e-06, 'epoch': 0.17}
+ 17%|█████████████▌                                                                  | 201/1189 [14:01<49:10,  2.99s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                | 202/1189 [14:06<1:00:12,  3.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                | 202/1189 [14:06<1:00:12,  3.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                | 203/1189 [14:11<1:06:57,  4.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▎                                                                | 203/1189 [14:11<1:06:57,  4.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2721, 'learning_rate': 3.980000000000001e-06, 'epoch': 0.17}
+ 17%|█████████████▍                                                                | 204/1189 [14:16<1:11:18,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▍                                                                | 204/1189 [14:16<1:11:18,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2226, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.17}
+ 17%|█████████████▍                                                                | 205/1189 [14:21<1:14:23,  4.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▍                                                                | 205/1189 [14:21<1:14:23,  4.54s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1393, 'learning_rate': 4.0200000000000005e-06, 'epoch': 0.17}
+ 17%|█████████████▌                                                                | 206/1189 [14:26<1:16:18,  4.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▌                                                                | 206/1189 [14:26<1:16:18,  4.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2505, 'learning_rate': 4.04e-06, 'epoch': 0.17}
+ 17%|█████████████▌                                                                | 207/1189 [14:31<1:18:03,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▌                                                                | 207/1189 [14:31<1:18:03,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3549, 'learning_rate': 4.060000000000001e-06, 'epoch': 0.17}
+ 17%|█████████████▋                                                                | 208/1189 [14:36<1:18:55,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▋                                                                | 208/1189 [14:36<1:18:55,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.5281, 'learning_rate': 4.08e-06, 'epoch': 0.17}
+ 17%|█████████████▋                                                                | 208/1189 [14:36<1:18:55,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▋                                                                | 208/1189 [14:36<1:18:55,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4086, 'learning_rate': 4.1e-06, 'epoch': 0.18}
+ 17%|█████████████▋                                                                | 208/1189 [14:36<1:18:55,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 17%|█████████████▋                                                                | 208/1189 [14:36<1:18:55,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▊                                                                | 210/1189 [14:46<1:19:22,  4.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▊                                                                | 210/1189 [14:46<1:19:22,  4.86s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▊                                                                | 211/1189 [14:51<1:19:31,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▊                                                                | 211/1189 [14:51<1:19:31,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2584, 'learning_rate': 4.14e-06, 'epoch': 0.18}
+ 18%|█████████████▊                                                                | 211/1189 [14:51<1:19:31,  4.88s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▉                                                                | 212/1189 [14:56<1:19:01,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▉                                                                | 212/1189 [14:56<1:19:01,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▉                                                                | 212/1189 [14:56<1:19:01,  4.85s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▉                                                                | 213/1189 [15:00<1:18:39,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▉                                                                | 213/1189 [15:00<1:18:39,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|█████████████▉                                                                | 213/1189 [15:00<1:18:39,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████                                                                | 214/1189 [15:05<1:17:52,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████                                                                | 214/1189 [15:05<1:17:52,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████                                                                | 214/1189 [15:05<1:17:52,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████                                                                | 215/1189 [15:10<1:17:46,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████                                                                | 215/1189 [15:10<1:17:46,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████                                                                | 215/1189 [15:10<1:17:46,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▏                                                               | 216/1189 [15:15<1:17:05,  4.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▏                                                               | 216/1189 [15:15<1:17:05,  4.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▏                                                               | 216/1189 [15:15<1:17:05,  4.75s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▏                                                               | 217/1189 [15:19<1:16:26,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▏                                                               | 217/1189 [15:19<1:16:26,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▎                                                               | 218/1189 [15:24<1:16:14,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▎                                                               | 218/1189 [15:24<1:16:14,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2388, 'learning_rate': 4.2800000000000005e-06, 'epoch': 0.18}
+ 18%|██████████████▎                                                               | 218/1189 [15:24<1:16:14,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▎                                                               | 219/1189 [15:29<1:15:53,  4.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▎                                                               | 219/1189 [15:29<1:15:53,  4.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 18%|██████████████▎                                                               | 219/1189 [15:29<1:15:53,  4.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▍                                                               | 220/1189 [15:33<1:15:18,  4.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▍                                                               | 220/1189 [15:33<1:15:18,  4.66s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▍                                                               | 221/1189 [15:38<1:14:15,  4.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▍                                                               | 221/1189 [15:38<1:14:15,  4.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2568, 'learning_rate': 4.34e-06, 'epoch': 0.19}
+ 19%|██████████████▌                                                               | 222/1189 [15:42<1:13:46,  4.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▌                                                               | 222/1189 [15:42<1:13:46,  4.58s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.566, 'learning_rate': 4.360000000000001e-06, 'epoch': 0.19}
+ 19%|██████████████▋                                                               | 223/1189 [15:47<1:13:11,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▋                                                               | 223/1189 [15:47<1:13:11,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3124, 'learning_rate': 4.38e-06, 'epoch': 0.19}
+ 19%|██████████████▋                                                               | 223/1189 [15:47<1:13:11,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▋                                                               | 223/1189 [15:47<1:13:11,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3912, 'learning_rate': 4.4e-06, 'epoch': 0.19}
+ 19%|██████████████▋                                                               | 223/1189 [15:47<1:13:11,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▋                                                               | 223/1189 [15:47<1:13:11,  4.55s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▊                                                               | 225/1189 [15:55<1:11:37,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▊                                                               | 225/1189 [15:55<1:11:37,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▊                                                               | 226/1189 [16:00<1:11:16,  4.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▊                                                               | 226/1189 [16:00<1:11:16,  4.44s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.109, 'learning_rate': 4.440000000000001e-06, 'epoch': 0.19}
+ 19%|██████████████▉                                                               | 227/1189 [16:04<1:10:19,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▉                                                               | 227/1189 [16:04<1:10:19,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2459, 'learning_rate': 4.4600000000000005e-06, 'epoch': 0.19}
+ 19%|██████████████▉                                                               | 227/1189 [16:04<1:10:19,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▉                                                               | 228/1189 [16:08<1:10:01,  4.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|██████████████▉                                                               | 228/1189 [16:08<1:10:01,  4.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|███████████████                                                               | 229/1189 [16:13<1:09:07,  4.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|███████████████                                                               | 229/1189 [16:13<1:09:07,  4.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1719, 'learning_rate': 4.5e-06, 'epoch': 0.19}
+ 19%|███████████████                                                               | 230/1189 [16:17<1:08:28,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|███████████████                                                               | 230/1189 [16:17<1:08:28,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3651, 'learning_rate': 4.520000000000001e-06, 'epoch': 0.19}
+ 19%|███████████████▏                                                              | 231/1189 [16:21<1:07:51,  4.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 19%|███████████████▏                                                              | 231/1189 [16:21<1:07:51,  4.25s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3436, 'learning_rate': 4.540000000000001e-06, 'epoch': 0.19}
+ 20%|███████████████▏                                                              | 232/1189 [16:25<1:06:59,  4.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▏                                                              | 232/1189 [16:25<1:06:59,  4.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3034, 'learning_rate': 4.56e-06, 'epoch': 0.2}
+ 20%|███████████████▎                                                              | 233/1189 [16:29<1:06:45,  4.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▎                                                              | 233/1189 [16:29<1:06:45,  4.19s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2627, 'learning_rate': 4.58e-06, 'epoch': 0.2}
+ 20%|███████████████▎                                                              | 234/1189 [16:33<1:05:33,  4.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▎                                                              | 234/1189 [16:33<1:05:33,  4.12s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2283, 'learning_rate': 4.600000000000001e-06, 'epoch': 0.2}
+ 20%|███████████████▍                                                              | 235/1189 [16:37<1:04:34,  4.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▍                                                              | 235/1189 [16:37<1:04:34,  4.06s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2872, 'learning_rate': 4.620000000000001e-06, 'epoch': 0.2}
+ 20%|███████████████▍                                                              | 236/1189 [16:41<1:03:39,  4.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▍                                                              | 236/1189 [16:41<1:03:39,  4.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4579, 'learning_rate': 4.6400000000000005e-06, 'epoch': 0.2}
+ 20%|███████████████▌                                                              | 237/1189 [16:45<1:02:45,  3.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▌                                                              | 237/1189 [16:45<1:02:45,  3.96s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1832, 'learning_rate': 4.66e-06, 'epoch': 0.2}
+ 20%|███████████████▌                                                              | 238/1189 [16:49<1:01:36,  3.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▌                                                              | 238/1189 [16:49<1:01:36,  3.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3195, 'learning_rate': 4.680000000000001e-06, 'epoch': 0.2}
+ 20%|███████████████▌                                                              | 238/1189 [16:49<1:01:36,  3.89s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|███████████████▋                                                              | 239/1189 [16:52<1:00:20,  3.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:50:46,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:50:46,350 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.341, 'learning_rate': 4.7200000000000005e-06, 'epoch': 0.2}
+ 20%|████████████████▏                                                               | 241/1189 [16:59<57:37,  3.65s/it]g-point operations will not be computed-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|████████████████▏                                                               | 241/1189 [16:59<57:37,  3.65s/it]g-point operations will not be computed-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4422, 'learning_rate': 4.74e-06, 'epoch': 0.2}
+ 20%|████████████████▏                                                               | 241/1189 [16:59<57:37,  3.65s/it]g-point operations will not be computed-28 16:47:48,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|████████████████▎                                                               | 242/1189 [17:02<55:30,  3.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:50:54,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|████████████████▎                                                               | 243/1189 [17:06<53:26,  3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:50:54,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 20%|████████████████▎                                                               | 243/1189 [17:06<53:26,  3.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:50:54,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:50:58,967 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:50:54,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:50:58,967 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:50:54,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3036, 'learning_rate': 4.800000000000001e-06, 'epoch': 0.21}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:50:58,967 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:50:54,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▍                                                               | 245/1189 [17:11<47:40,  3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:02,782 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                               | 246/1189 [17:13<44:14,  2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:04,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                               | 246/1189 [17:13<44:14,  2.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:04,974 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                               | 247/1189 [17:15<40:41,  2.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:06,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                               | 247/1189 [17:15<40:41,  2.59s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:06,921 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.818, 'learning_rate': 4.86e-06, 'epoch': 0.21}
+ 21%|████████████████▋                                                               | 248/1189 [17:17<36:55,  2.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:08,572 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▋                                                               | 248/1189 [17:17<36:55,  2.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:08,572 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▊                                                               | 249/1189 [17:19<33:07,  2.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▊                                                               | 249/1189 [17:19<33:07,  2.11s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▊                                                               | 250/1189 [17:21<31:45,  2.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▊                                                               | 250/1189 [17:21<31:45,  2.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▉                                                               | 251/1189 [17:26<47:06,  3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▉                                                               | 251/1189 [17:26<47:06,  3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3932, 'learning_rate': 4.94e-06, 'epoch': 0.21}
+ 21%|████████████████▉                                                               | 251/1189 [17:26<47:06,  3.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▉                                                               | 252/1189 [17:31<57:00,  3.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▉                                                               | 252/1189 [17:31<57:00,  3.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                             | 253/1189 [17:36<1:03:15,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                             | 253/1189 [17:36<1:03:15,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3453, 'learning_rate': 4.980000000000001e-06, 'epoch': 0.21}
+ 21%|████████████████▌                                                             | 253/1189 [17:36<1:03:15,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▌                                                             | 253/1189 [17:36<1:03:15,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.077, 'learning_rate': 5e-06, 'epoch': 0.21}
+ 21%|████████████████▌                                                             | 253/1189 [17:36<1:03:15,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▋                                                             | 255/1189 [17:46<1:10:28,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▋                                                             | 255/1189 [17:46<1:10:28,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0752, 'learning_rate': 5.02e-06, 'epoch': 0.21}
+ 21%|████████████████▋                                                             | 255/1189 [17:46<1:10:28,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 21%|████████████████▋                                                             | 255/1189 [17:46<1:10:28,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2665, 'learning_rate': 5.04e-06, 'epoch': 0.22}
+ 21%|████████████████▋                                                             | 255/1189 [17:46<1:10:28,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|████████████████▊                                                             | 257/1189 [17:56<1:13:33,  4.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|████████████████▊                                                             | 257/1189 [17:56<1:13:33,  4.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0714, 'learning_rate': 5.060000000000001e-06, 'epoch': 0.22}
+ 22%|████████████████▉                                                             | 258/1189 [18:01<1:13:52,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|████████████████▉                                                             | 258/1189 [18:01<1:13:52,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.5038, 'learning_rate': 5.0800000000000005e-06, 'epoch': 0.22}
+ 22%|████████████████▉                                                             | 258/1189 [18:01<1:13:52,  4.76s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|████████████████▉                                                             | 259/1189 [18:06<1:14:20,  4.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|████████████████▉                                                             | 259/1189 [18:06<1:14:20,  4.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████                                                             | 260/1189 [18:10<1:14:48,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████                                                             | 260/1189 [18:10<1:14:48,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2644, 'learning_rate': 5.12e-06, 'epoch': 0.22}
+ 22%|█████████████████                                                             | 261/1189 [18:15<1:14:40,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████                                                             | 261/1189 [18:15<1:14:40,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2386, 'learning_rate': 5.140000000000001e-06, 'epoch': 0.22}
+ 22%|█████████████████▏                                                            | 262/1189 [18:20<1:14:26,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████▏                                                            | 262/1189 [18:20<1:14:26,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2262, 'learning_rate': 5.1600000000000006e-06, 'epoch': 0.22}
+ 22%|█████████████████▎                                                            | 263/1189 [18:25<1:14:27,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████▎                                                            | 263/1189 [18:25<1:14:27,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3571, 'learning_rate': 5.18e-06, 'epoch': 0.22}
+ 22%|█████████████████▎                                                            | 264/1189 [18:30<1:14:09,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████▎                                                            | 264/1189 [18:30<1:14:09,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3429, 'learning_rate': 5.2e-06, 'epoch': 0.22}
+ 22%|█████████████████▍                                                            | 265/1189 [18:34<1:13:31,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████▍                                                            | 265/1189 [18:34<1:13:31,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2446, 'learning_rate': 5.220000000000001e-06, 'epoch': 0.22}
+ 22%|█████████████████▍                                                            | 266/1189 [18:39<1:12:44,  4.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████▍                                                            | 266/1189 [18:39<1:12:44,  4.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3092, 'learning_rate': 5.240000000000001e-06, 'epoch': 0.22}
+ 22%|█████████████████▌                                                            | 267/1189 [18:44<1:12:18,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 22%|█████████████████▌                                                            | 267/1189 [18:44<1:12:18,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1785, 'learning_rate': 5.2600000000000005e-06, 'epoch': 0.22}
+ 23%|█████████████████▌                                                            | 268/1189 [18:48<1:12:07,  4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▌                                                            | 268/1189 [18:48<1:12:07,  4.70s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2174, 'learning_rate': 5.28e-06, 'epoch': 0.23}
+ 23%|█████████████████▋                                                            | 269/1189 [18:53<1:11:39,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|███████████���█████▋                                                            | 269/1189 [18:53<1:11:39,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2146, 'learning_rate': 5.300000000000001e-06, 'epoch': 0.23}
+ 23%|█████████████████▋                                                            | 270/1189 [18:57<1:11:05,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▋                                                            | 270/1189 [18:57<1:11:05,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1565, 'learning_rate': 5.320000000000001e-06, 'epoch': 0.23}
+ 23%|█████████████████▊                                                            | 271/1189 [19:02<1:10:39,  4.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▊                                                            | 271/1189 [19:02<1:10:39,  4.62s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4837, 'learning_rate': 5.3400000000000005e-06, 'epoch': 0.23}
+ 23%|█████████████████▊                                                            | 272/1189 [19:07<1:09:53,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▊                                                            | 272/1189 [19:07<1:09:53,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1145, 'learning_rate': 5.36e-06, 'epoch': 0.23}
+ 23%|█████████████████▊                                                            | 272/1189 [19:07<1:09:53,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▊                                                            | 272/1189 [19:07<1:09:53,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2404, 'learning_rate': 5.380000000000001e-06, 'epoch': 0.23}
+ 23%|█████████████████▊                                                            | 272/1189 [19:07<1:09:53,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▉                                                            | 274/1189 [19:15<1:08:25,  4.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▉                                                            | 274/1189 [19:15<1:08:25,  4.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|█████████████████▉                                                            | 274/1189 [19:15<1:08:25,  4.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████                                                            | 275/1189 [19:20<1:08:01,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████                                                            | 275/1189 [19:20<1:08:01,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|███████████████���██                                                            | 275/1189 [19:20<1:08:01,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████                                                            | 276/1189 [19:24<1:07:15,  4.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████                                                            | 276/1189 [19:24<1:07:15,  4.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████                                                            | 276/1189 [19:24<1:07:15,  4.42s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▏                                                           | 277/1189 [19:28<1:06:44,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▏                                                           | 277/1189 [19:28<1:06:44,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▏                                                           | 277/1189 [19:28<1:06:44,  4.39s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▏                                                           | 278/1189 [19:33<1:05:43,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▏                                                           | 278/1189 [19:33<1:05:43,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▏                                                           | 278/1189 [19:33<1:05:43,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▎                                                           | 279/1189 [19:37<1:04:56,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▎                                                           | 279/1189 [19:37<1:04:56,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 23%|██████████████████▎                                                           | 279/1189 [19:37<1:04:56,  4.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▎                                                           | 280/1189 [19:41<1:04:17,  4.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▎                                                           | 280/1189 [19:41<1:04:17,  4.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▎                                                           | 280/1189 [19:41<1:04:17,  4.24s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▍                                                           | 281/1189 [19:45<1:03:42,  4.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▍                                                           | 281/1189 [19:45<1:03:42,  4.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▍                                                           | 281/1189 [19:45<1:03:42,  4.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▍                                                           | 282/1189 [19:49<1:02:40,  4.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▍                                                           | 282/1189 [19:49<1:02:40,  4.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▍                                                           | 282/1189 [19:49<1:02:40,  4.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▌                                                           | 283/1189 [19:53<1:01:47,  4.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▌                                                           | 283/1189 [19:53<1:01:47,  4.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▌                                                           | 283/1189 [19:53<1:01:47,  4.09s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▋                                                           | 284/1189 [19:57<1:01:06,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▋                                                           | 284/1189 [19:57<1:01:06,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|██████████████████▋                                                           | 284/1189 [19:57<1:01:06,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▏                                                            | 285/1189 [20:01<59:58,  3.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▏                                                            | 285/1189 [20:01<59:58,  3.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▏                                                            | 285/1189 [20:01<59:58,  3.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▏                                                            | 286/1189 [20:05<58:57,  3.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▏                                                            | 286/1189 [20:05<58:57,  3.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▏                                                            | 286/1189 [20:05<58:57,  3.92s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▎                                                            | 287/1189 [20:08<57:39,  3.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:02,321 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:02,321 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2194, 'learning_rate': 5.68e-06, 'epoch': 0.24}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:02,321 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▍                                                            | 289/1189 [20:15<55:08,  3.68s/it]g-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▍                                                            | 289/1189 [20:15<55:08,  3.68s/it]g-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▍                                                            | 289/1189 [20:15<55:08,  3.68s/it]g-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 24%|███████████████████▌                                                            | 290/1189 [20:19<53:43,  3.59s/it]g-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:12,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:12,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3521, 'learning_rate': 5.74e-06, 'epoch': 0.24}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:12,325 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:51:10,003 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|███████████████████▋                                                            | 292/1189 [20:25<49:02,  3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|███████████████████▋                                                            | 292/1189 [20:25<49:02,  3.28s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|███████████████████▋                                                            | 293/1189 [20:27<45:53,  3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|███████████████████▋                                                            | 293/1189 [20:27<45:53,  3.07s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:20,223 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:20,223 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:22,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:22,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:24,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:24,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:26,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:26,152 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:27,684 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:27,684 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:30,927 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:30,927 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.7455, 'learning_rate': 5.92e-06, 'epoch': 0.25}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:30,927 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:54:30,927 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▎                                                           | 301/1189 [20:46<42:56,  2.90s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▎                                                           | 301/1189 [20:46<42:56,  2.90s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▎                                                           | 301/1189 [20:46<42:56,  2.90s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▎                                                           | 302/1189 [20:51<52:41,  3.56s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▎                                                           | 302/1189 [20:51<52:41,  3.56s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▎                                                           | 302/1189 [20:51<52:41,  3.56s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▍                                                           | 303/1189 [20:56<59:12,  4.01s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 25%|████████████████████▍                                                           | 303/1189 [20:56<59:12,  4.01s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|███████████████████▉                                                          | 304/1189 [21:01<1:03:55,  4.33s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|███████████████████▉                                                          | 304/1189 [21:01<1:03:55,  4.33s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 3.8708, 'learning_rate': 6e-06, 'epoch': 0.26}
+ 26%|███████████████████▉                                                          | 304/1189 [21:01<1:03:55,  4.33s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████                                                          | 305/1189 [21:06<1:06:56,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████                                                          | 305/1189 [21:06<1:06:56,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████                                                          | 305/1189 [21:06<1:06:56,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████                                                          | 305/1189 [21:06<1:06:56,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0563, 'learning_rate': 6.040000000000001e-06, 'epoch': 0.26}
+ 26%|████████████████████                                                          | 305/1189 [21:06<1:06:56,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▏                                                         | 307/1189 [21:16<1:09:45,  4.75s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▏                                                         | 307/1189 [21:16<1:09:45,  4.75s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▏                                                         | 307/1189 [21:16<1:09:45,  4.75s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▏                                                         | 308/1189 [21:21<1:10:10,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▏                                                         | 308/1189 [21:21<1:10:10,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▎                                                         | 309/1189 [21:26<1:10:20,  4.80s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▎                                                         | 309/1189 [21:26<1:10:20,  4.80s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 3.9649, 'learning_rate': 6.1e-06, 'epoch': 0.26}
+ 26%|████████████████████▎                                                         | 309/1189 [21:26<1:10:20,  4.80s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▎                                                         | 310/1189 [21:30<1:10:37,  4.82s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▎                                                         | 310/1189 [21:30<1:10:37,  4.82s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▍                                                         | 311/1189 [21:35<1:10:48,  4.84s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▍                                                         | 311/1189 [21:35<1:10:48,  4.84s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1021, 'learning_rate': 6.1400000000000005e-06, 'epoch': 0.26}
+ 26%|████████████████████▍                                                         | 312/1189 [21:40<1:10:39,  4.83s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▍                                                         | 312/1189 [21:40<1:10:39,  4.83s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1044, 'learning_rate': 6.16e-06, 'epoch': 0.26}
+ 26%|████████████████████▍                                                         | 312/1189 [21:40<1:10:39,  4.83s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▌                                                         | 313/1189 [21:45<1:10:12,  4.81s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▌                                                         | 313/1189 [21:45<1:10:12,  4.81s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▌                                                         | 314/1189 [21:50<1:09:49,  4.79s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▌                                                         | 314/1189 [21:50<1:09:49,  4.79s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1376, 'learning_rate': 6.200000000000001e-06, 'epoch': 0.26}
+ 26%|████████████████████▋                                                         | 315/1189 [21:54<1:09:39,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▋                                                         | 315/1189 [21:54<1:09:39,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0619, 'learning_rate': 6.220000000000001e-06, 'epoch': 0.26}
+ 26%|████████████████████▋                                                         | 315/1189 [21:54<1:09:39,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 26%|████████████████████▋                                                         | 315/1189 [21:54<1:09:39,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▋                                                         | 316/1189 [21:59<1:09:33,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▋                                                         | 316/1189 [21:59<1:09:33,  4.78s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▊                                                         | 317/1189 [22:04<1:09:35,  4.79s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▊                                                         | 317/1189 [22:04<1:09:35,  4.79s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3625, 'learning_rate': 6.26e-06, 'epoch': 0.27}
+ 27%|████████████████████▊                                                         | 318/1189 [22:09<1:09:18,  4.77s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▊                                                         | 318/1189 [22:09<1:09:18,  4.77s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2863, 'learning_rate': 6.280000000000001e-06, 'epoch': 0.27}
+ 27%|████████████████████▊                                                         | 318/1189 [22:09<1:09:18,  4.77s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▉                                                         | 319/1189 [22:14<1:09:10,  4.77s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▉                                                         | 319/1189 [22:14<1:09:10,  4.77s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▉                                                         | 320/1189 [22:18<1:08:35,  4.74s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|████████████████████▉                                                         | 320/1189 [22:18<1:08:35,  4.74s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2889, 'learning_rate': 6.3200000000000005e-06, 'epoch': 0.27}
+ 27%|█████████████████████                                                         | 321/1189 [22:23<1:08:09,  4.71s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████                                                         | 321/1189 [22:23<1:08:09,  4.71s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:56:18,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:56:18,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1358, 'learning_rate': 6.360000000000001e-06, 'epoch': 0.27}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:56:18,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:56:18,040 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████▏                                                        | 323/1189 [22:32<1:07:06,  4.65s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████▏                                                        | 323/1189 [22:32<1:07:06,  4.65s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████▎                                                        | 324/1189 [22:37<1:06:37,  4.62s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████▎                                                        | 324/1189 [22:37<1:06:37,  4.62s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2859, 'learning_rate': 6.4000000000000006e-06, 'epoch': 0.27}
+ 27%|█████████████████████▎                                                        | 325/1189 [22:41<1:05:52,  4.57s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████▎                                                        | 325/1189 [22:41<1:05:52,  4.57s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2048, 'learning_rate': 6.42e-06, 'epoch': 0.27}
+ 27%|█████████████████████▍                                                        | 326/1189 [22:45<1:05:14,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 27%|█████████████████████▍                                                        | 326/1189 [22:45<1:05:14,  4.54s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3831, 'learning_rate': 6.440000000000001e-06, 'epoch': 0.27}
+ 28%|██���██████████████████▍                                                        | 327/1189 [22:50<1:04:49,  4.51s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▍                                                        | 327/1189 [22:50<1:04:49,  4.51s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3815, 'learning_rate': 6.460000000000001e-06, 'epoch': 0.27}
+ 28%|█████████████████████▍                                                        | 327/1189 [22:50<1:04:49,  4.51s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▌                                                        | 328/1189 [22:54<1:04:22,  4.49s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▌                                                        | 328/1189 [22:54<1:04:22,  4.49s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▌                                                        | 329/1189 [22:59<1:03:43,  4.45s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▌                                                        | 329/1189 [22:59<1:03:43,  4.45s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1473, 'learning_rate': 6.5000000000000004e-06, 'epoch': 0.28}
+ 28%|█████████████████████▋                                                        | 330/1189 [23:03<1:03:00,  4.40s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▋                                                        | 330/1189 [23:03<1:03:00,  4.40s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3793, 'learning_rate': 6.520000000000001e-06, 'epoch': 0.28}
+ 28%|█████████████████████▋                                                        | 331/1189 [23:07<1:02:21,  4.36s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▋                                                        | 331/1189 [23:07<1:02:21,  4.36s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0783, 'learning_rate': 6.540000000000001e-06, 'epoch': 0.28}
+ 28%|█████████████████████▊                                                        | 332/1189 [23:11<1:01:41,  4.32s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▊                                                        | 332/1189 [23:11<1:01:41,  4.32s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2146, 'learning_rate': 6.560000000000001e-06, 'epoch': 0.28}
+ 28%|█████████████████████▊                                                        | 333/1189 [23:16<1:00:53,  4.27s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▊                                                        | 333/1189 [23:16<1:00:53,  4.27s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2755, 'learning_rate': 6.5800000000000005e-06, 'epoch': 0.28}
+ 28%|█████████████████████▉                                                        | 334/1189 [23:20<1:00:20,  4.23s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|█████████████████████▉                                                        | 334/1189 [23:20<1:00:20,  4.23s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0445, 'learning_rate': 6.600000000000001e-06, 'epoch': 0.28}
+ 28%|██████████████████████▌                                                         | 335/1189 [23:24<59:22,  4.17s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▌                                                         | 335/1189 [23:24<59:22,  4.17s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2928, 'learning_rate': 6.620000000000001e-06, 'epoch': 0.28}
+ 28%|██████████████████████▌                                                         | 336/1189 [23:28<58:21,  4.11s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▌                                                         | 336/1189 [23:28<58:21,  4.11s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3136, 'learning_rate': 6.640000000000001e-06, 'epoch': 0.28}
+ 28%|██████████████████████▌                                                         | 336/1189 [23:28<58:21,  4.11s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▋                                                         | 337/1189 [23:32<57:04,  4.02s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▋                                                         | 337/1189 [23:32<57:04,  4.02s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▋                                                         | 337/1189 [23:32<57:04,  4.02s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▋                                                         | 338/1189 [23:35<56:01,  3.95s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▋                                                         | 338/1189 [23:35<56:01,  3.95s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 28%|██████████████████████▋                                                         | 338/1189 [23:35<56:01,  3.95s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|██████████████████████▊                                                         | 339/1189 [23:39<54:48,  3.87s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:57:33,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:57:33,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2516, 'learning_rate': 6.720000000000001e-06, 'epoch': 0.29}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:57:33,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|██████████████████████▉                                                         | 341/1189 [23:46<51:27,  3.64s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|██████████████████████▉                                                         | 341/1189 [23:46<51:27,  3.64s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|██████████████████████▉                                                         | 341/1189 [23:46<51:27,  3.64s/it]g-point operations will not be computed-28 16:54:16,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████                                                         | 342/1189 [23:49<49:15,  3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:41,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████                                                         | 342/1189 [23:49<49:15,  3.49s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:41,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████                                                         | 343/1189 [23:52<46:45,  3.32s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:41,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:57:45,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:57:41,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 16:57:45,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:57:41,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4456, 'learning_rate': 6.800000000000001e-06, 'epoch': 0.29}
+[WARNING|modeling_utils.py:388] 2022-02-28 16:57:45,276 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:57:41,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▏                                                        | 345/1189 [23:57<41:20,  2.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:49,000 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▏                                                        | 345/1189 [23:57<41:20,  2.94s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:49,000 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▎                                                        | 346/1189 [23:59<38:31,  2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:51,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▎                                                        | 346/1189 [23:59<38:31,  2.74s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:51,103 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▎                                                        | 347/1189 [24:01<35:12,  2.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:52,967 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▎                                                        | 347/1189 [24:01<35:12,  2.51s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:52,967 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.6718, 'learning_rate': 6.88e-06, 'epoch': 0.29}
+ 29%|███████████████████████▍                                                        | 349/1189 [24:05<28:30,  2.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:54,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▍                                                        | 349/1189 [24:05<28:30,  2.04s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:54,584 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▌                                                        | 350/1189 [24:07<27:31,  1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:55,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▌                                                        | 350/1189 [24:07<27:31,  1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 29%|███████████████████████▌                                                        | 350/1189 [24:07<27:31,  1.97s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▌                                                        | 351/1189 [24:12<41:34,  2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▌                                                        | 351/1189 [24:12<41:34,  2.98s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▋                                                        | 352/1189 [24:17<50:34,  3.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▋                                                        | 352/1189 [24:17<50:34,  3.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.157, 'learning_rate': 6.96e-06, 'epoch': 0.3}
+ 30%|███████████████████████▋                                                        | 352/1189 [24:17<50:34,  3.63s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▊                                                        | 353/1189 [24:22<56:28,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▊                                                        | 353/1189 [24:22<56:28,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▊                                                        | 353/1189 [24:22<56:28,  4.05s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▏                                                      | 354/1189 [24:27<1:00:18,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▏                                                      | 354/1189 [24:27<1:00:18,  4.33s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▎                                                      | 355/1189 [24:32<1:02:48,  4.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▎                                                      | 355/1189 [24:32<1:02:48,  4.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1486, 'learning_rate': 7.0200000000000006e-06, 'epoch': 0.3}
+ 30%|███████████████████████▎                                                      | 355/1189 [24:32<1:02:48,  4.52s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▎                                                      | 356/1189 [24:37<1:04:29,  4.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▎                                                      | 356/1189 [24:37<1:04:29,  4.65s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▍                                                      | 357/1189 [24:42<1:05:24,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▍                                                      | 357/1189 [24:42<1:05:24,  4.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3384, 'learning_rate': 7.06e-06, 'epoch': 0.3}
+ 30%|███████████████████████▍                                                      | 358/1189 [24:47<1:06:01,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▍                                                      | 358/1189 [24:47<1:06:01,  4.77s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0699, 'learning_rate': 7.08e-06, 'epoch': 0.3}
+ 30%|███████████████████████▌                                                      | 359/1189 [24:52<1:06:32,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▌                                                      | 359/1189 [24:52<1:06:32,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.238, 'learning_rate': 7.100000000000001e-06, 'epoch': 0.3}
+ 30%|███████████████████████▌                                                      | 360/1189 [24:56<1:06:52,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▌                                                      | 360/1189 [24:56<1:06:52,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3706, 'learning_rate': 7.1200000000000004e-06, 'epoch': 0.3}
+ 30%|███████████████████████▌                                                      | 360/1189 [24:56<1:06:52,  4.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▋                                                      | 361/1189 [25:01<1:06:40,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▋                                                      | 361/1189 [25:01<1:06:40,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▋                                                      | 362/1189 [25:06<1:06:26,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 30%|███████████████████████▋                                                      | 362/1189 [25:06<1:06:26,  4.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0899, 'learning_rate': 7.16e-06, 'epoch': 0.3}
+ 31%|███████████████████████▊                                                      | 363/1189 [25:11<1:06:12,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|███████████████████████▊                                                      | 363/1189 [25:11<1:06:12,  4.81s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1032, 'learning_rate': 7.180000000000001e-06, 'epoch': 0.31}
+ 31%|███████████████████████▉                                                      | 364/1189 [25:16<1:06:22,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|███████████████████████▉                                                      | 364/1189 [25:16<1:06:22,  4.83s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2998, 'learning_rate': 7.2000000000000005e-06, 'epoch': 0.31}
+ 31%|███████████████████████▉                                                      | 365/1189 [25:20<1:05:49,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|███████████████████████▉                                                      | 365/1189 [25:20<1:05:49,  4.79s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2338, 'learning_rate': 7.22e-06, 'epoch': 0.31}
+ 31%|████████████████████████                                                      | 366/1189 [25:25<1:05:35,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████                                                      | 366/1189 [25:25<1:05:35,  4.78s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1811, 'learning_rate': 7.24e-06, 'epoch': 0.31}
+ 31%|████████████████████████                                                      | 367/1189 [25:30<1:04:49,  4.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████                                                      | 367/1189 [25:30<1:04:49,  4.73s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0937, 'learning_rate': 7.260000000000001e-06, 'epoch': 0.31}
+ 31%|████████████████████████▏                                                     | 368/1189 [25:34<1:04:30,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▏                                                     | 368/1189 [25:34<1:04:30,  4.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.5385, 'learning_rate': 7.280000000000001e-06, 'epoch': 0.31}
+ 31%|████████████████████████▏                                                     | 369/1189 [25:39<1:04:07,  4.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▏                                                     | 369/1189 [25:39<1:04:07,  4.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3573, 'learning_rate': 7.3e-06, 'epoch': 0.31}
+ 31%|████████████████████████▏                                                     | 369/1189 [25:39<1:04:07,  4.69s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 370/1189 [25:44<1:03:43,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 370/1189 [25:44<1:03:43,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 370/1189 [25:44<1:03:43,  4.67s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:57:59,859 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 371/1189 [25:48<1:03:13,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 371/1189 [25:48<1:03:13,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 371/1189 [25:48<1:03:13,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0908, 'learning_rate': 7.360000000000001e-06, 'epoch': 0.31}
+ 31%|████████████████████████▎                                                     | 371/1189 [25:48<1:03:13,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▎                                                     | 371/1189 [25:48<1:03:13,  4.64s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▍                                                     | 373/1189 [25:57<1:02:09,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▍                                                     | 373/1189 [25:57<1:02:09,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▍                                                     | 373/1189 [25:57<1:02:09,  4.57s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▌                                                     | 374/1189 [26:02<1:01:30,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▌                                                     | 374/1189 [26:02<1:01:30,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 31%|████████████████████████▌                                                     | 374/1189 [26:02<1:01:30,  4.53s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|████████████████████████▌                                                     | 375/1189 [26:06<1:01:00,  4.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|████████████████████████▌                                                     | 375/1189 [26:06<1:01:00,  4.50s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|████████████████████████▋                                                     | 376/1189 [26:11<1:00:30,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|████████████████████████▋                                                     | 376/1189 [26:11<1:00:30,  4.47s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2195, 'learning_rate': 7.440000000000001e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▎                                                      | 377/1189 [26:15<59:59,  4.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▎                                                      | 377/1189 [26:15<59:59,  4.43s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1713, 'learning_rate': 7.4600000000000006e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▍                                                      | 378/1189 [26:19<59:30,  4.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▍                                                      | 378/1189 [26:19<59:30,  4.40s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1561, 'learning_rate': 7.48e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▌                                                      | 379/1189 [26:24<59:01,  4.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▌                                                      | 379/1189 [26:24<59:01,  4.37s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1195, 'learning_rate': 7.500000000000001e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▌                                                      | 380/1189 [26:28<58:32,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▌                                                      | 380/1189 [26:28<58:32,  4.34s/it][WARNING|modeling_utils.py:388] 2022-02-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:00:22,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:00:22,692 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0967, 'learning_rate': 7.540000000000001e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▋                                                      | 382/1189 [26:36<57:39,  4.29s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▋                                                      | 382/1189 [26:36<57:39,  4.29s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2816, 'learning_rate': 7.5600000000000005e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▊                                                      | 383/1189 [26:40<56:12,  4.18s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▊                                                      | 383/1189 [26:40<56:12,  4.18s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0563, 'learning_rate': 7.58e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▊                                                      | 384/1189 [26:44<54:56,  4.10s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▊                                                      | 384/1189 [26:44<54:56,  4.10s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3362, 'learning_rate': 7.600000000000001e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▉                                                      | 385/1189 [26:48<53:51,  4.02s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▉                                                      | 385/1189 [26:48<53:51,  4.02s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3688, 'learning_rate': 7.620000000000001e-06, 'epoch': 0.32}
+ 32%|█████████████████████████▉                                                      | 386/1189 [26:52<53:06,  3.97s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 32%|█████████████████████████▉                                                      | 386/1189 [26:52<53:06,  3.97s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1055, 'learning_rate': 7.640000000000001e-06, 'epoch': 0.32}
+ 33%|██████████████████████████                                                      | 387/1189 [26:56<52:08,  3.90s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████                                                      | 387/1189 [26:56<52:08,  3.90s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3171, 'learning_rate': 7.660000000000001e-06, 'epoch': 0.33}
+ 33%|██████████████████████████                                                      | 388/1189 [26:59<51:05,  3.83s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████                                                      | 388/1189 [26:59<51:05,  3.83s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:00:53,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:00:53,336 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4034, 'learning_rate': 7.7e-06, 'epoch': 0.33}
+ 33%|██████████████████████████▏                                                     | 390/1189 [27:06<48:24,  3.63s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████▏                                                     | 390/1189 [27:06<48:24,  3.63s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3698, 'learning_rate': 7.72e-06, 'epoch': 0.33}
+ 33%|██████████████████████████▏                                                     | 390/1189 [27:06<48:24,  3.63s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████▎                                                     | 391/1189 [27:09<46:20,  3.48s/it]g-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:02,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:02,818 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:05,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:05,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 16:59:41,150 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1896, 'learning_rate': 7.78e-06, 'epoch': 0.33}
+ 33%|██████████████████████████▌                                                     | 394/1189 [27:18<39:45,  3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████▌                                                     | 394/1189 [27:18<39:45,  3.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████▌                                                     | 395/1189 [27:20<37:18,  2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 33%|██████████████████████████▌                                                     | 395/1189 [27:20<37:18,  2.82s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:12,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:12,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:14,501 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:14,501 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3534, 'learning_rate': 7.860000000000001e-06, 'epoch': 0.33}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:17,513 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:17,513 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:19,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:19,265 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.6628, 'learning_rate': 7.92e-06, 'epoch': 0.34}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:24,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:01:24,676 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0486, 'learning_rate': 7.94e-06, 'epoch': 0.34}
+ 34%|███████████████████████████                                                     | 402/1189 [27:39<47:09,  3.60s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|███████████████████████████                                                     | 402/1189 [27:39<47:09,  3.60s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1799, 'learning_rate': 7.960000000000002e-06, 'epoch': 0.34}
+ 34%|███████████████████████████                                                     | 403/1189 [27:44<53:08,  4.06s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|███████████████████████████                                                     | 403/1189 [27:44<53:08,  4.06s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1841, 'learning_rate': 7.980000000000002e-06, 'epoch': 0.34}
+ 34%|███████████████████████████▏                                                    | 404/1189 [27:49<56:51,  4.35s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|███████████████████████████▏                                                    | 404/1189 [27:49<56:51,  4.35s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2834, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.34}
+ 34%|███████████████████████████▏                                                    | 405/1189 [27:54<59:41,  4.57s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|███████████████████████████▏                                                    | 405/1189 [27:54<59:41,  4.57s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1536, 'learning_rate': 8.020000000000001e-06, 'epoch': 0.34}
+ 34%|███████████████████████████▏                                                    | 405/1189 [27:54<59:41,  4.57s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▋                                                   | 406/1189 [28:00<1:01:33,  4.72s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▋                                                   | 406/1189 [28:00<1:01:33,  4.72s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▋                                                   | 406/1189 [28:00<1:01:33,  4.72s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▋                                                   | 407/1189 [28:05<1:02:43,  4.81s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▋                                                   | 407/1189 [28:05<1:02:43,  4.81s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▋                                                   | 407/1189 [28:05<1:02:43,  4.81s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▊                                                   | 408/1189 [28:10<1:03:21,  4.87s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▊                                                   | 408/1189 [28:10<1:03:21,  4.87s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▊                                                   | 408/1189 [28:10<1:03:21,  4.87s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▊                                                   | 408/1189 [28:10<1:03:21,  4.87s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1054, 'learning_rate': 8.1e-06, 'epoch': 0.34}
+ 34%|██████████████████████████▊                                                   | 408/1189 [28:10<1:03:21,  4.87s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▊                                                   | 408/1189 [28:10<1:03:21,  4.87s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▉                                                   | 410/1189 [28:19<1:03:22,  4.88s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▉                                                   | 410/1189 [28:19<1:03:22,  4.88s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 34%|██████████████████████████▉                                                   | 410/1189 [28:19<1:03:22,  4.88s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|██████████████████████████▉                                                   | 411/1189 [28:24<1:03:20,  4.88s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:02:19,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:02:19,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2693, 'learning_rate': 8.16e-06, 'epoch': 0.35}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:02:19,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:02:19,583 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████                                                   | 413/1189 [28:34<1:02:12,  4.81s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████                                                   | 413/1189 [28:34<1:02:12,  4.81s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████                                                   | 413/1189 [28:34<1:02:12,  4.81s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▏                                                  | 414/1189 [28:38<1:01:33,  4.77s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▏                                                  | 414/1189 [28:38<1:01:33,  4.77s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▏                                                  | 415/1189 [28:43<1:01:30,  4.77s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▏                                                  | 415/1189 [28:43<1:01:30,  4.77s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1823, 'learning_rate': 8.220000000000001e-06, 'epoch': 0.35}
+ 35%|███████████████████████████▏                                                  | 415/1189 [28:43<1:01:30,  4.77s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▎                                                  | 416/1189 [28:48<1:01:12,  4.75s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▎                                                  | 416/1189 [28:48<1:01:12,  4.75s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▎                                                  | 416/1189 [28:48<1:01:12,  4.75s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▎                                                  | 417/1189 [28:53<1:00:47,  4.73s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▎                                                  | 417/1189 [28:53<1:00:47,  4.73s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▍                                                  | 418/1189 [28:57<1:00:15,  4.69s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|███████████████████████████▍                                                  | 418/1189 [28:57<1:00:15,  4.69s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0802, 'learning_rate': 8.28e-06, 'epoch': 0.35}
+ 35%|████████████████████████████▏                                                   | 419/1189 [29:02<59:53,  4.67s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▏                                                   | 419/1189 [29:02<59:53,  4.67s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2756, 'learning_rate': 8.3e-06, 'epoch': 0.35}
+ 35%|████████████████████████████▏                                                   | 419/1189 [29:02<59:53,  4.67s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▎                                                   | 420/1189 [29:06<59:13,  4.62s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▎                                                   | 420/1189 [29:06<59:13,  4.62s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▎                                                   | 420/1189 [29:06<59:13,  4.62s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▎                                                   | 421/1189 [29:11<58:52,  4.60s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▎                                                   | 421/1189 [29:11<58:52,  4.60s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▍                                                   | 422/1189 [29:15<58:48,  4.60s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 35%|████████████████████████████▍                                                   | 422/1189 [29:15<58:48,  4.60s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2496, 'learning_rate': 8.36e-06, 'epoch': 0.35}
+ 36%|████████████████████████████▍                                                   | 423/1189 [29:20<58:20,  4.57s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▍                                                   | 423/1189 [29:20<58:20,  4.57s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2382, 'learning_rate': 8.380000000000001e-06, 'epoch': 0.36}
+ 36%|████████████████████████████▍                                                   | 423/1189 [29:20<58:20,  4.57s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 424/1189 [29:24<57:36,  4.52s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 424/1189 [29:24<57:36,  4.52s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 424/1189 [29:24<57:36,  4.52s/it]g-point operations will not be computed-28 17:01:09,504 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 425/1189 [29:29<56:49,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:21,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 425/1189 [29:29<56:49,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:21,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 425/1189 [29:29<56:49,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:21,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2347, 'learning_rate': 8.44e-06, 'epoch': 0.36}
+ 36%|████████████████████████████▌                                                   | 425/1189 [29:29<56:49,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:21,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▌                                                   | 425/1189 [29:29<56:49,  4.46s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:21,475 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▋                                                   | 427/1189 [29:37<56:01,  4.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:30,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▋                                                   | 427/1189 [29:37<56:01,  4.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:30,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▋                                                   | 427/1189 [29:37<56:01,  4.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:30,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1574, 'learning_rate': 8.48e-06, 'epoch': 0.36}
+ 36%|████████████████████████████▋                                                   | 427/1189 [29:37<56:01,  4.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:30,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▋                                                   | 427/1189 [29:37<56:01,  4.41s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:30,202 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▊                                                   | 429/1189 [29:46<55:09,  4.35s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        [WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3594, 'learning_rate': 8.52e-06, 'epoch': 0.36}
+ 36%|████████████████████████████▉                                                   | 431/1189 [29:55<54:23,  4.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|████████████████████████████▉                                                   | 431/1189 [29:55<54:23,  4.31s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2223, 'learning_rate': 8.540000000000001e-06, 'epoch': 0.36}
+ 36%|█████████████████████████████                                                   | 432/1189 [29:59<53:44,  4.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|█████████████████████████████                                                   | 432/1189 [29:59<53:44,  4.26s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1815, 'learning_rate': 8.560000000000001e-06, 'epoch': 0.36}
+ 36%|█████████████████████████████▏                                                  | 433/1189 [30:03<53:06,  4.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 36%|█████████████████████████████▏                                                  | 433/1189 [30:03<53:06,  4.21s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2792, 'learning_rate': 8.580000000000001e-06, 'epoch': 0.36}
+ 37%|█████████████████████████████▏                                                  | 434/1189 [30:07<52:10,  4.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▏                                                  | 434/1189 [30:07<52:10,  4.15s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1954, 'learning_rate': 8.6e-06, 'epoch': 0.36}
+ 37%|█████████████████████████████▎                                                  | 435/1189 [30:11<51:14,  4.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▎                                                  | 435/1189 [30:11<51:14,  4.08s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4341, 'learning_rate': 8.62e-06, 'epoch': 0.37}
+ 37%|█████████████████████████████▎                                                  | 436/1189 [30:14<50:13,  4.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▎                                                  | 436/1189 [30:14<50:13,  4.00s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1195, 'learning_rate': 8.64e-06, 'epoch': 0.37}
+ 37%|█████████████████████████████▍                                                  | 437/1189 [30:18<49:16,  3.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▍                                                  | 437/1189 [30:18<49:16,  3.93s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1726, 'learning_rate': 8.66e-06, 'epoch': 0.37}
+ 37%|█████████████████████████████▍                                                  | 438/1189 [30:22<48:02,  3.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▍                                                  | 438/1189 [30:22<48:02,  3.84s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:15,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:15,977 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3113, 'learning_rate': 8.700000000000001e-06, 'epoch': 0.37}
+ 37%|█████████████████████████████▌                                                  | 440/1189 [30:29<45:10,  3.62s/it]g-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|██████████████████████��██████▌                                                  | 440/1189 [30:29<45:10,  3.62s/it]g-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2111, 'learning_rate': 8.720000000000001e-06, 'epoch': 0.37}
+ 37%|█████████████████████████████▋                                                  | 441/1189 [30:32<43:26,  3.48s/it]g-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▋                                                  | 441/1189 [30:32<43:26,  3.48s/it]g-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:25,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:25,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2824, 'learning_rate': 8.76e-06, 'epoch': 0.37}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:25,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:03:38,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▊                                                  | 443/1189 [30:38<39:49,  3.20s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▊                                                  | 444/1189 [30:40<37:37,  3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 37%|█████████████████████████████▊                                                  | 444/1189 [30:40<37:37,  3.03s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:33,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:33,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:35,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:35,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:37,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:37,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:39,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:39,088 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.4837, 'learning_rate': 8.880000000000001e-06, 'epoch': 0.38}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:42,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:42,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.591, 'learning_rate': 8.920000000000001e-06, 'epoch': 0.38}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:04:42,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▎                                                 | 451/1189 [30:57<36:32,  2.97s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▎                                                 | 451/1189 [30:57<36:32,  2.97s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1316, 'learning_rate': 8.94e-06, 'epoch': 0.38}
+ 38%|██████████████████████████████▎                                                 | 451/1189 [30:57<36:32,  2.97s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▍                                                 | 452/1189 [31:02<44:18,  3.61s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▍                                                 | 452/1189 [31:02<44:18,  3.61s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▍                                                 | 453/1189 [31:07<49:27,  4.03s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▍                                                 | 453/1189 [31:07<49:27,  4.03s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1289, 'learning_rate': 8.98e-06, 'epoch': 0.38}
+ 38%|██████████████████████████████▍                                                 | 453/1189 [31:07<49:27,  4.03s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▌                                                 | 454/1189 [31:12<52:52,  4.32s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▌                                                 | 454/1189 [31:12<52:52,  4.32s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▌                                                 | 455/1189 [31:17<54:57,  4.49s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▌                                                 | 455/1189 [31:17<54:57,  4.49s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0944, 'learning_rate': 9.020000000000002e-06, 'epoch': 0.38}
+ 38%|██████████████████████████████▋                                                 | 456/1189 [31:22<56:34,  4.63s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▋                                                 | 456/1189 [31:22<56:34,  4.63s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1625, 'learning_rate': 9.040000000000002e-06, 'epoch': 0.38}
+ 38%|██████████████████████████████▋                                                 | 456/1189 [31:22<56:34,  4.63s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▋                                                 | 457/1189 [31:27<57:17,  4.70s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▋                                                 | 457/1189 [31:27<57:17,  4.70s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 38%|██████████████████████████████▋                                                 | 457/1189 [31:27<57:17,  4.70s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▊                                                 | 458/1189 [31:32<57:46,  4.74s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▊                                                 | 458/1189 [31:32<57:46,  4.74s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▉                                                 | 459/1189 [31:37<58:09,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▉                                                 | 459/1189 [31:37<58:09,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0962, 'learning_rate': 9.100000000000001e-06, 'epoch': 0.39}
+ 39%|██████████████████████████████▉                                                 | 459/1189 [31:37<58:09,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▉                                                 | 460/1189 [31:41<58:08,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▉                                                 | 460/1189 [31:41<58:08,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████████▉                                                 | 460/1189 [31:41<58:08,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████                                                 | 461/1189 [31:46<57:57,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████                                                 | 461/1189 [31:46<57:57,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████                                                 | 462/1189 [31:51<57:54,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████                                                 | 462/1189 [31:51<57:54,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2095, 'learning_rate': 9.16e-06, 'epoch': 0.39}
+ 39%|███████████████████████████████                                                 | 462/1189 [31:51<57:54,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▏                                                | 463/1189 [31:56<58:05,  4.80s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▏                                                | 463/1189 [31:56<58:05,  4.80s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▏                                                | 463/1189 [31:56<58:05,  4.80s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▏                                                | 464/1189 [32:01<57:50,  4.79s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▏                                                | 464/1189 [32:01<57:50,  4.79s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▎                                                | 465/1189 [32:05<57:39,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▎                                                | 465/1189 [32:05<57:39,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2855, 'learning_rate': 9.220000000000002e-06, 'epoch': 0.39}
+ 39%|███████████████████████████████▎                                                | 465/1189 [32:05<57:39,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▎                                                | 465/1189 [32:05<57:39,  4.78s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|██████████████████████████��████▎                                                | 466/1189 [32:10<57:20,  4.76s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▎                                                | 466/1189 [32:10<57:20,  4.76s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▎                                                | 466/1189 [32:10<57:20,  4.76s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▍                                                | 467/1189 [32:15<56:49,  4.72s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▍                                                | 467/1189 [32:15<56:49,  4.72s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▍                                                | 468/1189 [32:19<56:22,  4.69s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▍                                                | 468/1189 [32:19<56:22,  4.69s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2952, 'learning_rate': 9.280000000000001e-06, 'epoch': 0.39}
+ 39%|███████████████████████████████▌                                                | 469/1189 [32:24<56:09,  4.68s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 39%|███████████████████████████████▌                                                | 469/1189 [32:24<56:09,  4.68s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0674, 'learning_rate': 9.3e-06, 'epoch': 0.39}
+ 39%|███████████████████████████████▌                                                | 469/1189 [32:24<56:09,  4.68s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▌                                                | 470/1189 [32:29<55:43,  4.65s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▌                                                | 470/1189 [32:29<55:43,  4.65s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▌                                                | 470/1189 [32:29<55:43,  4.65s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▋                                                | 471/1189 [32:33<55:21,  4.63s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▋                                                | 471/1189 [32:33<55:21,  4.63s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▋                                                | 471/1189 [32:33<55:21,  4.63s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▊                                                | 472/1189 [32:38<54:52,  4.59s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▊                                                | 472/1189 [32:38<54:52,  4.59s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▊                                                | 472/1189 [32:38<54:52,  4.59s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▊                                                | 473/1189 [32:42<54:29,  4.57s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▊                                                | 473/1189 [32:42<54:29,  4.57s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▉                                                | 474/1189 [32:47<54:08,  4.54s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▉                                                | 474/1189 [32:47<54:08,  4.54s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3065, 'learning_rate': 9.4e-06, 'epoch': 0.4}
+ 40%|███████████████████████████████▉                                                | 475/1189 [32:51<53:29,  4.49s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|███████████████████████████████▉                                                | 475/1189 [32:51<53:29,  4.49s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3725, 'learning_rate': 9.42e-06, 'epoch': 0.4}
+ 40%|████████████████████████████████                                                | 476/1189 [32:55<52:59,  4.46s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|████████████████████████████████                                                | 476/1189 [32:55<52:59,  4.46s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2321, 'learning_rate': 9.440000000000001e-06, 'epoch': 0.4}
+ 40%|████████████████████████████████                                                | 477/1189 [33:00<52:23,  4.42s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|████████████████████████████████                                                | 477/1189 [33:00<52:23,  4.42s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0517, 'learning_rate': 9.460000000000001e-06, 'epoch': 0.4}
+ 40%|████████████████████████████████▏                                               | 478/1189 [33:04<51:45,  4.37s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|████████████████████████████████▏                                               | 478/1189 [33:04<51:45,  4.37s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1719, 'learning_rate': 9.48e-06, 'epoch': 0.4}
+ 40%|████████████████████████████████▏                                               | 479/1189 [33:08<51:21,  4.34s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|████████████████████████████████▏                                               | 479/1189 [33:08<51:21,  4.34s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2294, 'learning_rate': 9.5e-06, 'epoch': 0.4}
+                                                                                                                        g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+                                                                                                                        g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2519, 'learning_rate': 9.52e-06, 'epoch': 0.4}
+ 40%|████████████████████████████████▎                                               | 481/1189 [33:17<50:14,  4.26s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 40%|████████████████████████████████▎                                               | 481/1189 [33:17<50:14,  4.26s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1538, 'learning_rate': 9.54e-06, 'epoch': 0.4}
+ 41%|████████████████████████████████▍                                               | 482/1189 [33:21<49:57,  4.24s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▍                                               | 482/1189 [33:21<49:57,  4.24s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2508, 'learning_rate': 9.56e-06, 'epoch': 0.41}
+ 41%|████████████████████████████████▍                                               | 483/1189 [33:25<49:27,  4.20s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▍                                               | 483/1189 [33:25<49:27,  4.20s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2302, 'learning_rate': 9.58e-06, 'epoch': 0.41}
+ 41%|████████████████████████████████▌                                               | 484/1189 [33:29<48:29,  4.13s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▌                                               | 484/1189 [33:29<48:29,  4.13s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2961, 'learning_rate': 9.600000000000001e-06, 'epoch': 0.41}
+ 41%|████████████████████████████████▋                                               | 485/1189 [33:33<47:41,  4.06s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▋                                               | 485/1189 [33:33<47:41,  4.06s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2522, 'learning_rate': 9.620000000000001e-06, 'epoch': 0.41}
+ 41%|████████████████████████████████▋                                               | 485/1189 [33:33<47:41,  4.06s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▋                                               | 486/1189 [33:37<47:07,  4.02s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▋                                               | 486/1189 [33:37<47:07,  4.02s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▊                                               | 487/1189 [33:41<46:20,  3.96s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▊                                               | 487/1189 [33:41<46:20,  3.96s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.0833, 'learning_rate': 9.66e-06, 'epoch': 0.41}
+ 41%|████████████████████████████████▊                                               | 487/1189 [33:41<46:20,  3.96s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▊                                               | 488/1189 [33:44<45:27,  3.89s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▊                                               | 488/1189 [33:44<45:27,  3.89s/it]g-point operations will not be computed-28 17:04:29,793 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▉                                               | 489/1189 [33:48<44:20,  3.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▉                                               | 489/1189 [33:48<44:20,  3.80s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▉                                               | 490/1189 [33:51<43:12,  3.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|████████████████████████████████▉                                               | 490/1189 [33:51<43:12,  3.71s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.2419, 'learning_rate': 9.72e-06, 'epoch': 0.41}
+ 41%|█████████████████████████████████                                               | 491/1189 [33:55<41:51,  3.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|█████████████████████████████████                                               | 491/1189 [33:55<41:51,  3.60s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:48,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:48,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.1422, 'learning_rate': 9.760000000000001e-06, 'epoch': 0.41}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:48,484 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|█████████████████████████████████▏                                              | 493/1189 [34:01<38:31,  3.32s/it]g-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 41%|█████████████████████████████████▏                                              | 493/1189 [34:01<38:31,  3.32s/it]g-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:54,193 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:56,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:56,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.5124, 'learning_rate': 9.820000000000001e-06, 'epoch': 0.42}
+[WARNING|modeling_utils.py:388] 2022-02-28 17:07:56,639 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-28 17:07:40,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 42%|█████████████████████████████████▎                                              | 496/1189 [34:08<31:26,  2.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:59,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 42%|█████████████████████████████████▎                                              | 496/1189 [34:08<31:26,  2.72s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:07:59,912 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+{'loss': 4.3506, 'learning_rate': 9.86e-06, 'epoch': 0.42}
+ 42%|█████████████████████████████████▌                                              | 498/1189 [34:12<25:44,  2.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:08:01,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 42%|█████████████████████████████████▌                                              | 498/1189 [34:12<25:44,  2.23s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:08:01,699 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 42%|█████████████████████████████████▌                                              | 499/1189 [34:13<23:07,  2.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:08:04,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+ 42%|█████████████████████████████████▌                                              | 499/1189 [34:13<23:07,  2.01s/it][WARNING|modeling_utils.py:388] 2022-02-28 17:08:04,636 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
+[INFO|trainer.py:2366] 2022-02-28 17:08:05,927 >>   Num examples = 2642              | 500/1189 [34:15<22:25,  1.95s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|trainer.py:2366] 2022-02-28 17:08:05,927 >>   Num examples = 2642              | 500/1189 [34:15<22:25,  1.95s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|trainer.py:2366] 2022-02-28 17:08:05,927 >>   Num examples = 2642              | 500/1189 [34:15<22:25,  1.95s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|trainer.py:2366] 2022-02-28 17:08:05,927 >>   Num examples = 2642              | 500/1189 [34:15<22:25,  1.95s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  2%|█▌                                                                                 | 4/221 [00:08<08:58,  2.48s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  2%|█▉                                                                                 | 5/221 [00:11<09:42,  2.70s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  3%|██▎                                                                                | 6/221 [00:15<10:21,  2.89s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  3%|██▋                                                                                | 7/221 [00:19<11:19,  3.17s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  4%|███                                                                                | 8/221 [00:22<11:06,  3.13s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  4%|███▍                                                                               | 9/221 [00:25<10:58,  3.11s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  5%|███▋                                                                              | 10/221 [00:29<11:50,  3.37s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  5%|████                                                                              | 11/221 [00:33<12:46,  3.65s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  5%|████▍                                                                             | 12/221 [00:36<12:00,  3.45s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  6%|████▊                                                                             | 13/221 [00:39<11:36,  3.35s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  6%|█████▏                                                                            | 14/221 [00:42<11:38,  3.37s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  7%|█████▌                                                                            | 15/221 [00:47<13:03,  3.81s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  7%|█████▉                                                                            | 16/221 [00:52<13:57,  4.09s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  8%|██████▎                                                                           | 17/221 [00:55<13:18,  3.92s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  8%|██████▋                                                                           | 18/221 [00:59<13:07,  3.88s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  9%|███████                                                                           | 19/221 [01:03<12:25,  3.69s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+  9%|███████▍                                                                          | 20/221 [01:06<11:50,  3.53s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 10%|███████▊                                                                          | 21/221 [01:09<11:08,  3.34s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 10%|████████▏                                                                         | 22/221 [01:12<10:55,  3.29s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 10%|████████▌                                                                         | 23/221 [01:15<10:43,  3.25s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 11%|████████▉                                                                         | 24/221 [01:19<11:15,  3.43s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 11%|█████████▎                                                                        | 25/221 [01:23<11:46,  3.61s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 12%|█████████▋                                                                        | 26/221 [01:27<11:59,  3.69s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 12%|██████████                                                                        | 27/221 [01:29<10:59,  3.40s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 13%|██████████▍                                                                       | 28/221 [01:33<11:30,  3.58s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 13%|██████████▊                                                                       | 29/221 [01:38<12:12,  3.81s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 14%|███████████▏                                                                      | 30/221 [01:41<11:21,  3.57s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 14%|███████████▌                                                                      | 31/221 [01:43<10:20,  3.27s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 14%|███████████▊                                                                      | 32/221 [01:47<10:16,  3.26s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 15%|████████████▏                                                                     | 33/221 [01:50<10:40,  3.41s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 15%|████████████▌                                                                     | 34/221 [01:54<10:46,  3.46s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 16%|████████████▉                                                                     | 35/221 [01:57<10:21,  3.34s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 16%|█████████████▎                                                                    | 36/221 [02:00<10:15,  3.33s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 17%|█████████████▋                                                                    | 37/221 [02:05<11:08,  3.63s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 17%|██████████████                                                                    | 38/221 [02:08<10:32,  3.45s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 18%|██████████████▍                                                                   | 39/221 [02:11<10:46,  3.55s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 18%|██████████████▊                                                                   | 40/221 [02:14<10:13,  3.39s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 19%|███████████████▏                                                                  | 41/221 [02:18<10:24,  3.47s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 19%|███████████████▌                                                                  | 42/221 [02:23<11:17,  3.79s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 19%|███████████████▉                                                                  | 43/221 [02:26<10:49,  3.65s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 20%|████████████████▎                                                                 | 44/221 [02:31<11:50,  4.01s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 20%|████████████████▋                                                                 | 45/221 [02:35<12:21,  4.21s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 21%|█████████████████                                                                 | 46/221 [02:40<12:12,  4.19s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 21%|█████████████████▍                                                                | 47/221 [02:44<11:56,  4.12s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 22%|█████████████████▊                                                                | 48/221 [02:48<11:51,  4.11s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 22%|██████████████████▏                                                               | 49/221 [02:51<11:23,  3.97s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 23%|██████████████████▌                                                               | 50/221 [02:55<11:21,  3.99s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 23%|██████████████████▉                                                               | 51/221 [02:59<10:39,  3.76s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 24%|███████████████████▎                                                              | 52/221 [03:02<09:57,  3.54s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 24%|███████████████████▋                                                              | 53/221 [03:05<09:24,  3.36s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 24%|████████████████████                                                              | 54/221 [03:08<09:40,  3.47s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 25%|████████████████████▍                                                             | 55/221 [03:12<09:42,  3.51s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 25%|████████████████████▊                                                             | 56/221 [03:16<10:28,  3.81s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 26%|█████████████████████▏                                                            | 57/221 [03:20<10:31,  3.85s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 26%|█████████████████████▌                                                            | 58/221 [03:24<10:06,  3.72s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 27%|█████████████████████▉                                                            | 59/221 [03:27<09:32,  3.54s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 27%|██████████████████████▎                                                           | 60/221 [03:29<08:47,  3.27s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 28%|██████████████████████▋                                                           | 61/221 [03:33<09:00,  3.38s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 28%|███████████████████████                                                           | 62/221 [03:36<08:50,  3.34s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 29%|███████████████████████▍                                                          | 63/221 [03:40<08:56,  3.40s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 29%|███████████████████████▋                                                          | 64/221 [03:43<08:50,  3.38s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 29%|████████████████████████                                                          | 65/221 [03:47<08:44,  3.36s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 30%|████████████████████████▍                                                         | 66/221 [03:50<08:49,  3.42s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 30%|████████████████████████▊                                                         | 67/221 [03:53<08:16,  3.22s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 31%|█████████████████████████▏                                                        | 68/221 [03:57<08:56,  3.50s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 31%|█████████████████████████▌                                                        | 69/221 [04:00<08:32,  3.37s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 32%|█████████████████████████▉                                                        | 70/221 [04:03<08:24,  3.34s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 32%|██████████████████████████▎                                                       | 71/221 [04:07<08:13,  3.29s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 33%|██████████████████████████▋                                                       | 72/221 [04:09<07:48,  3.15s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 33%|███████████████████████████                                                       | 73/221 [04:13<08:05,  3.28s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 33%|███████████████████████████▍                                                      | 74/221 [04:16<08:02,  3.29s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 34%|███████████████████████████▊                                                      | 75/221 [04:19<07:58,  3.27s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 34%|████████████████████████████▏                                                     | 76/221 [04:23<07:52,  3.26s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 35%|████████████████████████████▌                                                     | 77/221 [04:26<07:48,  3.25s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 35%|████████████████████████████▉                                                     | 78/221 [04:29<07:56,  3.33s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 36%|█████████████████████████████▎                                                    | 79/221 [04:32<07:38,  3.23s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 36%|█████████████████████████████▋                                                    | 80/221 [04:36<07:33,  3.22s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 37%|██████████████████████████████                                                    | 81/221 [04:39<07:56,  3.41s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 37%|██████████████████████████████▍                                                   | 82/221 [04:44<08:28,  3.66s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 38%|██████████████████████████████▊                                                   | 83/221 [04:48<08:53,  3.87s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 38%|███████████████████████████████▏                                                  | 84/221 [04:52<08:54,  3.91s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 38%|███████████████████████████████▌                                                  | 85/221 [04:57<09:14,  4.08s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 39%|████████��██████████████████████▉                                                  | 86/221 [05:00<09:01,  4.01s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 39%|████████████████████████████████▎                                                 | 87/221 [05:05<09:17,  4.16s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 40%|████████████████████████████████▋                                                 | 88/221 [05:08<08:42,  3.93s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 40%|█████████████████████████████████                                                 | 89/221 [05:12<08:11,  3.73s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 41%|█████████████████████████████████▍                                                | 90/221 [05:15<08:07,  3.73s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 41%|█████████████████████████████████▊                                                | 91/221 [05:19<08:20,  3.85s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 42%|██████████████████████████████████▏                                               | 92/221 [05:24<08:32,  3.97s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 42%|██████████████████████████████████▌                                               | 93/221 [05:28<08:37,  4.04s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 43%|██████████████████████████████████▉                                               | 94/221 [05:32<08:24,  3.98s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 43%|███████████████████████████████████▏                                              | 95/221 [05:36<08:23,  3.99s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 43%|███████████████████████████████████▌                                              | 96/221 [05:40<08:12,  3.94s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 44%|███████████████████████████████████▉                                              | 97/221 [05:44<08:22,  4.06s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 44%|████████████████████████████████████▎                                             | 98/221 [05:48<08:09,  3.98s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 45%|████████████████████████████████████▋                                             | 99/221 [05:51<07:25,  3.66s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 45%|████████████████████████████████████▋                                            | 100/221 [05:54<07:23,  3.67s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 46%|█████████████████████████████████████                                            | 101/221 [05:57<07:02,  3.52s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 46%|█████████████████████████████████████▍                                           | 102/221 [06:00<06:38,  3.35s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 47%|█████████████████████████████████████▊                                           | 103/221 [06:04<06:54,  3.51s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 47%|██████████████████████████████████████                                           | 104/221 [06:08<07:06,  3.65s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 48%|██████████████████████████████████████▍                                          | 105/221 [06:13<07:27,  3.86s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 48%|█████████████████████████████████���████▊                                          | 106/221 [06:17<07:26,  3.88s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 48%|███████████████████████████████████████▏                                         | 107/221 [06:20<06:54,  3.64s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 49%|███████████████████████████████████████▌                                         | 108/221 [06:24<07:13,  3.84s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 49%|███████████████████████████████████████▉                                         | 109/221 [06:28<07:16,  3.90s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 50%|████████████████████████████████████████▎                                        | 110/221 [06:31<06:56,  3.75s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 50%|████████████████████████████████████████▋                                        | 111/221 [06:35<06:39,  3.63s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 51%|█████████████████████████████████████████                                        | 112/221 [06:39<06:42,  3.69s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 51%|█████████████████████████████████████████▍                                       | 113/221 [06:42<06:37,  3.68s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 52%|█████████████████████████████████████████▊                                       | 114/221 [06:46<06:23,  3.59s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 52%|██████████████████████████████████████████▏                                      | 115/221 [06:49<06:16,  3.56s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 52%|██████████��███████████████████████████████▌                                      | 116/221 [06:52<06:02,  3.45s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 53%|██████████████████████████████████████████▉                                      | 117/221 [06:56<05:55,  3.42s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 53%|███████████████████████████████████████████▏                                     | 118/221 [06:59<06:05,  3.54s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 54%|███████████████████████████████████████████▌                                     | 119/221 [07:04<06:28,  3.81s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 54%|███████████████████████████████████████████▉                                     | 120/221 [07:08<06:42,  3.98s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 55%|████████████████████████████████████████████▎                                    | 121/221 [07:12<06:27,  3.88s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 55%|████████████████████████████████████████████▋                                    | 122/221 [07:14<05:43,  3.47s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 56%|█████████████████████████████████████████████                                    | 123/221 [07:17<05:09,  3.16s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 56%|█████████████████████████████████████████████▍                                   | 124/221 [07:20<05:04,  3.14s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 57%|█████████████████████████████████████████████▊                                   | 125/221 [07:24<05:22,  3.36s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 57%|██████████████████████████████████████████████▏                                  | 126/221 [07:27<05:03,  3.19s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 57%|██████████████████████████████████████████████▌                                  | 127/221 [07:29<04:50,  3.09s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 58%|██████████████████████████████████████████████▉                                  | 128/221 [07:32<04:33,  2.94s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 58%|███████████████████████████████████████████████▎                                 | 129/221 [07:36<04:51,  3.17s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 59%|███████████████████████████████████████████████▋                                 | 130/221 [07:38<04:35,  3.03s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 59%|████████████████████████████████████████████████                                 | 131/221 [07:42<04:48,  3.21s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 60%|████████████████████████████████████████████████▍                                | 132/221 [07:45<04:29,  3.03s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 60%|████████████████████████████████████████████████▋                                | 133/221 [07:48<04:30,  3.07s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 61%|█████████████████████████████████████████████████                                | 134/221 [07:51<04:18,  2.97s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 61%|█████████████████████████████████████████████████▍                               | 135/221 [07:54<04:22,  3.05s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 62%|█████████████████████████████████████████████████▊                               | 136/221 [07:58<04:38,  3.27s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 62%|██████████████████████████████████████████████████▏                              | 137/221 [08:01<04:42,  3.36s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 62%|██████████████████████████████████████████████████▌                              | 138/221 [08:05<04:50,  3.50s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 63%|██████████████████████████████████████████████████▉                              | 139/221 [08:09<04:52,  3.56s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 63%|███████████████████████████████████████████████████▎                             | 140/221 [08:11<04:21,  3.22s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 64%|███████████████████████████████████████████████████▋                             | 141/221 [08:14<04:18,  3.23s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 64%|████████████████████████████████████████████████████                             | 142/221 [08:17<04:10,  3.17s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 65%|████████████████████████████████████████████████████▍                            | 143/221 [08:20<03:51,  2.97s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 65%|████████████████████████████████████████████████████▊                            | 144/221 [08:24<04:08,  3.23s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 66%|█████████████████████████████████████████████████████▏                           | 145/221 [08:27<04:02,  3.19s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 66%|█████████████████████████████████████████████████████▌                           | 146/221 [08:31<04:12,  3.37s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 67%|█████████████████████████████████████████████████████▉                           | 147/221 [08:33<03:57,  3.21s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 67%|██████████████████████████████████████████████████████▏                          | 148/221 [08:37<03:57,  3.25s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 67%|██████████████████████████████████████████████████████▌                          | 149/221 [08:40<03:56,  3.28s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 68%|██████████████████████████████████████████████████████▉                          | 150/221 [08:43<03:51,  3.26s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 68%|███████████████████████████████████████████████████████▎                         | 151/221 [08:47<03:58,  3.41s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 69%|███████████████████████████████████████████████████████▋                         | 152/221 [08:50<03:50,  3.34s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 69%|████████████████████████████████████████████████████████                         | 153/221 [08:53<03:43,  3.28s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 70%|████████████████████████████████████████████████████████▍                        | 154/221 [08:57<03:43,  3.33s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 70%|████████████████████████████████████████████████████████▊                        | 155/221 [09:00<03:40,  3.35s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 71%|█████████████████████████████████████████████████████████▏                       | 156/221 [09:04<03:40,  3.39s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 71%|█████████████████████████████████████████████████████████▌                       | 157/221 [09:07<03:24,  3.19s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 71%|█████████████████████████████████████████████████████████▉                       | 158/221 [09:11<03:48,  3.63s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 72%|██████████████████████████████████████████████████████████▎                      | 159/221 [09:15<03:47,  3.67s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 72%|██████████████████████████████████████████████████████████▋                      | 160/221 [09:19<03:52,  3.81s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 73%|███████████████████████████████████████████████████████████                      | 161/221 [09:23<03:55,  3.93s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 73%|███████████████████████████████████████████████████████████▍                     | 162/221 [09:27<03:52,  3.94s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 74%|███████████████████████████████████████████████████████████▋                     | 163/221 [09:32<03:55,  4.07s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 74%|████████████████████████████████████████████████████████████                     | 164/221 [09:36<03:59,  4.21s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 75%|████████████████████████████████████████████████████████████▍                    | 165/221 [09:40<03:45,  4.03s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 75%|████████████████████████████████████████████████████████████▊                    | 166/221 [09:43<03:26,  3.75s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 76%|█████████████████████████████████████████████████████████████▏                   | 167/221 [09:46<03:19,  3.69s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 76%|█████████████████████████████████████████████████████████████▌                   | 168/221 [09:50<03:06,  3.51s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 76%|█████████████████████████████████████████████████████████████▉                   | 169/221 [09:53<03:07,  3.61s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 77%|██████████████████████████████████████████████████████████████▎                  | 170/221 [09:57<03:07,  3.68s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 77%|██████████████████████████████████████████████████████████████▋                  | 171/221 [10:01<03:04,  3.68s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 78%|███████████████████████████████████████████████████████████████                  | 172/221 [10:04<02:54,  3.57s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 78%|███████████████████████████████████████████████████████████████▍                 | 173/221 [10:08<02:50,  3.55s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 79%|███████████████████████████████████████████████████████████████▊                 | 174/221 [10:11<02:40,  3.41s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 79%|████████████████████████████████████████████████████████████████▏                | 175/221 [10:14<02:32,  3.32s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 79%|████████████████████████████████████████████████████████████████▏                | 175/221 [10:14<02:32,  3.32s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 79%|████████████████████████████████████████████████████████████████▏                | 175/221 [10:14<02:32,  3.32s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 80%|████████████████████████████████████████████████████████████████▊                | 177/221 [10:21<02:25,  3.31s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 81%|█████████████████████████████████████████████████████████████████▏               | 178/221 [10:24<02:28,  3.45s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 81%|█████████████████████████████████████████████████████████████████▌               | 179/221 [10:28<02:22,  3.40s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 81%|█████████████████████████████████████████████████████████████████▉               | 180/221 [10:32<02:29,  3.64s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 82%|██████████████████████████████████████████████████████████████████▎              | 181/221 [10:36<02:29,  3.73s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 82%|██████████████████████████████████████████████████████████████████▋              | 182/221 [10:39<02:24,  3.70s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 83%|███████████████████████████████████████████████████████████████████              | 183/221 [10:44<02:27,  3.87s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 83%|███████████████████████████████████████████████████████████████████▍             | 184/221 [10:48<02:22,  3.84s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 84%|███████████████████████████████████████████████████████████████████▊             | 185/221 [10:51<02:12,  3.69s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 84%|████████████████████████████████████████████████████████████████████▏            | 186/221 [10:55<02:16,  3.91s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 85%|████████████████████████████████████████████████████████████████████▌            | 187/221 [10:59<02:08,  3.78s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 85%|████████████████████████████████████████████████████████████████████▉            | 188/221 [11:03<02:08,  3.90s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 86%|█████████████████████████████████████████████████████████████████████▎           | 189/221 [11:07<02:06,  3.96s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 86%|█████████████████████████████████████████████████████████████████████▋           | 190/221 [11:11<02:06,  4.10s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 86%|██████████████████████████████████████████████████████████████████████           | 191/221 [11:16<02:07,  4.24s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 87%|██████████████████████████████████████████████████████████████████████▎          | 192/221 [11:20<02:02,  4.23s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 87%|██████████████████████████████████████████████████████████████████████▋          | 193/221 [11:24<01:50,  3.95s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 88%|███████████████████████████████████████████████████████████████████████          | 194/221 [11:27<01:40,  3.71s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 88%|███████████████████████████████████████████████████████████████████████▍         | 195/221 [11:30<01:31,  3.50s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 89%|███████████████████████████████████████████████████████████████████████▊         | 196/221 [11:33<01:25,  3.43s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 89%|████████████████████████████████████████████████████████████████████████▏        | 197/221 [11:36<01:18,  3.26s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 90%|████████████████████████████████████████████████████████████████████████▌        | 198/221 [11:40<01:21,  3.54s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 90%|████████████████████████████████████████████████████████████████████████▉        | 199/221 [11:44<01:23,  3.81s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 90%|█████████████████████████████████████████████████████████████████████████▎       | 200/221 [11:48<01:18,  3.72s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 91%|█████████████████████████████████████████████████████████████████████████▋       | 201/221 [11:51<01:11,  3.60s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 91%|██████████████████████████████████████████████████████████████████████████       | 202/221 [11:54<01:04,  3.38s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 92%|██████████████████████████████████████████████████████████████████████████▍      | 203/221 [11:58<01:02,  3.45s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 92%|██████████████████████████████████████████████████████████████████████████▊      | 204/221 [12:02<01:03,  3.75s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 93%|███████████████████████████████████████████████████████████████████████████▏     | 205/221 [12:07<01:05,  4.07s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 93%|███████████████████████████████████████████████████████████████████████████▌     | 206/221 [12:12<01:03,  4.24s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 94%|███████████████████████████████████████████████████████████████████████████▊     | 207/221 [12:15<00:56,  4.02s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 94%|████████████████████████████████████████████████████████████████████████████▏    | 208/221 [12:19<00:51,  3.96s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 95%|████████████████████████████████████████████████████████████████████████████▌    | 209/221 [12:22<00:45,  3.75s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 95%|████████████████████████████████████████████████████████████████████████████▉    | 210/221 [12:26<00:42,  3.84s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 95%|█████████████████████████████████████████████████████████████████████████████▎   | 211/221 [12:31<00:40,  4.01s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 96%|█████████████████████████████████████████████████████████████████████████████▋   | 212/221 [12:34<00:35,  3.93s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 96%|██████████████████████████████████████████████████████████████████████████████   | 213/221 [12:38<00:29,  3.67s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 97%|██████████████████████████████████████████████████████████████████████████████▍  | 214/221 [12:41<00:25,  3.64s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 97%|██████████████████████████████████████████████████████████████████████████████▊  | 215/221 [12:45<00:22,  3.80s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 98%|███████████████████████████████████████████████████████████████████████████████▏ | 216/221 [12:49<00:19,  3.89s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 98%|███████████████████████████████████████████████████████████████████████████████▌ | 217/221 [12:53<00:15,  3.90s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 99%|███████████████████████████████████████████████████████████████████████████████▉ | 218/221 [12:57<00:11,  3.86s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+ 99%|████████████████████████████████████████████████████████████████████████████████▎| 219/221 [13:01<00:07,  3.86s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+100%|████████████████████████████████████████████████████████████████████████████████▋| 220/221 [13:05<00:04,  4.06s/it][INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|configuration_utils.py:438] 2022-02-28 17:21:17,260 >> Configuration saved in ./checkpoint-500/config.json        [INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|configuration_utils.py:438] 2022-02-28 17:21:17,260 >> Configuration saved in ./checkpoint-500/config.json        [INFO|trainer.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+02/28/2022 17:21:17 - INFO - datasets.metric - Removing /home/sanchit_huggingface_co/.cache/huggingface/metrics/wer/default/default_experiment-1-0.arrow
+[INFO|feature_extraction_utils.py:324] 2022-02-28 17:21:33,438 >> Configuration saved in ./checkpoint-500/preprocessor_config.jsonner.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|feature_extraction_utils.py:324] 2022-02-28 17:21:33,438 >> Configuration saved in ./checkpoint-500/preprocessor_config.jsonner.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|feature_extraction_utils.py:324] 2022-02-28 17:21:33,438 >> Configuration saved in ./checkpoint-500/preprocessor_config.jsonner.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.
+[INFO|feature_extraction_utils.py:324] 2022-02-28 17:21:33,438 >> Configuration saved in ./checkpoint-500/preprocessor_config.jsonner.py:560] 2022-02-28 17:08:05,922 >> The following columns in the evaluation set  don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`,  you can safely ignore this message.