0%| | 0/297 [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:38,486 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:41,122 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:43,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:46,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:49,316 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:52,004 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:33:54,643 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 0%|▎ | 1/297 [00:22<1:50:45, 22.45s/it] 0%|▎ | 1/297 [00:22<1:50:45, 22.45s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:33:57,672 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:00,355 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:03,041 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:05,683 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 
2022-03-01 12:34:08,251 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:10,931 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:13,600 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8415, 'learning_rate': 1.2000000000000002e-07, 'epoch': 0.01} [WARNING|modeling_utils.py:388] 2022-03-01 12:34:16,232 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|▌ | 2/297 [00:43<1:47:16, 21.82s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:34:18,952 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:21,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:24,274 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:26,929 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:29,564 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:32,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:34,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:37,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
1%|▊ | 3/297 [01:04<1:45:15, 21.48s/it] 1%|▊ | 3/297 [01:04<1:45:15, 21.48s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:34:39,973 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:42,458 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:45,029 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:47,573 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:50,078 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:52,680 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:55,252 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:34:57,801 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 1%|█ | 4/297 [01:25<1:42:59, 21.09s/it] 1%|█ | 4/297 [01:25<1:42:59, 21.09s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:35:00,491 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:03,029 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:05,557 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:08,117 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:10,662 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:13,181 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:15,686 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7452, 'learning_rate': 3.0000000000000004e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-01 12:35:18,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▎ | 5/297 [01:45<1:41:31, 20.86s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:35:20,932 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:23,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:26,029 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:28,499 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:30,895 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:33,340 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:35,798 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 
12:35:38,199 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▋ | 6/297 [02:05<1:39:33, 20.53s/it] 2%|█▋ | 6/297 [02:05<1:39:33, 20.53s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:35:40,686 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:43,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:45,478 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:47,814 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:50,201 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:52,610 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:35:55,006 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8603, 'learning_rate': 3.6e-07, 'epoch': 0.02} [WARNING|modeling_utils.py:388] 2022-03-01 12:35:57,385 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 2%|█▉ | 7/297 [02:24<1:37:11, 20.11s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:35:59,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:02,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:04,728 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:07,091 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:09,572 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:12,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:14,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:16,790 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▏ | 8/297 [02:44<1:35:46, 19.89s/it] 3%|██▏ | 8/297 [02:44<1:35:46, 19.89s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:36:19,369 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:21,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:24,133 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:26,496 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:28,906 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:31,347 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:33,726 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed {'loss': 4.8445, 'learning_rate': 4.800000000000001e-07, 'epoch': 0.03} [WARNING|modeling_utils.py:388] 2022-03-01 12:36:36,099 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▍ | 9/297 [03:03<1:34:35, 19.71s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:36:38,734 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:41,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:43,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:46,161 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:48,668 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:51,147 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:53,608 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:36:56,055 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 3%|██▋ | 10/297 [03:23<1:34:37, 19.78s/it] 3%|██▋ | 10/297 [03:23<1:34:37, 19.78s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:36:58,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:01,057 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:03,456 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:05,905 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:08,356 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:10,816 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:13,217 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:15,673 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|██▉ | 11/297 [03:43<1:34:03, 19.73s/it] 4%|██▉ | 11/297 [03:43<1:34:03, 19.73s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:37:18,136 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:20,459 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:22,835 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:25,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:27,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:30,195 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:32,571 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.8768, 'learning_rate': 6.599999999999999e-07, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 12:37:34,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▏ | 12/297 [04:02<1:33:07, 19.61s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:37:37,495 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:39,869 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:42,311 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:45,291 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:47,666 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:50,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:37:52,531 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.7908, 'learning_rate': 7.2e-07, 'epoch': 0.04} [WARNING|modeling_utils.py:388] 2022-03-01 12:37:54,986 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 4%|███▌ | 13/297 [04:22<1:33:21, 19.72s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:37:57,619 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:02,466 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:07,235 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:12,077 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|███▊ | 14/297 [04:42<1:32:46, 19.67s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:38:17,001 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:21,759 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:26,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:31,367 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████ | 15/297 [05:01<1:31:48, 19.54s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:38:36,268 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:40,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:45,822 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:38:50,466 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 5%|████▎ | 16/297 [05:20<1:30:50, 19.40s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:38:55,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:00,017 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:04,768 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:09,497 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▌ | 17/297 [05:39<1:30:00, 19.29s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:39:14,319 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:19,021 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:23,642 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:28,364 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|████▊ | 18/297 [05:58<1:29:05, 19.16s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:39:33,148 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:37,779 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:42,391 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:46,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 6%|█████ | 19/297 [06:16<1:27:39, 18.92s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:39:51,474 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:39:55,965 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:40:00,392 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:40:04,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▍ | 20/297 [06:34<1:26:05, 18.65s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:09,428 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:40:13,881 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:40:18,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:40:22,801 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 7%|█████▋ | 21/297 [06:52<1:24:49, 18.44s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:27,323 >> Could not estimate the number of tokens of
the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:31,807 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:36,337 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:40,885 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
7%|█████▉ | 22/297 [07:10<1:24:02, 18.34s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:40:45,567 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:50,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:54,590 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:40:59,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
8%|██████▏ | 23/297 [07:29<1:23:37, 18.31s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:41:03,760 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:08,306 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:12,821 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:17,370 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
8%|██████▍ | 24/297 [07:47<1:23:13, 18.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:41:22,038 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:26,439 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:30,937 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:35,343 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed
8%|██████▋ | 25/297 [08:05<1:23:15, 18.37s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:41:45,076 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
9%|███████ | 26/297 [08:23<1:22:28, 18.26s/it]
{'loss': 4.9185, 'learning_rate': 1.5e-06, 'epoch': 0.09}
9%|███████▎ | 27/297 [08:41<1:21:33, 18.12s/it]
{'loss': 4.654, 'learning_rate': 1.5599999999999999e-06, 'epoch': 0.09}
{'loss': 4.6688, 'learning_rate': 1.62e-06, 'epoch': 0.09}
10%|███████▊ | 29/297 [09:16<1:19:43, 17.85s/it]
10%|████████ | 30/297
[09:34<1:18:38, 17.67s/it]
{'loss': 4.7119, 'learning_rate': 1.74e-06, 'epoch': 0.1}
10%|████████▎ | 31/297 [09:51<1:17:30, 17.48s/it]
{'loss': 4.6957, 'learning_rate': 1.8e-06, 'epoch': 0.1}
11%|████████▌ | 32/297 [10:07<1:15:57, 17.20s/it]
11%|████████▉ | 33/297 [10:23<1:14:06, 16.84s/it]
11%|█████████▏ | 34/297 [10:39<1:11:59, 16.43s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:44:25,090 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
12%|█████████▍ | 35/297 [10:54<1:10:36, 16.17s/it]
{'loss': 4.6201, 'learning_rate': 2.0400000000000004e-06, 'epoch': 0.12}
12%|█████████▋ |
36/297 [11:09<1:09:06, 15.89s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:44:57,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.6958, 'learning_rate': 2.16e-06, 'epoch': 0.12}
13%|██████████▏ | 38/297 [11:40<1:07:35, 15.66s/it]
{'loss': 4.6541, 'learning_rate': 2.22e-06, 'epoch': 0.13}
[WARNING|modeling_utils.py:388] 2022-03-01 12:45:24,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████▌ | 39/297 [11:55<1:05:52, 15.32s/it]
{'loss': 4.7043, 'learning_rate': 2.28e-06, 'epoch': 0.13}
[WARNING|modeling_utils.py:388] 2022-03-01 12:45:38,130 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
13%|██████████▊ | 40/297 [12:09<1:03:43, 14.88s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens
of the input, floating-point operations will not be computed 13%|██████████▊ | 40/297 [12:09<1:03:43, 14.88s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.5367, 'learning_rate': 2.34e-06, 'epoch': 0.13} 13%|██████████▊ | 40/297 [12:09<1:03:43, 14.88s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:45:48,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:45:48,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:45:48,179 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:45:54,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:45:54,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6178, 'learning_rate': 2.4000000000000003e-06, 
'epoch': 0.14} [WARNING|modeling_utils.py:388] 2022-03-01 12:45:54,550 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:00,856 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:00,856 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:00,856 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:06,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:06,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6936, 'learning_rate': 2.46e-06, 'epoch': 0.14} [WARNING|modeling_utils.py:388] 2022-03-01 12:46:06,832 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:12,691 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:12,691 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:16,787 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▊ | 43/297 [12:45<54:57, 12.98s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 14%|███████████▊ | 43/297 [12:45<54:57, 12.98s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:20,829 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:23,421 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:23,421 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:27,190 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▏ | 44/297 [12:56<51:20, 12.18s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▏ | 44/297 [12:56<51:20, 12.18s/it]g-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:30,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:30,935 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:34,433 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:36,732 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:41:40,586 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▍ | 
45/297 [13:05<47:41, 11.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:46:39,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▍ | 45/297 [13:05<47:41, 11.36s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:46:39,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:41,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:39,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:43,349 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:39,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:45,345 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:39,080 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▋ | 46/297 [13:13<43:53, 10.49s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:46:47,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 15%|████████████▋ | 46/297 [13:13<43:53, 10.49s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:46:47,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:49,341 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:47,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:51,202 
>> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:47,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:52,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:47,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:52,975 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:47,412 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|████████████▉ | 47/297 [13:21<40:00, 9.60s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:46:54,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:56,532 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:54,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:46:58,218 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:46:54,830 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▎ | 48/297 [13:28<36:14, 8.73s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:47:01,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▎ | 48/297 [13:28<36:14, 8.73s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:47:01,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 12:47:02,883 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:01,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:05,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:01,432 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▌ | 49/297 [13:33<32:23, 7.84s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:47:07,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 16%|█████████████▌ | 49/297 [13:33<32:23, 7.84s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:47:07,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:09,465 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:07,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 50/297 [13:39<29:17, 7.12s/it]g-point operations will not be computed-01 12:47:07,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 50/297 [13:39<29:17, 7.12s/it]g-point operations will not be computed-01 12:47:07,053 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.706, 'learning_rate': 2.9400000000000002e-06, 'epoch': 0.17} 17%|█████████████▊ | 50/297 [13:39<29:17, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|█████████████▊ | 50/297 
[13:39<29:17, 7.12s/it][WARNING|modeling_utils.py:388] 2022-03-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:20,196 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:20,196 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:25,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:25,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:25,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 12:47:25,511 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 17%|██████████████ | 51/297 [14:01<47:08, 11.50s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 52/297 [14:21<58:21, 14.29s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will 
not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▎ | 53/297 [14:41<1:05:00, 15.99s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3327, 'learning_rate': 3.18e-06, 'epoch': 0.18} 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not 
be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 18%|██████████████▌ | 54/297 [15:01<1:09:17, 17.11s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.269, 'learning_rate': 3.24e-06, 'epoch': 0.18} 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be 
computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|██████████████▊ | 55/297 [15:21<1:12:45, 18.04s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████ | 56/297 [15:42<1:15:13, 18.73s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████ | 56/297 [15:42<1:15:13, 18.73s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2999, 'learning_rate': 3.3e-06, 'epoch': 0.19} 19%|███████████████ | 56/297 [15:42<1:15:13, 18.73s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████ | 56/297 [15:42<1:15:13, 18.73s/it]g-point operations will not be computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 19%|███████████████ | 56/297 [15:42<1:15:13, 18.73s/it]g-point operations will not be 
computed-01 12:47:14,758 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
19%|███████████████ | 56/297 [15:42<1:15:13, 18.73s/it]
19%|███████████████▎ | 57/297 [16:02<1:16:29, 19.12s/it]
20%|███████████████▌ | 58/297 [16:22<1:17:14, 19.39s/it]
{'loss': 4.4309, 'learning_rate': 3.4200000000000003e-06, 'epoch': 0.2}
20%|███████████████▉ | 59/297 [16:41<1:17:22, 19.51s/it]
{'loss': 4.3935, 'learning_rate': 3.48e-06, 'epoch': 0.2}
20%|████████████████▏ | 60/297 [17:01<1:17:28, 19.61s/it]
21%|████████████████▍ | 61/297 [17:21<1:17:04, 19.59s/it]
{'loss': 4.3995, 'learning_rate': 3.6e-06, 'epoch': 0.21}
21%|████████████████▋ | 62/297 [17:40<1:16:27, 19.52s/it]
21%|████████████████▉ | 63/297 [18:00<1:16:39, 19.66s/it]
{'loss': 4.4125, 'learning_rate': 3.72e-06, 'epoch': 0.21}
22%|█████████████████▏ | 64/297 [18:19<1:15:18, 19.39s/it]
{'loss': 4.3521, 'learning_rate': 3.7800000000000002e-06, 'epoch': 0.22}
22%|█████████████████▌ | 65/297 [18:37<1:13:45, 19.08s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:52:14,790 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
22%|█████████████████▊ | 66/297 [18:56<1:12:40, 18.88s/it]
{'loss': 4.406, 'learning_rate': 3.9e-06, 'epoch': 0.22}
23%|██████████████████ | 67/297 [19:14<1:12:02, 18.79s/it]
{'loss': 4.3524, 'learning_rate': 3.96e-06, 'epoch': 0.23}
23%|██████████████████▎ | 68/297 [19:33<1:11:57, 18.85s/it]
23%|██████████████████▌ | 69/297 [19:52<1:11:31, 18.82s/it]
24%|██████████████████▊ | 70/297 [20:11<1:11:08, 18.80s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:53:59,871 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
24%|███████████████████ | 71/297 [20:29<1:10:23, 18.69s/it]
24%|███████████████████▍ | 72/297 [20:48<1:09:49, 18.62s/it]
25%|███████████████████▋ | 73/297 [21:06<1:09:02, 18.49s/it]
{'loss': 4.3188, 'learning_rate': 4.32e-06, 'epoch': 0.25}
25%|███████████████████▉ | 74/297 [21:24<1:08:14, 18.36s/it]
{'loss': 4.3084, 'learning_rate': 4.3799999999999996e-06, 'epoch': 0.25}
[WARNING|modeling_utils.py:388] 2022-03-01 12:55:08,036 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
25%|████████████████████▏ | 75/297 [21:42<1:08:04, 18.40s/it]
26%|████████████████████▍ | 76/297 [22:00<1:07:09, 18.23s/it]
26%|████████████████████▋ | 77/297 [22:17<1:05:25, 17.84s/it]
26%|█████████████████████ | 78/297 [22:34<1:04:01, 17.54s/it]
{'loss': 4.3103, 'learning_rate': 4.62e-06, 'epoch': 0.26}
27%|█████████████████████▎ | 79/297 [22:51<1:02:58, 17.33s/it]
{'loss': 4.1937, 'learning_rate': 4.68e-06, 'epoch': 0.27}
27%|█████████████████████▌ | 80/297 [23:08<1:02:26, 17.27s/it]
{'loss': 4.3016, 'learning_rate': 4.800000000000001e-06, 'epoch': 0.27}
[WARNING|modeling_utils.py:388] 2022-03-01 12:57:02,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
28%|██████████████████████ | 82/297 [23:42<1:01:19, 17.12s/it]
{'loss': 4.3088, 'learning_rate': 4.86e-06, 'epoch': 0.28}
28%|██████████████████████▎ | 83/297 [23:58<1:00:19, 16.91s/it]
28%|███████████████████████▏ | 84/297 [24:15<59:10, 16.67s/it]
29%|███████████████████████▍ | 85/297 [24:30<58:03, 16.43s/it]
29%|███████████████████████▋ | 86/297 [24:46<57:01, 16.21s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:58:32,403 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
29%|████████████████████████ | 87/297 [25:01<55:45, 15.93s/it]
{'loss': 4.3402, 'learning_rate': 5.16e-06, 'epoch': 0.29}
30%|████████████████████████▎ | 88/297 [25:17<54:56, 15.77s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:00,594 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|████████████████████████▌ | 89/297 [25:31<53:18, 15.38s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:12,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
30%|████████████████████████▊ | 90/297 [25:45<51:33, 14.94s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:21,463 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:29,454 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
31%|█████████████████████████ | 91/297 [25:58<49:09, 14.32s/it]
{'loss': 4.4333, 'learning_rate': 5.4e-06, 'epoch': 0.31}
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:35,387 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:39,657 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
31%|█████████████████████████▍ | 92/297 [26:10<46:07, 13.50s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:45,360 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:49,402 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:51,955 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
31%|█████████████████████████▋ | 93/297 [26:20<43:03, 12.66s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:55,864 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 12:59:59,476 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:01,851 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
32%|█████████████████████████▉ | 94/297 [26:30<39:53, 11.79s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:07,628 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:09,870 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:12,097 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:14,308 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:16,331 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:18,295 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:20,239 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:22,225 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:24,049 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:25,800 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:27,468 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:29,226 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:00:32,284 >> Could not
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:33,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:33,702 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:36,533 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:37,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:37,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:40,367 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:42,562 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:44,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:44,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.6683, 'learning_rate': 5.940000000000001e-06, 'epoch': 0.34} [WARNING|modeling_utils.py:388] 2022-03-01 13:00:50,010 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:50,010 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:55,366 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:00:55,366 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:00,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:00,629 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2246, 'learning_rate': 6e-06, 'epoch': 0.34} [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:01:05,903 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2737, 'learning_rate': 6.0600000000000004e-06, 'epoch': 0.34} 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 
13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 34%|███████████████████████████▊ | 102/297 [27:54<46:00, 14.16s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 
16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████ | 103/297 [28:15<52:05, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▎ | 104/297 [28:35<55:54, 17.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the 
input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 35%|████████████████████████████▋ | 105/297 [28:55<58:21, 18.24s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2148, 'learning_rate': 6.3e-06, 'epoch': 0.36} 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 
18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▉ | 106/297 [29:15<59:52, 18.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1611, 'learning_rate': 6.36e-06, 'epoch': 0.36} 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▍ | 107/297 [29:35<1:00:34, 19.13s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2767, 'learning_rate': 6.42e-06, 'epoch': 0.36} 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2806, 'learning_rate': 6.48e-06, 'epoch': 0.37} 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be 
computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 36%|████████████████████████████▋ | 108/297 [29:55<1:01:03, 19.38s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2561, 'learning_rate': 6.54e-06, 'epoch': 0.37} 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▎ | 110/297 [30:35<1:01:02, 19.59s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 111/297 [30:54<1:00:32, 19.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 111/297 [30:54<1:00:32, 19.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3238, 'learning_rate': 6.6e-06, 'epoch': 0.37} 37%|█████████████████████████████▌ | 111/297 [30:54<1:00:32, 19.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 111/297 [30:54<1:00:32, 19.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 111/297 [30:54<1:00:32, 19.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 111/297 [30:54<1:00:32, 19.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 37%|█████████████████████████████▌ | 111/297 
[30:54<1:00:32, 19.53s/it]
38%|██████████████████████████████▌ | 112/297 [31:13<59:57, 19.44s/it]
{'loss': 4.2542, 'learning_rate': 6.660000000000001e-06, 'epoch': 0.38}
38%|██████████████████████████████▊ | 113/297 [31:33<59:25, 19.38s/it]
{'loss': 4.1891, 'learning_rate': 6.72e-06, 'epoch': 0.38}
[WARNING|modeling_utils.py:388] 2022-03-01 13:05:14,902 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
38%|███████████████████████████████ | 114/297 [31:51<58:16, 19.10s/it]
39%|███████████████████████████████▎ | 115/297 [32:10<57:22, 18.91s/it]
39%|███████████████████████████████▋ | 116/297 [32:29<57:02, 18.91s/it]
{'loss': 4.2241, 'learning_rate': 6.900000000000001e-06, 'epoch': 0.39}
39%|███████████████████████████████▉ | 117/297 [32:47<56:35, 18.86s/it]
{'loss': 4.1748, 'learning_rate': 6.96e-06, 'epoch': 0.39}
40%|████████████████████████████████▏ | 118/297 [33:06<56:09, 18.82s/it]
{'loss': 4.2224, 'learning_rate': 7.0200000000000006e-06, 'epoch': 0.4}
40%|████████████████████████████████▍ | 119/297 [33:25<55:40, 18.77s/it]
40%|████████████████████████████████▋ | 120/297 [33:43<55:07, 18.69s/it]
{'loss': 4.2724, 'learning_rate': 7.14e-06, 'epoch': 0.4}
41%|█████████████████████████████████ | 121/297 [34:02<54:37, 18.62s/it]
{'loss': 4.3434, 'learning_rate': 7.26e-06, 'epoch': 0.41}
41%|█████████████████████████████████▌ | 123/297 [34:38<53:26, 18.43s/it]
42%|█████████████████████████████████▊ | 124/297 [34:56<52:43, 18.29s/it]
42%|██████████████████████████████████ | 125/297 [35:14<52:22, 18.27s/it]
42%|██████████████████████████████████▎ | 126/297 [35:32<51:37, 18.11s/it]
{'loss': 4.2669, 'learning_rate': 7.5e-06, 'epoch': 0.42}
43%|██████████████████████████████████▋ | 127/297 [35:50<50:55, 17.97s/it]
{'loss': 4.3126, 'learning_rate': 7.5600000000000005e-06, 'epoch': 0.43}
43%|██████████████████████████████████▉ | 128/297 [36:07<50:10, 17.81s/it]
{'loss': 4.168, 'learning_rate': 7.62e-06, 'epoch': 0.43}
43%|███████████████████████████████████▏ | 129/297 [36:24<49:15, 17.59s/it]
{'loss': 4.3161, 'learning_rate': 7.680000000000001e-06, 'epoch': 0.43}
44%|███████████████████████████████████▍ | 130/297 [36:41<47:58, 17.24s/it]
{'loss': 4.4017, 'learning_rate': 7.74e-06, 'epoch': 0.44}
44%|███████████████████████████████████▋ | 131/297 [36:57<46:49, 16.92s/it]
{'loss': 4.3215, 'learning_rate': 7.8e-06, 'epoch': 0.44}
44%|████████████████████████████████████ | 132/297 [37:13<46:06, 16.77s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of
tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████ | 132/297 [37:13<46:06, 16.77s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████ | 132/297 [37:13<46:06, 16.77s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 44%|████████████████████████████████████ | 132/297 [37:13<46:06, 16.77s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 
133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▎ | 133/297 [37:30<45:39, 16.70s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▌ | 134/297 [37:46<44:54, 16.53s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 45%|████████████████████████████████████▊ | 135/297 [38:02<44:04, 16.33s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 136/297 [38:17<43:12, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 136/297 [38:17<43:12, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 136/297 [38:17<43:12, 16.11s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████ | 136/297 [38:17<43:12, 16.11s/it]g-point operations will not be 
computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:11:59,727 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:11:59,727 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:11:59,727 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:11:59,727 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▎ | 137/297 [38:32<42:09, 15.81s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▋ | 138/297 [38:48<41:31, 15.67s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▋ | 138/297 [38:48<41:31, 15.67s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 46%|█████████████████████████████████████▋ | 138/297 [38:48<41:31, 15.67s/it]g-point operations will 
not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:28,101 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:28,101 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:28,101 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:28,101 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:28,101 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▉ | 139/297 [39:02<40:20, 15.32s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|█████████████████████████████████████▉ | 139/297 [39:02<40:20, 15.32s/it]g-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not 
be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:40,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:40,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:40,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:40,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:40,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:12:40,471 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:00:04,241 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 140/297 [39:16<38:48, 14.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 140/297 [39:16<38:48, 
14.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 140/297 [39:16<38:48, 14.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 140/297 [39:16<38:48, 14.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▏ | 140/297 [39:16<38:48, 14.83s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:00,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:00,352 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▍ | 141/297 [39:29<37:08, 14.28s/it]g-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 47%|██████████████████████████████████████▍ | 141/297 [39:29<37:08, 14.28s/it]g-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:06,732 >> Could not estimate 
the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:06,732 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:06,732 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:12,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:12,761 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▋ | 142/297 [39:41<35:23, 13.70s/it]g-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|██████████████████████████████████████▋ | 142/297 [39:41<35:23, 13.70s/it]g-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:18,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:18,655 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:22,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:22,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:22,845 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:12:50,647 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████ | 143/297 [39:53<33:21, 13.00s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████ | 143/297 [39:53<33:21, 13.00s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:31,114 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 
13:13:31,114 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:35,047 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████▎ | 144/297 [40:03<31:25, 12.32s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 48%|███████████████████████████████████████▎ | 144/297 [40:03<31:25, 12.32s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:38,995 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:38,995 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:42,572 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:44,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 
13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:44,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 49%|███████████████████████████████████████▌ | 145/297 [40:13<29:13, 11.53s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:48,429 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:50,575 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:52,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:52,687 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:54,735 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:56,855 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:13:58,804 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:00,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:00,661 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:02,444 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:04,292 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:07,588 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:09,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:09,146 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:12,147 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:13,530 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:16,218 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:16,218 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:14:17,438 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 13:14:20,233 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.642, 'learning_rate': 8.939999999999999e-06, 'epoch': 0.5}
[WARNING|modeling_utils.py:388] 2022-03-01 13:14:25,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:14:31,258 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:14:36,543 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
51%|█████████████████████████████████████████▏ | 151/297 [41:09<27:52, 11.46s/it]
{'loss': 4.26, 'learning_rate': 9e-06, 'epoch': 0.51}
51%|█████████████████████████████████████████▍ | 152/297 [41:29<34:00, 14.07s/it]
{'loss': 4.2318, 'learning_rate': 9.06e-06, 'epoch': 0.51}
52%|█████████████████████████████████████████▋ | 153/297 [41:49<37:50, 15.77s/it]
52%|██████████████████████████████████████████ | 154/297 [42:09<40:32, 17.01s/it]
{'loss': 4.1934, 'learning_rate': 9.18e-06, 'epoch': 0.52}
52%|██████████████████████████████████████████▎ | 155/297 [42:29<42:28, 17.95s/it]
53%|██████████████████████████████████████████▌ | 156/297 [42:49<43:44, 18.61s/it]
{'loss': 4.1637, 'learning_rate': 9.36e-06, 'epoch': 0.53}
53%|███████████████████████████████████████████ | 158/297 [43:29<44:46, 19.33s/it]
{'loss': 4.2322, 'learning_rate': 9.42e-06, 'epoch': 0.53}
54%|███████████████████████████████████████████▎ | 159/297 [43:49<44:45, 19.46s/it]
54%|███████████████████████████████████████████▋ | 160/297 [44:08<44:33, 19.51s/it]
{'loss': 4.2042, 'learning_rate': 9.54e-06, 'epoch': 0.54}
54%|███████████████████████████████████████████▉ | 161/297 [44:28<44:12, 19.50s/it]
55%|████████████████████████████████████████████▏ | 162/297 [44:47<43:44, 19.44s/it]
{'loss': 4.1874, 'learning_rate': 9.66e-06, 'epoch': 0.54}
55%|████████████████████████████████████████████▍ | 163/297 [45:07<43:41, 19.56s/it]
{'loss': 4.2569, 'learning_rate': 9.72e-06, 'epoch': 0.55}
55%|████████████████████████████████████████████▋ | 164/297 [45:26<43:06, 19.45s/it]
{'loss': 4.1769, 'learning_rate': 9.780000000000001e-06, 'epoch': 0.55}
56%|█████████████████████████████████████████████ | 165/297 [45:45<42:26, 19.29s/it]
{'loss': 4.2435, 'learning_rate': 9.84e-06, 'epoch': 0.55}
56%|█████████████████████████████████████████████▎ | 166/297 [46:04<41:48, 19.15s/it]
56%|█████████████████████████████████████████████▌ | 167/297 [46:22<40:58, 18.91s/it]
{'loss': 4.223, 'learning_rate': 9.960000000000001e-06, 'epoch': 0.56}
[WARNING|modeling_utils.py:388] 2022-03-01 13:20:04,253 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
57%|█████████████████████████████████████████████▊ | 168/297
[46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2185, 'learning_rate': 1.002e-05, 'epoch': 0.56} 57%|█████████████████████████████████████████████▊ | 168/297 [46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 168/297 [46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 168/297 [46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 168/297 [46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 168/297 [46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|█████████████████████████████████████████████▊ | 168/297 [46:40<40:02, 18.63s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 57%|██████████████████████████████████████████████ | 169/297 [46:58<39:20, 18.44s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
57%|██████████████████████████████████████████████ | 169/297 [46:58<39:20, 18.44s/it]
{'loss': 4.1909, 'learning_rate': 1.008e-05, 'epoch': 0.57}
57%|██████████████████████████████████████████████▎ | 170/297 [47:17<39:04, 18.46s/it]
{'loss': 4.1373, 'learning_rate': 1.0140000000000001e-05, 'epoch': 0.57}
58%|██████████████████████████████████████████████▋ | 171/297 [47:35<38:39, 18.41s/it]
{'loss': 4.1232, 'learning_rate': 1.02e-05, 'epoch': 0.58}
58%|██████████████████████████████████████████████▉ | 172/297 [47:53<38:11, 18.33s/it]
{'loss': 4.2279, 'learning_rate': 1.0260000000000002e-05, 'epoch': 0.58}
58%|███████████████████████████████████████████████▏ | 173/297 [48:11<37:46, 18.28s/it]
{'loss': 4.0904, 'learning_rate': 1.032e-05, 'epoch': 0.58}
59%|███████████████████████████████████████████████▍ | 174/297 [48:29<37:15, 18.17s/it]
{'loss': 4.1484, 'learning_rate': 1.0379999999999999e-05, 'epoch': 0.59}
59%|███████████████████████████████████████████████▋ | 175/297 [48:48<36:57, 18.18s/it]
59%|████████████████████████████████████████████████ | 176/297 [49:05<36:22, 18.04s/it]
{'loss': 4.1508, 'learning_rate': 1.05e-05, 'epoch': 0.59}
[WARNING|modeling_utils.py:388] 2022-03-01 13:22:53,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
60%|████████████████████████████████████████████████▎ | 177/297 [49:23<35:48, 17.90s/it]
60%|████████████████████████████████████████████████▌ | 178/297 [49:40<35:13, 17.76s/it]
60%|████████████████████████████████████████████████▊ | 179/297 [49:57<34:36, 17.60s/it]
{'loss': 4.1837, 'learning_rate': 1.068e-05, 'epoch': 0.6}
[WARNING|modeling_utils.py:388] 2022-03-01 13:23:47,379 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.3284, 'learning_rate': 1.074e-05, 'epoch': 0.61}
[WARNING|modeling_utils.py:388] 2022-03-01 13:24:00,227 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
61%|█████████████████████████████████████████████████▎ | 181/297 [50:32<33:28, 17.32s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:24:08,751 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:24:12,868 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1404, 'learning_rate': 1.086e-05, 'epoch': 0.61}
62%|█████████████████████████████████████████████████▉ | 183/297 [51:05<32:13, 16.96s/it]
{'loss': 4.1248, 'learning_rate': 1.092e-05, 'epoch': 0.62}
62%|██████████████████████████████████████████████████▏ | 184/297 [51:20<31:07, 16.52s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2255, 'learning_rate': 1.098e-05, 'epoch': 0.62} 62%|██████████████████████████████████████████████████▏ | 184/297 [51:20<31:07, 16.52s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▏ | 184/297 [51:20<31:07, 16.52s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▏ | 184/297 [51:20<31:07, 16.52s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▏ | 184/297 [51:20<31:07, 16.52s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:06,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:06,467 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be 
computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 62%|██████████████████████████████████████████████████▍ | 185/297 [51:35<30:04, 16.11s/it]g-point operations will not be 
computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▋ | 186/297 [51:50<29:07, 15.74s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|██████████████████████████████████████████████████▋ | 186/297 [51:50<29:07, 15.74s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:28,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:28,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:28,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:28,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:28,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:25:28,867 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of 
tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████ | 187/297 [52:06<28:33, 15.57s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▎ | 188/297 [52:21<28:12, 15.53s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▎ | 188/297 [52:21<28:12, 15.53s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 63%|███████████████████████████████████████████████████▎ | 188/297 [52:21<28:12, 15.53s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:01,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:01,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:01,185 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:01,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:01,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▌ | 189/297 [52:35<27:22, 15.21s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|███████████████████████████████████████████████████▌ | 189/297 [52:35<27:22, 15.21s/it]g-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:13,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:13,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:13,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:13,539 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:21,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:21,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2799, 'learning_rate': 1.134e-05, 'epoch': 0.64} [WARNING|modeling_utils.py:388] 2022-03-01 13:26:21,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:21,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:29,958 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:29,958 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:29,958 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:29,958 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:13:27,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 191/297 [53:02<24:56, 14.12s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 191/297 [53:02<24:56, 14.12s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 191/297 [53:02<24:56, 14.12s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 64%|████████████████████████████████████████████████████ | 191/297 [53:02<24:56, 14.12s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:43,740 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:43,740 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:43,740 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▎ | 192/297 [53:14<23:36, 13.49s/it]g-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:49,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:49,593 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:53,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:53,693 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:53,693 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:26:57,657 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:00,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:00,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:04,033 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:06,449 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 194/297 [53:35<20:27, 11.92s/it]g-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 65%|████████████████████████████████████████████████████▉ | 194/297 [53:35<20:27, 11.92s/it]g-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:10,050 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:12,288 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:14,488 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:16,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:16,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.3478, 'learning_rate': 1.164e-05, 'epoch': 0.66} [WARNING|modeling_utils.py:388] 2022-03-01 13:27:19,850 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:21,825 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:26:36,207 >> Could not estimate the 
number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 196/297 [53:52<17:05, 10.15s/it]g-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 196/297 [53:52<17:05, 10.15s/it]g-point operations will not be computed-01 13:26:36,207 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▍ | 196/297 [53:52<17:05, 10.15s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:25,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:27,574 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:25,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:29,361 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:25,764 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 197/297 [53:59<15:28, 9.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:32,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 66%|█████████████████████████████████████████████████████▋ | 197/297 [53:59<15:28, 9.29s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:32,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:34,583 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed-01 13:27:32,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:36,164 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:32,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 198/297 [54:06<13:56, 8.45s/it]g-point operations will not be computed-01 13:27:32,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████ | 198/297 [54:06<13:56, 8.45s/it]g-point operations will not be computed-01 13:27:32,941 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:40,714 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:39,287 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:42,073 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:39,287 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▎ | 199/297 [54:11<12:23, 7.59s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:44,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▎ | 199/297 [54:11<12:23, 7.59s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:44,771 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:45,973 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:44,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:48,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:44,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:48,249 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:44,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 200/297 [54:17<11:10, 6.92s/it]g-point operations will not be computed-01 13:27:44,771 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 200/297 [54:17<11:10, 6.92s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 67%|██████████████████████████████████████████████████████▌ | 200/297 [54:17<11:10, 6.92s/it][WARNING|modeling_utils.py:388] 2022-03-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:57,862 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:27:57,862 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:28:03,194 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:28:08,345 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
68%|██████████████████████████████████████████████████████▊ | 201/297 [54:38<18:05, 11.30s/it]
{'loss': 4.2574, 'learning_rate': 1.2e-05, 'epoch': 0.68}
68%|███████████████████████████████████████████████████████ | 202/297 [54:59<22:21, 14.12s/it]
{'loss': 4.1977, 'learning_rate': 1.2060000000000001e-05, 'epoch': 0.68}
68%|███████████████████████████████████████████████████████▎ | 203/297 [55:19<25:08, 16.05s/it]
69%|███████████████████████████████████████████████████████▋ | 204/297 [55:40<26:49, 17.31s/it]
69%|███████████████████████████████████████████████████████▉ | 205/297 [56:00<27:48, 18.13s/it]
69%|████████████████████████████████████████████████████████▏ | 206/297 [56:19<28:06, 18.54s/it]
{'loss': 4.1696, 'learning_rate': 1.2299999999999999e-05, 'epoch': 0.69}
70%|████████████████████████████████████████████████████████▍ | 207/297 [56:38<28:02, 18.70s/it]
{'loss': 4.2188, 'learning_rate': 1.236e-05, 'epoch': 0.7}
70%|████████████████████████████████████████████████████████▋ | 208/297 [56:58<28:02, 18.91s/it]
{'loss': 4.1895, 'learning_rate': 1.242e-05, 'epoch': 0.7}
70%|█████████████████████████████████████████████████████████ | 209/297 [57:17<28:02, 19.12s/it]
{'loss': 4.1555, 'learning_rate': 1.2479999999999999e-05, 'epoch': 0.7}
71%|█████████████████████████████████████████████████████████▎ | 210/297 [57:37<27:54, 19.25s/it]
{'loss': 4.2614, 'learning_rate': 1.254e-05, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▌ | 211/297 [57:56<27:33, 19.22s/it]
{'loss': 4.0539, 'learning_rate': 1.26e-05, 'epoch': 0.71}
71%|█████████████████████████████████████████████████████████▊ | 212/297 [58:15<27:14, 19.23s/it]
72%|██████████████████████████████████████████████████████████ | 213/297 [58:35<27:08, 19.39s/it]
{'loss': 4.2161, 'learning_rate': 1.272e-05, 'epoch': 0.72}
72%|██████████████████████████████████████████████████████████▎ | 214/297 [58:54<26:41, 19.29s/it]
72%|██████████████████████████████████████████████████████████▋ | 215/297 [59:13<26:17, 19.24s/it]
{'loss': 4.2133, 'learning_rate': 1.284e-05, 'epoch': 0.72}
73%|██████████████████████████████████████████████████████████▉ | 216/297 [59:32<25:49, 19.13s/it]
{'loss': 4.1991, 'learning_rate': 1.29e-05, 'epoch': 0.73}
73%|███████████████████████████████████████████████████████████▏ | 217/297 [59:51<25:22, 19.03s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:33:33,027 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
73%|█████████████████████████████████████████████████████████▉ | 218/297 [1:00:09<24:53, 18.90s/it]
74%|██████████████████████████████████████████████████████████▎ | 219/297 [1:00:28<24:25, 18.79s/it]
{'loss': 4.1368, 'learning_rate': 1.308e-05, 'epoch': 0.74}
Could not estimate the number of
tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▎ | 219/297 [1:00:28<24:25, 18.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▎ | 219/297 [1:00:28<24:25, 18.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▎ | 219/297 [1:00:28<24:25, 18.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point 
operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▌ | 220/297 [1:00:46<23:57, 18.67s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1136, 'learning_rate': 1.32e-05, 'epoch': 0.74} 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 74%|██████████████████████████████████████████████████████████▊ | 221/297 [1:01:05<23:31, 18.57s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1069, 'learning_rate': 1.326e-05, 'epoch': 0.75} g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations 
will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▎ | 223/297 [1:01:40<22:13, 18.01s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1782, 'learning_rate': 1.338e-05, 'epoch': 0.75} 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 75%|███████████████████████████████████████████████████████████▌ | 224/297 [1:01:57<21:48, 17.93s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▊ | 225/297 [1:02:16<21:41, 18.08s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▊ | 225/297 [1:02:16<21:41, 18.08s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1584, 'learning_rate': 1.344e-05, 'epoch': 0.76} 76%|███████████████████████████████████████████████████████████▊ | 225/297 [1:02:16<21:41, 18.08s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▊ | 225/297 [1:02:16<21:41, 18.08s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|███████████████████████████████████████████████████████████▊ | 225/297 
[1:02:16<21:41, 18.08s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1714, 'learning_rate': 1.3500000000000001e-05, 'epoch': 0.76} [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:01,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1961, 'learning_rate': 1.356e-05, 'epoch': 0.76} 76%|████████████████████████████████████████████████████████████▍ | 
227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 76%|████████████████████████████████████████████████████████████▍ | 227/297 [1:02:51<20:45, 17.79s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 228/297 [1:03:08<20:18, 17.66s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 228/297 [1:03:08<20:18, 17.66s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 228/297 [1:03:08<20:18, 17.66s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|████████████████████████████████████████████████████████████▋ | 228/297 [1:03:08<20:18, 17.66s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> 
Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1515, 'learning_rate': 1.3680000000000001e-05, 'epoch': 0.77} [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
[WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:36:51,924 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations 
will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 77%|█████████████████████████████████████████████████████████████▏ | 230/297 [1:03:43<19:30, 17.47s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2587, 'learning_rate': 1.3800000000000002e-05, 'epoch': 0.78} 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of 
the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▍ | 231/297 [1:04:00<19:03, 17.33s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▋ | 232/297 [1:04:16<18:35, 17.17s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▋ | 232/297 [1:04:16<18:35, 17.17s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1317, 'learning_rate': 1.3860000000000001e-05, 'epoch': 0.78} 78%|█████████████████████████████████████████████████████████████▋ | 232/297 [1:04:16<18:35, 17.17s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 78%|█████████████████████████████████████████████████████████████▋ | 232/297 [1:04:16<18:35, 17.17s/it]g-point operations will not be computed-01 13:27:52,520 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
78%|█████████████████████████████████████████████████████████████▋ | 232/297 [1:04:16<18:35, 17.17s/it]
78%|█████████████████████████████████████████████████████████████▉ | 233/297 [1:04:33<18:07, 16.99s/it]
79%|██████████████████████████████████████████████████████████████▏ | 234/297 [1:04:49<17:36, 16.77s/it]
79%|██████████████████████████████████████████████████████████████▌ | 235/297 [1:05:05<17:04, 16.52s/it]
79%|██████████████████████████████████████████████████████████████▊ | 236/297 [1:05:21<16:34, 16.30s/it]
80%|███████████████████████████████████████████████████████████████ | 237/297 [1:05:37<16:03, 16.06s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:39:23,269 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|███████████████████████████████████████████████████████████████▎ | 238/297 [1:05:52<15:40, 15.94s/it]
{'loss': 4.3278, 'learning_rate': 1.422e-05, 'epoch': 0.8}
[WARNING|modeling_utils.py:388] 2022-03-01 13:39:37,558 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
80%|███████████████████████████████████████████████████████████████▌ | 239/297 [1:06:06<14:53, 15.41s/it]
{'loss': 4.1576, 'learning_rate': 1.428e-05, 'epoch': 0.8}
[WARNING|modeling_utils.py:388] 2022-03-01 13:39:47,806 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
81%|███████████████████████████████████████████████████████████████▊ | 240/297 [1:06:20<14:05, 14.83s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:39:56,112 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:04,085 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
81%|████████████████████████████████████████████████████████████████ | 241/297 [1:06:33<13:17, 14.25s/it]
{'loss': 4.3157, 'learning_rate': 1.44e-05, 'epoch': 0.81}
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:10,310 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:16,247 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
81%|████████████████████████████████████████████████████████████████▎ | 242/297 [1:06:45<12:27, 13.59s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:20,705 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:24,927 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:28,991 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1953, 'learning_rate': 1.452e-05, 'epoch': 0.82}
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:32,939 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:36,685 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:39,105 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2287, 'learning_rate': 1.458e-05, 'epoch': 0.82}
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:42,718 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:44,996 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:47,228 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
82%|█████████████████████████████████████████████████████████████████▏ | 245/297 [1:07:15<09:42, 11.21s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:49,510 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:51,641 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:53,627 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:55,640 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▍ | 246/297 [1:07:24<08:46, 10.33s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:57,642 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:40:59,536 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:01,399 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:03,176 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
83%|█████████████████████████████████████████████████████████████████▋ | 247/297 [1:07:31<07:53, 9.46s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:04,989 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:06,659 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:09,729 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
84%|█████████████████████████████████████████████████████████████████▉ | 248/297 [1:07:38<06:58, 8.55s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:11,213 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:13,890 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
84%|██████████████████████████████████████████████████████████████████▏ | 249/297 [1:07:43<06:03, 7.58s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:16,464 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:17,595 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:19,706 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
84%|██████████████████████████████████████████████████████████████████▍ | 250/297 [1:07:48<05:19, 6.79s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:28,954 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:33,976 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:41:38,950 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
85%|██████████████████████████████████████████████████████████████████▊ | 251/297 [1:08:08<08:23, 10.95s/it]
{'loss': 4.2375, 'learning_rate': 1.5e-05, 'epoch': 0.84}
85%|███████████████████████████████████████████████████████████████████ | 252/297 [1:08:28<10:13, 13.64s/it]
85%|███████████████████████████████████████████████████████████████████▎ | 253/297 [1:08:48<11:20, 15.47s/it]
{'loss': 4.115, 'learning_rate': 1.5120000000000001e-05, 'epoch': 0.85}
85%|███████████████████████████████████████████████████████████████████▎ | 253/297 [1:08:48<11:20, 15.47s/it]g-point operations will not be computed-01
13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 85%|███████████████████████████████████████████████████████████████████▎ | 253/297 [1:08:48<11:20, 15.47s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▌ | 254/297 [1:09:08<11:56, 16.67s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1397, 'learning_rate': 1.524e-05, 'epoch': 0.86} 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|███████████████████████████████████████████████████████████████████▊ | 255/297 [1:09:27<12:15, 17.51s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1209, 'learning_rate': 1.53e-05, 'epoch': 0.86} 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 86%|████████████████████████████████████████████████████████████████████ | 256/297 [1:09:46<12:20, 18.07s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 
18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▎ | 257/297 [1:10:06<12:15, 18.39s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens 
of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▋ | 258/297 [1:10:25<12:06, 18.62s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1725, 'learning_rate': 1.548e-05, 'epoch': 0.87} 87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
87%|████████████████████████████████████████████████████████████████████▉ | 259/297 [1:10:44<11:50, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0934, 'learning_rate': 1.554e-05, 'epoch': 0.87} 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▏ | 260/297 [1:11:02<11:32, 18.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1963, 'learning_rate': 1.56e-05, 'epoch': 0.88} 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▍ | 261/297 [1:11:21<11:13, 18.72s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2262, 'learning_rate': 1.5660000000000003e-05, 'epoch': 0.88} 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 88%|█████████████████████████████████████████████████████████████████████▋ | 262/297 [1:11:40<10:53, 18.66s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 
18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|█████████████████████████████████████████████████████████████████████▉ | 263/297 [1:11:59<10:38, 18.77s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed {'loss': 4.1276, 'learning_rate': 1.578e-05, 'epoch': 0.89} 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▏ | 264/297 [1:12:17<10:15, 18.65s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, 
floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 89%|██████████████████████████████████████████████████████████████████████▍ | 265/297 [1:12:35<09:51, 18.50s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 
[1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|██████████████████████████████████████████████████████████████████████▊ | 266/297 [1:12:53<09:29, 18.38s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not 
estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████ | 267/297 [1:13:11<09:08, 18.28s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1081, 'learning_rate': 1.6020000000000002e-05, 'epoch': 0.9} 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number 
of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 90%|███████████████████████████████████████████████████████████████████████▎ | 268/297 [1:13:29<08:47, 18.19s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▌ | 269/297 [1:13:47<08:26, 18.09s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 
[1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2026, 'learning_rate': 1.614e-05, 'epoch': 0.91} 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|███████████████████████████████████████████████████████████████████████▊ | 270/297 [1:14:05<08:05, 
18.00s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0701, 'learning_rate': 1.62e-05, 'epoch': 0.91} 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations 
will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 91%|████████████████████████████████████████████████████████████████████████ | 271/297 [1:14:23<07:44, 17.88s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▎ | 272/297 [1:14:40<07:23, 17.75s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▎ | 272/297 [1:14:40<07:23, 17.75s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0502, 'learning_rate': 1.626e-05, 'epoch': 0.91} 92%|████████████████████████████████████████████████████████████████████████▎ | 272/297 [1:14:40<07:23, 17.75s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:48:21,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:48:21,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:48:21,689 >> Could not estimate the number of tokens of the input, floating-point operations will 
not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:48:21,689 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1113, 'learning_rate': 1.6320000000000003e-05, 'epoch': 0.92} 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will 
not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▌ | 273/297 [1:14:57<07:03, 17.64s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▉ | 274/297 [1:15:14<06:41, 17.46s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 92%|████████████████████████████████████████████████████████████████████████▉ | 274/297 [1:15:14<06:41, 17.46s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0909, 'learning_rate': 1.6380000000000002e-05, 'epoch': 0.92} [WARNING|modeling_utils.py:388] 2022-03-01 13:48:53,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:48:53,772 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:48:53,772 >> Could not estimate the number of tokens of the input, floating-point 
operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed [WARNING|modeling_utils.py:388] 2022-03-01 13:49:02,193 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.2243, 'learning_rate': 1.6440000000000002e-05, 'epoch': 0.92} 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 
17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▏ | 275/297 [1:15:32<06:23, 17.45s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.1479, 'learning_rate': 1.65e-05, 'epoch': 0.93} 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 
17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▍ | 276/297 [1:15:49<06:03, 17.32s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could 
not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 93%|█████████████████████████████████████████████████████████████████████████▋ | 277/297 [1:16:05<05:42, 17.11s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations 
will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
94%|█████████████████████████████████████████████████████████████████████████▉ | 278/297 [1:16:22<05:22, 16.97s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▏ | 279/297 [1:16:39<05:02, 16.81s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed {'loss': 4.0884, 'learning_rate': 1.6740000000000002e-05, 'epoch': 0.94} 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be 
computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 94%|██████████████████████████████████████████████████████████████████████████▍ | 280/297 [1:16:55<04:44, 16.71s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 95%|██████████████████████████████████████████████████████████████████████████▋ | 281/297 [1:17:11<04:24, 16.55s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 95%|██████████████████████████████████████████████████████████████████████████▋ | 281/297 [1:17:11<04:24, 16.55s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 95%|██████████████████████████████████████████████████████████████████████████▋ | 281/297 [1:17:11<04:24, 16.55s/it]g-point operations will not be computed-01 13:41:23,774 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed 
95%|███████████████████████████████████████████████████████████████████████████ | 282/297 [1:17:27<04:04, 16.31s/it]
95%|███████████████████████████████████████████████████████████████████████████▎ | 283/297 [1:17:42<03:43, 15.98s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:51:18,909 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
96%|███████████████████████████████████████████████████████████████████████████▌ | 284/297 [1:17:57<03:23, 15.68s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:51:37,509 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
96%|███████████████████████████████████████████████████████████████████████████▊ | 285/297 [1:18:12<03:04, 15.38s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:51:53,648 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
96%|████████████████████████████████████████████████████████████████████████████ | 286/297 [1:18:26<02:45, 15.02s/it]
{'loss': 4.1973, 'learning_rate': 1.71e-05, 'epoch': 0.96}
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:05,931 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|████████████████████████████████████████████████████████████████████████████▎ | 287/297 [1:18:40<02:26, 14.65s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:17,846 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:26,400 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.1242, 'learning_rate': 1.7219999999999998e-05, 'epoch': 0.97}
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:34,185 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
97%|████████████████████████████████████████████████████████████████████████████▊ | 289/297 [1:19:06<01:49, 13.72s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:44,495 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:48,677 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|█████████████████████████████████████████████████████████████████████████████▏ | 290/297 [1:19:17<01:31, 13.03s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:52,854 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:52:56,815 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:00,630 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 4.2035, 'learning_rate': 1.74e-05, 'epoch': 0.98}
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:04,477 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:06,884 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:09,216 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
98%|█████████████████████████████████████████████████████████████████████████████▋ | 292/297 [1:19:37<00:57, 11.53s/it]
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:12,742 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:14,899 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:16,954 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:18,940 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:20,960 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:22,870 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:24,732 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:26,522 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:28,362 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:30,043 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:33,231 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:34,861 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:36,334 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:39,081 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:40,607 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[WARNING|modeling_utils.py:388] 2022-03-01 13:53:43,043 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed
[INFO|configuration_utils.py:438] 2022-03-01 13:53:44,281 >> Configuration saved in ./config.json
{'loss': 4.4712, 'learning_rate': 1.776e-05, 'epoch': 1.0}
[INFO|configuration_utils.py:438] 2022-03-01 13:54:00,757 >> Configuration saved in ./config.json
[INFO|feature_extraction_utils.py:324] 2022-03-01 13:54:16,966 >> Configuration saved in ./preprocessor_config.json
Upload file wandb/run-20220301_103527-1wkgn37c/run-1wkgn37c.wandb: 0%| | 32.0k/36.6M [00:00
Upload file pytorch_model.bin: 1%|▍ | 26.4M/2.99G [00:02<03:43, 14.3MB/s]
Upload file pytorch_model.bin: 2%|█ | 65.4M/2.99G [00:04<02:49, 18.5MB/s]
Upload file pytorch_model.bin: 4%|█▋ | 109M/2.99G [00:06<02:27, 21.0MB/s]
Upload file pytorch_model.bin: 5%|██▍ | 154M/2.99G [00:08<02:16, 22.3MB/s]
Upload file pytorch_model.bin: 6%|███ | 199M/2.99G [00:10<02:11, 22.8MB/s]
Upload file pytorch_model.bin: 8%|███▋ | 239M/2.99G [00:12<02:15, 21.8MB/s]
Upload file pytorch_model.bin: 9%|████▎ | 272M/2.99G [00:14<02:29, 19.5MB/s]
Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]
Upload file
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file 
wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 100%|███████████| 34.6M/34.6M [00:18<00:00, 13.1MB/s]g-point operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file training_args.bin: 100%|███████████████████████████████████████████████████████| 3.05k/3.05k [02:30> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file training_args.bin: 100%|███████████████████████████████████████████████████████| 3.05k/3.05k [02:30> Could not estimate the number of tokens of the input, floating-point operations will not be computed 03/01/2022 13:58:23 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search [INFO|modelcard.py:460] 2022-03-01 13:58:25,959 >> Dropping the following result as it does not have all the necessary fields:t operations will not be computed-01 13:52:40,187 >> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 0%| | 32.0k/34.6M [00:00> Could not estimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 39%|████▎ | 13.5M/34.6M [00:01<00:01, 14.1MB/s]To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-searchimate the number of tokens of the input, floating-point operations will not be computed Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 39%|████▎ | 13.5M/34.6M [00:01<00:01, 14.1MB/s]To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-searchimate the number of tokens of the input, floating-point operations will not be computed 03/01/2022 13:58:31 - WARNING - huggingface_hub.repository - To https://huggingface.co/sanchit-gandhi/wav2vec2-gpt2-wandb-grid-search 
d2da879..6218cd7 main -> main
***** train metrics *****
  epoch                    = 1.0
  train_loss               = 4.3407
  train_runtime            = 1:20:13.58
  train_samples            = 28538
  train_samples_per_second = 5.929
  train_steps_per_second   = 0.062
[INFO|trainer.py:2366] 2022-03-01 13:58:34,182 >> Num examples = 2642
The following columns in the evaluation set don't have a corresponding argument in `SpeechEncoderDecoderModel.forward` and have been ignored: input_length. If input_length are not expected by `SpeechEncoderDecoderModel.forward`, you can safely ignore this message.
0%| | 0/221 [00:00
| 4/221 [00:10<10:41, 2.96s/it]
Saving model checkpoint to ./
[INFO|modeling_utils.py:1081] 2022-03-01 14:14:25,817 >> Model weights saved in ./pytorch_model.bin
Upload file wandb/run-20220301_123331-3cwoccr3/run-3cwoccr3.wandb: 0%| | 32.0k/34.8M [00:00
...ule>
...31, in main
...card
...formers/src/transformers/modelcard.py", line 611, in from_trainer
...f.finetuned_from)
return ModelInfo(**d)